How can we run this on a larger GPU?

I am trying to use this with a large audio input (3.5 hours or so).

Since the GPU the model runs on is fixed, the prediction on replicate.com fails with:

Prediction failed for an unknown reason. It might have run out of memory (exitcode -9)?

I'm assuming this means the model needs a larger GPU machine. Is there anything I can do to customize the hardware so it works with long inputs?

meronym / speaker-transcription