declare-lab / tango

A family of diffusion models for text-to-audio generation.
https://tango2-web.github.io/

Is inference possible with just an RTX 3080 with 10 GB? #25

Open davidmartinrius opened 1 year ago

davidmartinrius commented 1 year ago

Hello,

I know it is very little memory, but it is what I have for now.

By default, the demo code fails to run inference because of a CUDA out-of-memory error. I tried reducing the inference batch size to just 1, but that is not enough.

Do you know of a way to reduce memory consumption when running inference?

I know the best solution is to upgrade to an RTX 3090/4090/A6000, but before that I would like to try other options if possible.

Thank you!

David Martin Rius

deepanwayx commented 1 year ago

The required VRAM is around 13 GB for full-precision inference with a batch size of 1.

You can also try Colaboratory for inference: https://github.com/declare-lab/tango/discussions/10

illtellyoulater commented 1 year ago

@deepanwayx I suppose full precision means 32-bit, correct? If so, did you run any tests to check whether 16-bit inference would still deliver acceptable results?

deepanwayx commented 1 year ago

Yes, the full inference precision is 32-bit. We did not test with 16-bit inference.
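
For anyone who lands here later with the same GPU constraint, one untested option is to try half-precision inference. Below is a minimal sketch, assuming the `Tango` convenience class from this repo's README; wrapping generation in `torch.autocast` is an assumption on my part, and the resulting audio quality is unverified, since the authors note above that 16-bit inference was not tested.

```python
import soundfile as sf
import torch
from tango import Tango  # convenience wrapper shown in this repo's README

# Load the pretrained model (full-precision weights by default).
tango = Tango("declare-lab/tango")

prompt = "An audience cheering and clapping"

# ASSUMPTION: running the sampling loop under autocast so most ops execute
# in fp16, which should roughly halve activation memory versus fp32.
# The authors have not validated 16-bit output quality (see above).
with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
    audio = tango.generate(prompt)  # batch size of 1

sf.write("output.wav", audio, samplerate=16000)  # Tango generates 16 kHz audio
```

If this still exceeds 10 GB, the Colab route linked above remains the safer fallback.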