Shenyi-Z / ToCa

Accelerating Diffusion Transformers with Token-wise Feature Caching
MIT License
27 stars · 1 fork

Questions about cache memory occupation. #3

Closed Edwardmark closed 1 month ago

Edwardmark commented 1 month ago

Thanks for your great work! It seems that ToCa needs to keep a cache of all blocks for all tokens, which would require a lot of memory when the number of layers and tokens is large. Is that true?
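For a rough sense of how such a cache scales, here is a back-of-envelope sketch; the `cache_memory_gb` helper and all sizes below are hypothetical placeholders, not OpenSora's actual configuration:

```python
def cache_memory_gb(num_layers, num_tokens, hidden_dim,
                    cached_maps_per_layer=2, bytes_per_elem=2):
    """Back-of-envelope size of a full token-wise feature cache (fp16 by default)."""
    total_bytes = (num_layers * cached_maps_per_layer
                   * num_tokens * hidden_dim * bytes_per_elem)
    return total_bytes / 1024 ** 3

# Hypothetical sizes, just to show the scaling: memory grows linearly with
# both the layer count and the token count, so long or high-resolution
# videos get expensive quickly.
print(cache_memory_gb(num_layers=28, num_tokens=100_000, hidden_dim=1152))
```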

Shenyi-Z commented 1 month ago

Thanks for your interest in our work! You are right; the problem does exist. We have recently found that the memory occupation can be reduced by quantization: 8-bit quantization does not seem to affect the performance of OpenSora-ToCa.
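For illustration, a minimal sketch of storing a cached feature tensor as int8 with a per-tensor scale; the function names and the symmetric per-tensor scheme are assumptions for this example, not the repo's actual implementation:

```python
import torch

def quantize_cache(feat: torch.Tensor):
    """Store a cached feature map as int8 plus one fp32 scale (illustrative only)."""
    scale = feat.abs().max().clamp(min=1e-8) / 127.0
    q = torch.round(feat / scale).clamp(-127, 127).to(torch.int8)
    return q, scale

def dequantize_cache(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float feature map when the cached tokens are reused."""
    return q.float() * scale

# Example: a (tokens, hidden_dim) activation from one transformer block.
feat = torch.randn(4096, 1152)      # fp32: ~18.9 MB
q, scale = quantize_cache(feat)     # int8: ~4.7 MB, roughly 4x smaller
restored = dequantize_cache(q, scale)
```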

Edwardmark commented 1 month ago

> We have recently found that the memory occupation can be reduced by quantization: 8-bit quantization does not seem to affect the performance of OpenSora-ToCa.

So what is the memory required for OpenSora-ToCa in your setting?

Shenyi-Z commented 1 month ago

We haven't systematically tested it yet, but a single 80GB A800 is sufficient for inference.

Shenyi-Z commented 1 month ago

Hi, we have tested it, and it seems 54 GB is needed if no quantization is applied (for 2s 480p video generation).

Edwardmark commented 1 month ago

> Hi, we have tested it, and it seems 54 GB is needed if no quantization is applied (for 2s 480p video generation).

Thanks.