baaivision / Emu

Emu Series: Generative Multimodal Models from BAAI
https://baaivision.github.io/emu2/
Apache License 2.0

Weight files take up too much disk space #47

Closed VladAndronik closed 10 months ago

VladAndronik commented 11 months ago

Trying to run your demo, but I get a "No space left on device" error while downloading the model. It takes more than 60GB of disk space; on your HF page there seem to be fifteen 10GB files. Are they all needed?

Thanks!

ryanzhangfan commented 10 months ago

Unfortunately, yes. Emu2 has 37 billion parameters, so due to its sheer size it requires approximately 138GB of memory under float32 precision. The Hugging Face versions of Emu2-Chat and Emu2 are stored in float32.
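The figures above follow directly from the parameter count and the dtype width. A quick back-of-the-envelope sketch (the helper below is illustrative, not part of the Emu repo):

```python
# Rough memory footprint of model weights: parameter count x bytes per element.
# These are back-of-the-envelope estimates, not official measurements.

def weight_size_gib(num_params: float, bytes_per_param: int) -> float:
    """Weight storage in GiB for a given parameter count and dtype width."""
    return num_params * bytes_per_param / 1024**3

EMU2_PARAMS = 37e9  # 37 billion parameters

fp32 = weight_size_gib(EMU2_PARAMS, 4)  # float32: 4 bytes per parameter
bf16 = weight_size_gib(EMU2_PARAMS, 2)  # bfloat16: 2 bytes per parameter

print(f"float32:  ~{fp32:.0f} GiB")  # ~138 GiB, matching the quoted figure
print(f"bfloat16: ~{bf16:.0f} GiB")  # ~69 GiB, roughly the bf16 release size
```

Actual disk and RAM usage is slightly higher because of optimizer buffers, tokenizer files, and shard overhead.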

We have just released native PyTorch versions of the models in bf16 precision, requiring only about 70GB of memory. You can try them out by following the instructions.

p.s. The native PyTorch version is not compatible with the Hugging Face version. Please use the latest code in this repo to load it.

VladAndronik commented 10 months ago

Thank you. So I would also need 70GB of video memory? Would it be possible to fit the model into 24GB with quantization?

ryanzhangfan commented 10 months ago

Please follow the quantization instructions. That setup requires approximately 22GB of RAM and 22GB of VRAM.

VladAndronik commented 10 months ago

Did you investigate whether it affects performance significantly, e.g. additional hallucinations?

ryanzhangfan commented 10 months ago

We have not thoroughly verified the impact of quantization on performance. In the few cases we have tested, the quantized model does not produce answers as detailed as those of the original model, but overall the outputs are still correct.

VladAndronik commented 10 months ago

Thank you for the response!