Closed Tangjj1996 closed 1 year ago
The CPU isn't all that critical; two NVIDIA RTX 3090s with 24 GB of CUDA memory each should be able to handle loading and inference at fp16 precision.
The download time depends mostly on your network speed, and the time to load the model from local storage depends on your disk speed. Given that the model is approximately 60 GB, you can estimate the time for your own setup.
Tutorial reference: https://huggingface.co/docs/transformers/autoclass_tutorial
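Following the AutoClass tutorial linked above, a minimal sketch of loading a large checkpoint in fp16 across two GPUs might look like this. Note the model id below is a placeholder (the original thread doesn't name one), and `device_map="auto"` assumes the `accelerate` package is installed alongside `transformers`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id — replace with the actual model repository name.
model_id = "org/model-name"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 halves memory use vs. fp32
    device_map="auto",          # shards layers across the available GPUs
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With `device_map="auto"`, the weights are split layer-by-layer across both 24 GB cards automatically, so neither GPU needs to hold the full ~60 GB fp32 checkpoint (about 30 GB at fp16).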
Thank you very much for your contribution. It seems like impressive work!
I'm a beginner learning about large models. Could you please let me know how long the installation of this model usually takes and the required CPU and GPU resources? Additionally, if you have any relevant learning tutorials or resources, that would be greatly appreciated.
Thanks!