Closed jkstyle2 closed 8 months ago
Hi,
Inference and fine-tuning mainly depend on GPU memory, not the GPU model. The memory requirement depends on the configuration, for example, the data type and the fine-tuning method. Typically, 60 GB of GPU memory is enough for inference, and you can use quantization to reduce the requirement further. For fine-tuning, we use 8 A100 GPUs, but if you use LoRA to tune the LLM, I think one A100 GPU should work.
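As a rough illustration of how the data type drives the memory requirement, here is a minimal back-of-the-envelope sketch. The 30B parameter count is a hypothetical example (chosen because 30B weights in fp16 come to roughly the 60 GB mentioned above), and the estimate covers weights only, ignoring activations and the KV cache:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory (in GB) needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

# Hypothetical 30B-parameter model at different precisions.
n_params = 30e9
for dtype, nbytes in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{dtype}: {weight_memory_gb(n_params, nbytes):.0f} GB")
```

This is why int8 or int4 quantization can roughly halve or quarter the GPU memory needed for inference; actual usage will be somewhat higher once activations and the KV cache are included.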
Best, Runsen
Hello, I want to know whether there is a hardware specification for your model's inference. Are there any minimum requirements for inference or fine-tuning your model in terms of GPU, memory, etc.?