meta-llama / llama-recipes

Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.

FSDP Inference #213

Closed TZeng20 closed 1 month ago

TZeng20 commented 1 year ago

Hi,

In the inference scripts, I see that there is no option to perform inference with FSDP.

Is model.generate not recommended when the model is wrapped in FSDP? Or in DDP?

Thanks

HamidShojanazeri commented 1 year ago

@TZeng20 yes, inference with FSDP is not recommended due to the all-gather call before each forward pass: generation runs one forward per new token, so the sharded weights are re-gathered on every step. Mostly we have explored TGI and vLLM, as suggested here.
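To make the overhead concrete, here is a rough back-of-the-envelope sketch (my numbers, not from the thread): under full sharding, each forward pass all-gathers the complete set of weights across ranks (layer by layer, but the total volume is the full model), and `generate` runs one forward per generated token.

```python
# Back-of-the-envelope cost of FSDP all-gathers during generation.
# All numbers are illustrative assumptions: a 7B-parameter model in bf16.
params = 7e9          # parameter count (assumption)
bytes_per_param = 2   # bf16
new_tokens = 256      # tokens to generate (assumption)

# With FULL_SHARD, each forward pass re-materializes the full weights on
# every rank via all-gather (layer by layer, then freed), and generate()
# runs one forward per new token.
bytes_per_forward = params * bytes_per_param
total_gathered = bytes_per_forward * new_tokens

print(f"{bytes_per_forward / 1e9:.0f} GB all-gathered per forward pass")
print(f"{total_gathered / 1e12:.1f} TB all-gathered to generate {new_tokens} tokens")
```

Dedicated inference engines like TGI and vLLM avoid this entirely by keeping the full weights (or fixed tensor-parallel slices) resident on each device, so nothing is re-gathered per step.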

hbin0701 commented 1 year ago

Hi, I'm also facing this issue. My models are saved as "__0_0.distcp", "__1_0.distcp", and so on. How can I load these models so that I can run model.generate?