FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0

How to use the model that has already been downloaded? #127

Open AntonioZC666 opened 9 months ago

AntonioZC666 commented 9 months ago

Hello, I'm trying to use FlexGen, and I would like to know whether I can run the project with a model that I have already downloaded. The example command line python3 -m flexgen.flex_opt --model facebook/opt-1.3b automatically downloads the model from Hugging Face, but I have already downloaded it myself in advance. I tried other parameters such as --path, pointing them at the directory where my model is located, but that didn't work.

I just want to use a model that I've already downloaded. How should I do that?
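
One workaround worth trying, as a sketch only: FlexGen appears to fetch checkpoints through the standard Hugging Face download machinery, which reads its cache root from the HF_HOME environment variable (older versions use TRANSFORMERS_CACHE). If that assumption holds, and your local files are laid out as a Hugging Face cache, pointing HF_HOME at them may let flex_opt resolve the model locally instead of re-downloading. The cache path below is hypothetical.

    # Assumption: FlexGen's download step goes through huggingface_hub/transformers,
    # which read the cache root from HF_HOME (older versions: TRANSFORMERS_CACHE).
    export HF_HOME=/mnt/data/share/llm/models/hf    # hypothetical: your existing cache root
    python3 -m flexgen.flex_opt --model facebook/opt-1.3b

Note that if the directory is a plain dump of the checkpoint files rather than a Hugging Face cache, this will not help on its own; the files would first need to be placed (or symlinked) into the models--facebook--opt-1.3b cache layout that huggingface_hub expects.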

AntonioZC666 commented 9 months ago

I also searched the relevant Q&A and found a --local argument mentioned in PR #111. I ran python3 -m flexgen.flex_opt --model /mnt/data/share/llm/models/hf/opt-1.3b --local, but it failed with: flex_opt.py: error: unrecognized arguments: --local. How can I make this work?
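
Since the error says the installed flex_opt does not recognize --local, one likely explanation is that the flag lives only on the unmerged PR #111 branch, not in the version you installed. A sketch of how to check and test that follows; the repository URL is taken from the header above, and pull/111/head is GitHub's standard read-only ref for a pull request, not specific guidance from the maintainers.

    # First confirm which flags the installed version actually accepts:
    python3 -m flexgen.flex_opt --help

    # If --local is absent, fetch the PR branch and install from source:
    git clone https://github.com/FMInference/FlexLLMGen.git
    cd FlexLLMGen
    git fetch origin pull/111/head:pr-111
    git checkout pr-111
    pip install -e .
    python3 -m flexgen.flex_opt --model /mnt/data/share/llm/models/hf/opt-1.3b --local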

XiaomingXu1995 commented 2 weeks ago

I have the same question. Is there any suggestion for using a local model?