xiaoyu-work opened 3 weeks ago
vLLM supports local models, but the file structure should follow that of a standard HuggingFace model repo.
Do you know how I can convert a `.pt` model or an ONNX model to the HuggingFace model format? It seems I need to register it with HuggingFace first: https://discuss.huggingface.co/t/convert-pytorch-model-to-huggingface-transformer/16965
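For the PyTorch side, you usually don't need to register anything with the Hub: loading the raw state dict into a matching `transformers` architecture and calling `save_pretrained` produces the standard repo layout locally. A rough sketch, assuming the `.pt` file holds a state dict compatible with a known architecture (`convert_pt_to_hf` and its arguments are illustrative names, not a vLLM or transformers API):

```python
from pathlib import Path

def convert_pt_to_hf(pt_path: str, base_model: str, out_dir: str) -> Path:
    """Load a raw .pt state dict into a matching transformers model
    and save it in the standard HF repo layout (config.json, weight
    files, tokenizer files).

    Assumption: the state dict's keys match `base_model`'s
    architecture; if they were renamed, they must be remapped first.
    """
    # Imports kept inside the function so the sketch itself reads
    # without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(base_model)
    state_dict = torch.load(pt_path, map_location="cpu")
    model.load_state_dict(state_dict)

    out = Path(out_dir)
    model.save_pretrained(out)  # writes config.json + weight files
    AutoTokenizer.from_pretrained(base_model).save_pretrained(out)
    return out
```

After this, `out_dir` contains the `config.json` that vLLM looks for and can be passed as a local model path. ONNX is a different story: as far as I know, vLLM loads transformers-style weights and does not consume ONNX graphs, so an ONNX model would have to be exported back to PyTorch or served by a different runtime.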
How would you like to use vllm
Does vLLM support a local HuggingFace PyTorch `.pt` model or an ONNX model? How can I load them, both in offline Python code and through the OpenAI-compatible Completions API? I see an error saying the `config.json` file was not found. Does vLLM only support vanilla HF models?
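The missing `config.json` error usually means the directory doesn't follow the standard HF repo layout. A minimal, stdlib-only sanity check of that layout (the file names below are the common defaults; sharded checkpoints use more files, so treat this as a heuristic, not vLLM's actual loader logic):

```python
import tempfile
from pathlib import Path

REQUIRED = ["config.json"]  # vLLM reads the model config first
WEIGHT_HINTS = ["model.safetensors", "pytorch_model.bin"]  # common weight names

def looks_like_hf_repo(path: str) -> bool:
    """Heuristic: does `path` resemble a standard HF model repo?"""
    p = Path(path)
    has_config = all((p / f).is_file() for f in REQUIRED)
    has_weights = any((p / w).is_file() for w in WEIGHT_HINTS)
    return has_config and has_weights

# Demo against a fake local repo directory.
with tempfile.TemporaryDirectory() as d:
    empty_ok = looks_like_hf_repo(d)  # empty dir: no config.json yet
    (Path(d) / "config.json").write_text("{}")
    (Path(d) / "model.safetensors").touch()
    full_ok = looks_like_hf_repo(d)   # now both files exist

print(empty_ok, full_ok)
```

Once the directory has that layout, the same path works in both modes: offline via `from vllm import LLM; llm = LLM(model="/path/to/local/repo")`, and for the OpenAI-compatible server by passing the path as `--model` when starting `vllm.entrypoints.openai.api_server`.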