samos123 opened this issue 1 month ago
Thanks for the feedback. This feature is not currently supported, but we have added simplifying it to our roadmap. Some models (such as Llama variants) require an explicit license acknowledgement on Meta's site before you can use them.
That can be handled by respecting the HF_TOKEN environment variable to automatically download auth-gated models. That's how vLLM and other OSS projects do it.
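As a minimal sketch of what "respecting HF_TOKEN" could look like (the helper name and the forwarding call are illustrative assumptions, not this project's actual API):

```python
import os

def hf_auth_token():
    # Read the HF_TOKEN environment variable, the convention vLLM and
    # other OSS serving tools follow, so gated models (e.g. Llama
    # variants) can be fetched once the user has accepted the license
    # on the Hugging Face Hub. Returns None for anonymous access.
    return os.environ.get("HF_TOKEN")

# The token would then be forwarded to whatever performs the download,
# for example (hypothetical call site):
#   huggingface_hub.snapshot_download(model_id, token=hf_auth_token())
```

With this in place, a user only needs to `export HF_TOKEN=...` before starting the server; no checkpoint conversion step is involved in authentication.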
I should be able to serve a model by simply providing the Hugging Face model ID. Requiring users to convert checkpoints first is too troublesome.