Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Is your feature request related to a problem? Please describe
The entire Hugging Face repository of a model is downloaded, including files that are not needed to run it, which wastes a lot of storage and time.
Describe the solution you'd like
An optional list of required files should be maintained for each model, so that only those files are downloaded. This would save both time and storage.
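A rough sketch of how this could look, assuming the download goes through `huggingface_hub.snapshot_download`, which accepts an `allow_patterns` argument. The `REQUIRED_FILES` registry, the `select_files` and `download_model` helpers, and the repo id and patterns are all hypothetical names for illustration, not existing Xinference code:

```python
from fnmatch import fnmatch
from typing import Dict, Iterable, List, Optional

# Hypothetical per-model registry of required files (glob patterns).
# The repo id and patterns below are illustrative only.
REQUIRED_FILES: Dict[str, List[str]] = {
    "example-org/example-model": ["config.json", "tokenizer*", "*.safetensors"],
}


def select_files(repo_files: Iterable[str], patterns: Optional[List[str]]) -> List[str]:
    """Return the files matching any pattern; with no patterns, keep everything."""
    if not patterns:
        return list(repo_files)
    return [f for f in repo_files if any(fnmatch(f, p) for p in patterns)]


def download_model(repo_id: str) -> str:
    """Download only the required files if a list is registered, else the whole repo."""
    # Imported lazily so the filtering helper can be used without the dependency.
    from huggingface_hub import snapshot_download

    # allow_patterns=None keeps the current behaviour (download everything),
    # which makes the file list optional per model.
    return snapshot_download(repo_id=repo_id, allow_patterns=REQUIRED_FILES.get(repo_id))
```

Models without an entry in the list would fall back to downloading the full repository, so existing models keep working unchanged.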
Describe alternatives you've considered
Additional context