Example server.py file that lets users serve local Hugging Face models downloaded on their machine at localhost:8000/completions. Currently you would have to write an `import_and_return_<model>` function for the inference logic yourself, but that code can usually be found on the model's page on Hugging Face.
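A minimal sketch of what such a server.py could look like, using only the standard library. The `import_and_return_gpt2` function here is a hypothetical stand-in with placeholder echo logic; in practice you would replace its body with the actual inference code from the model's Hugging Face page.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def import_and_return_gpt2():
    """Hypothetical loader for the inference logic; replace the body with
    code from the model's Hugging Face page. Returns a callable that maps
    a prompt string to a completion string."""
    def generate(prompt: str) -> str:
        # Placeholder echo logic; a real model would return generated text.
        return prompt + " ..."
    return generate

MODEL = import_and_return_gpt2()

class CompletionsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/completions":
            self.send_error(404)
            return
        # Read the JSON request body and run inference on the prompt.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        completion = MODEL(body.get("prompt", ""))
        payload = json.dumps({"completion": completion}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    # Serve completions on localhost:8000/completions.
    HTTPServer(("localhost", 8000), CompletionsHandler).serve_forever()
```

A client could then POST `{"prompt": "Hello"}` to `http://localhost:8000/completions` and receive a JSON response with the model's completion.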