Describe the bug
I downloaded the SFR-Embedding-Mistral model weight files from Hugging Face and put them in the /home/hw/embedding_and_rerank_model directory. I then started the container on a machine whose IP is xxxx with the following command:

docker run -v /home/hw/embedding_and_rerank_model:/root/models -p 9998:9997 --gpus all xprobe/xinference:latest xinference-local -H 0.0.0.0
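Because the container maps host port 9998 to Xinference's default port 9997, the API should answer on port 9998 of the host. A quick check that the server is up looks roughly like the sketch below (assuming Xinference's OpenAI-compatible /v1/models listing; xxxx stands for the machine's IP as above):

import requests  # pip install requests

# Ask the Xinference server (mapped to host port 9998) which models it currently serves.
resp = requests.get("http://xxxx:9998/v1/models", timeout=5)
resp.raise_for_status()
print(resp.json())  # lists the launched models once they are registered and started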
After that, I went to http://xxxx:9998/ui, registered the SFR-Embedding-Mistral model, and launched it. I then used LangChain to access http://xxxx:9998/v1/embeddings/SFR-Embedding-Mistral. Everything was fine; the embedding model worked well.
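The LangChain side of this is roughly the following sketch (assuming the langchain_community XinferenceEmbeddings wrapper and that the model UID equals the model name; the exact client code in my setup may differ slightly):

from langchain_community.embeddings import XinferenceEmbeddings  # pip install langchain-community xinference-client

# Point the wrapper at the Xinference server and the launched embedding model.
embeddings = XinferenceEmbeddings(
    server_url="http://xxxx:9998",
    model_uid="SFR-Embedding-Mistral",
)

# This call succeeded while the model name was still correct.
vector = embeddings.embed_query("hello world")
print(len(vector))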
Then I accidentally misspelled the model name: SFR-Embedding-Mistral became SFR-Embedding-MistraT, and I used LangChain to access http://xxxx:9998/v1/embeddings/SFR-Embedding-MistraT. Naturally, that request could not reach a valid model. I then changed the name back to the correct one (http://xxxx:9998/v1/embeddings/SFR-Embedding-Mistral) and reconnected. At this point the unexpected happened: I could no longer access the model at all, even though the API details were filled in correctly. I also tried to access the embedding model from Dify, and that failed as well. Accessing a wrongly named model can bring down the entire Docker deployment.
To Reproduce
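The steps above are the reproduction; expressed as code against the OpenAI-compatible endpoint they look roughly like the sketch below (I actually used LangChain and Dify as clients, so the openai client here is only for illustration):

from openai import OpenAI  # pip install openai

client = OpenAI(base_url="http://xxxx:9998/v1", api_key="not-used")

# Step 1: request embeddings with a misspelled model name -- this is expected to fail.
try:
    client.embeddings.create(model="SFR-Embedding-MistraT", input="hello")
except Exception as exc:
    print("wrong name failed as expected:", exc)

# Step 2: retry with the correct model name -- this should succeed,
# but after step 1 the whole server stops responding instead.
resp = client.embeddings.create(model="SFR-Embedding-Mistral", input="hello")
print(len(resp.data[0].embedding))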
Expected behavior
Accessing a model with a wrong name should not crash the entire Docker environment. After the model name is corrected, the model should still be invokable.
Additional context
As far as I can tell, the request for the wrongly named model brought down the xinference-local process that was started from the terminal command line, and after that the Docker-mapped port no longer exists (nothing is listening on it anymore).
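A quick way to confirm that nothing is listening on the mapped port anymore is a plain socket probe, roughly as follows (xxxx is the machine's IP placeholder as above):

import socket

# Try to open a TCP connection to the Docker-mapped port.
# Before the failure this succeeds; afterwards it raises an OSError,
# which matches "the docker mapped port does not exist".
try:
    with socket.create_connection(("xxxx", 9998), timeout=5):
        print("port 9998 is still listening")
except OSError as exc:
    print("port 9998 is no longer reachable:", exc)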