mudler / LocalAI


Endpoint disabled for this model by API configuration 500 #575

Closed · mike-niemand closed this issue 1 year ago

mike-niemand commented 1 year ago

LocalAI version: 1.18.0

Environment, CPU architecture, OS, and Version: Linux 78b4ecbb1b9f 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021 x86_64 GNU/Linux

Describe the bug

I am using the langchain-chroma example.

I have LocalAI running via docker-compose (serving on Fiber). I am trying to create the storage for it. The script creates the /db directory and sends the embeddings to the LocalAI instance, which I can see receives them. LocalAI then throws a 500: 'endpoint disabled for this model by API configuration'.

Expected behavior

Expected to complete successfully.

Logs

```
DEBUG:openai:message='OpenAI API response' path=http://127.0.0.1:8080/v1/engines/text-embedding-ada-002/embeddings processing_ms=None request_id=None response_code=500
INFO:openai:error_code=500 error_message='endpoint disabled for this model by API configuration' error_param=None error_type= message='OpenAI API error received' stream_error=False
WARNING:langchain.embeddings.openai:Retrying langchain.embeddings.openai.embed_with_retry.._embed_with_retry in 4.0 seconds as it raised APIError: endpoint disabled for this model by API configuration
{"error":{"code":500,"message":"endpoint disabled for this model by API configuration","type":""}}
500 {'error': {'code': 500, 'message': 'endpoint disabled for this model by API configuration', 'type': ''}}
{'Date': 'Mon, 12 Jun 2023 11:42:50 GMT', 'Content-Type': 'application/json', 'Content-Length': '98'}
```
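For reference, the failing call can be reproduced directly, outside of langchain (a sketch, assuming the default port; LocalAI also exposes the OpenAI-style /v1/embeddings route):

```sh
# Hit the embeddings endpoint directly, bypassing langchain.
# Assumes LocalAI on the default port 8080 and the model name from the logs.
curl -s http://127.0.0.1:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-ada-002", "input": "test"}'
# With the misconfiguration described in this issue, this returns the same
# {"error":{"code":500,"message":"endpoint disabled for this model by API configuration",...}}
```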

neversettle7 commented 1 year ago

Same issue on macOS 13.4 build 22F66 - LocalAI version 1.18.0

mike-niemand commented 1 year ago

It's odd, because I have spent hours trying to find the issue, and that normally means it's something stupid. That error string doesn't seem to exist anywhere in the documentation. It could be permissions, but I have mine set correctly and have even tried them wide open.

neversettle7 commented 1 year ago

> It's odd, because I have spent hours trying to find the issue, and that normally means it's something stupid. That error string doesn't seem to exist anywhere in the documentation. It could be permissions, but I have mine set correctly and have even tried them wide open.

Yeah, I've been having this issue for 4 days now and I was going crazy, as there's no mention of that error in any documentation, not even in OpenAI's official docs for their Python library. I also tried changing the model, with no improvement; still the same issue.

mike-niemand commented 1 year ago

It's only been 1 day so far for me. I have also posted on Discord, so hopefully someone there will know.

Edit: are you building with docker-compose?

neversettle7 commented 1 year ago

I am not; it's a local build on an M1 Pro following the instructions here. I was following the guide on mudler's blog (the same one you were following) and got stuck on this issue.

mike-niemand commented 1 year ago

Ummm. Hold on a minute, let me try something.

mike-niemand commented 1 year ago

Sorry @neversettle7, I can herewith confirm we are idiots ;) ...but thanks for pointing me in the right direction.

From Mudler's blog:

> Note: The example contains a models folder with the configuration for gpt4all and the embeddings models already prepared. LocalAI will map gpt4all to the gpt-3.5-turbo model, and bert to the embeddings endpoints.

Copy those files into your LocalAI /models directory and it works.
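A minimal sketch of that step (the paths are assumptions; adjust them to your checkout and compose setup):

```sh
# Copy the example's prepared model configs into the directory LocalAI
# serves models from, then restart so they are picked up.
cp -r examples/langchain-chroma/models/* ./models/
docker compose restart   # or: docker-compose restart
```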

mike-niemand commented 1 year ago

Tested.

The solution to getting Mudler's fantastic langchain-chroma example working correctly is to make sure you read his instructions, as detailed above.

neversettle7 commented 1 year ago

I know for sure I'm an idiot and it's likely something really dumb that I'm doing wrong, but I followed the instructions (even repeated them) and I still have the same issue.

This is what I have in my examples/langchain-chroma folder:

```
langchain-chroma
 |- models
  |- bert
  |- completion.tmpl
  |- embeddings.yaml
  |- ggml-gpt4all-j
  |- gpt-3.5-turbo.yaml
  |- gpt4all.tmpl
 |- docker-compose.yml
 |- query.py
 |- README.md
 |- requirements.txt
 |- state_of_the_union.txt
 |- store.py
```

The only difference between mine and mudler's code is in line 21 of store.py, where I add openai_api_base and openai_api_key as parameters to the function:

```python
embedding = OpenAIEmbeddings(openai_api_key="sk-", openai_api_base="http://localhost:8080/v1", model="text-embedding-ada-002")
```

Everything else is exactly the same. I still tried repeating the same steps as he did (exporting the global variables before launching the script) and I still have the same issue.

mike-niemand commented 1 year ago

Nooo.

In one terminal, run Step 1 (Start LocalAI), and remember to copy the models.

It will then be up and running (it takes a while to build), after which it will tell you it's running Fiber v2.46.0 and which IP:PORT it's on.

Then do Step 2 in another terminal. This preps the data and sends it to the AI running in the first terminal. You can then query it from terminal 2 or a UI. Remember to set the OPENAI_API_BASE and OPENAI_API_KEY environment variables here.
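Roughly, the whole flow looks like this (a sketch, assuming the example's defaults rather than anyone's exact setup):

```sh
# Terminal 1: start LocalAI from the example directory
# (assumes the example's docker-compose setup, with model configs in ./models).
docker compose up --build

# Terminal 2: point the OpenAI client at the local instance, then build the store.
export OPENAI_API_BASE=http://localhost:8080/v1
export OPENAI_API_KEY=sk-   # any placeholder works; LocalAI doesn't validate it by default
python3 store.py
```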

neversettle7 commented 1 year ago

Yes, that's exactly what I did and how I got to that issue. I will probably start all over again, just in case I messed something up while attempting to fix the problem. Thanks for the help!

MarcoBoo commented 10 months ago

Hi guys, I have done what was written in the message above (copying all the files) and my situation is the same as @neversettle7's:

```
models
 |- bert
 |- completion.tmpl
 |- embeddings.yaml
 |- ggml-gpt4all-j
 |- gpt-3.5-turbo.yaml
 |- gpt4all.tmpl
```

It's still not working. How did you solve it?

PS: I'm using Docker.

MarcoBoo commented 10 months ago

Currently I am running Step 2 of the tutorial from inside the Docker container (after starting the container in another terminal):

```
root@923cc17fdb2f:/build/examples/langchain-chroma# export OPENAI_API_BASE=http://localhost:8080/v1
root@923cc17fdb2f:/build/examples/langchain-chroma# export OPENAI_API_KEY=sk-
root@923cc17fdb2f:/build/examples/langchain-chroma# python3 store.py
```

This is my .env file (located in /build/examples/langchain-chroma):

```
THREADS=4
CONTEXT_SIZE=512
MODELS_PATH=/models
DEBUG=true
```

This is my docker-compose.yaml file (located in /build/examples/langchain-chroma):

```yaml
version: '3.6'

services:
  api:
    image: quay.io/go-skynet/local-ai:latest
    build:
      context: ../../
      dockerfile: Dockerfile
    ports:
```

I have put all the required files inside the /models folder (including the models, of course), but I still get 'endpoint disabled for this model by API configuration'.

@neversettle7 @mike-niemand @mudler

seanmavley commented 9 months ago

I've tried every possible suggestion I have come across. I'm using the Linux binary in WSL (local-ai-avx-Linux-x86_64).

@MarcoBoo are we the only two people facing this issue? What is everybody else doing right that we're doing wrong?

Did you get yours to run?

It's really frustrating. There are only about 3 posts about this exact error on the entire internet as of this time, which makes me wonder what isn't said in the docs that everybody else magically seems to know but me (well, us few).

rstaessens commented 7 months ago

Hi, did you find a solution? I'm having the same issue. I tried to use Flowise + LocalAI embeddings and get the same endpoint error as you. Contacting the model with a Postman POST request to http://localhost:8080/embeddings returns:

```
{ "error": { "code": 500, "message": "endpoint disabled for this model by API configuration", "type": "" } }
```

while a GET request to http://localhost:8080/readyz returns OK.

mudler commented 7 months ago

I'm sorry, the tutorial is outdated as of today and needs to be updated: you need to set `embeddings: true` in the model configuration file. See the docs for an up-to-date example:

https://localai.io/features/embeddings/#huggingface-embeddings
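For anyone landing here, a minimal sketch of such a config, assuming the bert setup from this example (check the linked docs for the current format and field names):

```sh
# Write a model config with the embeddings endpoint enabled.
# File name, backend, and fields are assumptions based on the docs above.
cat > models/text-embedding-ada-002.yaml <<'EOF'
name: text-embedding-ada-002
backend: bert-embeddings
embeddings: true
parameters:
  model: bert
EOF
```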

Alternatively, you can just use the quickstart:

```sh
docker run -ti -p 8080:8080 localai/localai:v2.9.0-ffmpeg all-minilm-l6-v2
```

which will automatically start an embedding API server.
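Once it's up, the same kind of request that failed earlier in this thread should succeed (a sketch, assuming the model is addressable by the name from the quickstart command):

```sh
# Query the quickstart's embedding model directly; expects a JSON
# response containing an "embedding" vector rather than a 500 error.
curl -s http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "all-minilm-l6-v2", "input": "test"}'
```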