leptonai / search_with_lepton

Building a quick conversation-based search demo with Lepton AI.
https://search.lepton.run
Apache License 2.0
7.54k stars 949 forks source link

Factory `stop_words` setting causing error with openai API endpoints #35

Open jcdiv47 opened 5 months ago

jcdiv47 commented 5 months ago

For those who, for any reason, want to deploy the project with vanilla openai api instead of lepton-hosted LLMs, make sure to limit the number of stop words to less than or equal to 4(per openai api doc)

https://github.com/leptonai/search_with_lepton/blob/432436575eb64028a8b2fae62ca345d83a06b189/search_with_lepton.py#L65-L74

or even not pass the stop argument at all:

https://github.com/leptonai/search_with_lepton/blob/432436575eb64028a8b2fae62ca345d83a06b189/search_with_lepton.py#L607

Otherwise, deploying the project out of the box would be encountering errors similar to the following:

Error code: 400 - {'error': {'message': "'$.stop' is invalid. Please check the API reference: https://platform.openai.com/docs/api-reference.", 'type': 'invalid_request_error', 'param': None, 'code': None}}

With that being said, did some quick tests and played around, decided to stick to lepton-hosted Mixtral for now for the smooth ride.

tsubasakong commented 5 months ago

I temporarily made stop_words an empty list. It worked in my case.

Yangqing commented 5 months ago

Ah that's right, thanks for pointing this out. The openai api only support up to 5 stop words. Also worth noting is that, for certain inference engines, you might see half a stop word sequence produced before it got cut, if the engine does not withhold tokens for stop word detection before streaming.