Open jcdiv47 opened 5 months ago
I temporarily made stop_words
an empty list. It worked in my case.
Ah that's right, thanks for pointing this out. The openai api only support up to 5 stop words. Also worth noting is that, for certain inference engines, you might see half a stop word sequence produced before it got cut, if the engine does not withhold tokens for stop word detection before streaming.
For those who, for any reason, want to deploy the project with vanilla openai api instead of lepton-hosted LLMs, make sure to limit the number of stop words to less than or equal to 4(per openai api doc)
https://github.com/leptonai/search_with_lepton/blob/432436575eb64028a8b2fae62ca345d83a06b189/search_with_lepton.py#L65-L74
or even not pass the
stop
argument at all:https://github.com/leptonai/search_with_lepton/blob/432436575eb64028a8b2fae62ca345d83a06b189/search_with_lepton.py#L607
Otherwise, deploying the project out of the box would be encountering errors similar to the following:
With that being said, did some quick tests and played around, decided to stick to lepton-hosted Mixtral for now for the smooth ride.