[Closed] billsanto closed this issue 1 year ago
Hi, @billsanto! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you reported an issue where Azure rejects tokens sent by OpenAIEmbeddings because it expects strings. You tried modifying the code to send strings instead of tokens, but Azure still complains because it only accepts one input at a time.
Since there hasn't been any activity or comments on this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!
This is still the real issue. Will it be fixed?
System Info
LangChain 0.0.216, OS X 11.6, Python 3.11.
Who can help?
No response
Reproduction
Expected behavior
OpenAIEmbeddings should return embeddings instead of an error.
First, Azure currently accepts only string input, whereas the OpenAI API accepts either token arrays or strings, so the request is rejected because OpenAIEmbeddings sends token arrays. The Azure embeddings API docs confirm this: the request body's input parameter is of type string: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/reference#embeddings
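To make the mismatch concrete, here is a hypothetical illustration of the two request bodies (model name and token values are made up for the example):

```python
# OpenAIEmbeddings batches tokenized input, which the OpenAI endpoint accepts:
openai_style_payload = {
    "model": "text-embedding-ada-002",
    "input": [[9906, 1917], [16533, 11879]],  # lists of token IDs
}

# Per the Azure reference linked above, the Azure embeddings endpoint expects
# "input" to be a single string, so a tokenized batch like the one above is
# rejected:
azure_style_payload = {
    "input": "Hello world",
}
```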
Second, after modifying openai.py to send strings, Azure complains that it accepts only one input at a time; in other words, it does not accept batches of strings (nor batches of tokens, even if it accepted tokens). The for loop increment was therefore modified to send one decoded batch of tokens (that is, the original string chunk) per request.
Modifying embeddings/openai.py with:
and re-running the code:
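The patch itself was not captured in this thread, but the idea described above can be sketched as follows. All names here are stand-ins for illustration, not the actual langchain source: each token chunk is decoded back to its original string and sent in its own request, since Azure takes one string at a time.

```python
def azure_embed_one(text: str) -> list[float]:
    """Stand-in for the Azure embeddings call, which takes a single string."""
    assert isinstance(text, str)
    return [float(len(text))]  # dummy vector for illustration


def embed_chunks_individually(chunks: list[str]) -> list[list[float]]:
    """Send one request per decoded chunk instead of one batched request."""
    embeddings: list[list[float]] = []
    for chunk in chunks:  # loop steps one chunk at a time: one API call each
        embeddings.append(azure_embed_one(chunk))
    return embeddings
```

Calling `embed_chunks_individually(["first chunk", "second"])` returns one vector per chunk, mirroring what the batched call would have produced in a single request against OpenAI.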
The following change was also made to openai.py a few lines later, although this part is untested:
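That second change was also not captured here. For context, the surrounding code in langchain's length-safe embedding path combines the per-chunk vectors for each text into a single embedding via a length-weighted average followed by normalization. A self-contained sketch of that combination step (an assumption about the surrounding logic, not the missing patch itself):

```python
import math


def combine_chunk_embeddings(
    chunk_vectors: list[list[float]], chunk_lengths: list[int]
) -> list[float]:
    """Length-weighted average of per-chunk vectors, normalized to unit length."""
    dim = len(chunk_vectors[0])
    total = sum(chunk_lengths)
    # Weight each chunk's vector by how many tokens the chunk contributed.
    averaged = [
        sum(vec[d] * n for vec, n in zip(chunk_vectors, chunk_lengths)) / total
        for d in range(dim)
    ]
    norm = math.sqrt(sum(x * x for x in averaged)) or 1.0
    return [x / norm for x in averaged]
```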