infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Bug]: Error "Unsupported data type" when embedding using Azure OpenAI #2840

Open rendybjunior opened 1 month ago

rendybjunior commented 1 month ago

Is there an existing issue for the same bug?

Branch name

main

Commit ID

main

Other environment information

Debian VM

Actual behavior

Embedding a file fails with an error. The cause seems to be that the Azure OpenAI base URL is not really a base URL: the setting does not work unless you enter the full URL for chat completion. However, Azure OpenAI uses a different URL for each model and task.

Expected behavior

Embedding completes successfully without an error.

Steps to reproduce

Set up an Azure OpenAI model, and set both the LLM and the embedding model to Azure OpenAI.
Go to Knowledge base, upload any PDF file, and observe the error "Unsupported data type".

Additional information

No response

KevinHuSh commented 1 month ago

So, what kind of file did you upload, and which chunking method did you choose?

rendybjunior commented 1 month ago

> So, what kind of file did you upload, and which chunking method did you choose?

I uploaded a PDF file and used the default chunking method ("General").

I believe the root cause is the URL path entered in the settings. Azure OpenAI uses a different path and a different model name for each usage, so a single URL setting can't serve multiple purposes (LLM, embedding, etc.). There has to be one URL per model, because each model has its own "Deployment Name".

Since the configuration is validated only against chat completion, embedding doesn't work.
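
To illustrate the point about per-deployment URLs, here is a minimal sketch of how Azure OpenAI REST endpoints are usually composed. The resource name, deployment names, and API version below are hypothetical placeholders, and `azure_openai_url` is a helper written for this example; it is not RAGFlow code.

```python
from urllib.parse import urlencode


def azure_openai_url(resource: str, deployment: str, operation: str, api_version: str) -> str:
    """Build the REST endpoint for one Azure OpenAI deployment.

    The deployment name is part of the path, so chat and embedding
    models end up behind different URLs.
    """
    base = f"https://{resource}.openai.azure.com"
    path = f"/openai/deployments/{deployment}/{operation}"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


# Hypothetical resource and deployment names, for illustration only.
chat_url = azure_openai_url("my-resource", "my-gpt-deployment", "chat/completions", "2024-02-01")
embed_url = azure_openai_url("my-resource", "my-embedding-deployment", "embeddings", "2024-02-01")

print(chat_url)
print(embed_url)
```

The two URLs differ in both the deployment segment and the operation segment, which would explain why a full chat completion URL stored as the "base URL" cannot also serve embedding requests.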

KevinHuSh commented 1 month ago

Got it.