Open kouskouss opened 1 week ago
I found a similar closed issue regarding a problem with the Hugging Face embedding model and LlamaIndex versions [1]. Although it doesn't directly address the `KeyError: 0` exception you're encountering, it might be related to compatibility issues between different versions of the library.
For your specific issue, ensure that the request payload format matches the structure expected by the `TextEmbeddingsInference` class, which is a JSON object like this (where `truncate` is a boolean):

```json
{
  "inputs": ["text1", "text2", ...],
  "truncate": true
}
```
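As a sanity check, that request body can be built by hand and compared against what your client actually sends; a minimal sketch (the texts below are placeholders, not values from your setup):

```python
import json

# Minimal sketch of the request body a text-embeddings-inference server
# expects; the texts here are placeholders.
def build_tei_payload(texts, truncate=True):
    return json.dumps({"inputs": list(texts), "truncate": truncate})

payload = build_tei_payload(["text1", "text2"])
# payload == '{"inputs": ["text1", "text2"], "truncate": true}'
```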
Make sure that the response from the Hugging Face inference endpoint is structured correctly and that all necessary keys are present to avoid `KeyError` exceptions. Additionally, verify that you are using the correct dependencies and library versions required for the `TextEmbeddingsInference` class to function properly with Hugging Face endpoints:

- `python`: >=3.8.1,<4.0
- `llama-index-core`: ^0.11.0
- `huggingface-hub` (with the `inference` extra): >=0.19.0
- `sentence-transformers`: >=2.6.1
[2][3][4]. If the issue persists, you might want to check the specific configuration settings or parameters required for the `TextEmbeddingsInference` class, such as `model_name`, `token`, `timeout`, and others, to ensure they are correctly set up [5][6].
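One plausible source of the `KeyError: 0` is the response shape: a text-embeddings-inference server returns a bare JSON array of vectors, while a generic Inference Endpoint may return a JSON object (for example an error payload), which breaks client code that indexes the result like a list. A small diagnostic sketch, assuming a TEI-style response:

```python
import json

# A text-embeddings-inference server returns a bare JSON array of vectors,
# e.g. [[0.1, 0.2, ...], ...]. An endpoint that instead returns an object
# (such as an error payload) will break code that indexes the result
# positionally -- which can surface as KeyError: 0.
def looks_like_tei_response(body: str) -> bool:
    data = json.loads(body)
    return isinstance(data, list) and all(
        isinstance(vec, list) and all(isinstance(x, (int, float)) for x in vec)
        for vec in data
    )
```

Running this against the raw endpoint response quickly tells you whether the shape matches what the client indexes into.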
@kouskouss I think you used the wrong class? This class is meant for https://github.com/huggingface/text-embeddings-inference

You probably meant to use `HuggingFaceInferenceAPIEmbedding`?

```python
embed_model = HuggingFaceInferenceAPIEmbedding(model_name=API_URL, token=key)
```
@logan-markewich I tried this class too and then I tried to call:

```python
async def get_embedding():
    text = "Test"
    embedding = await embed_model.aget_text_embedding(text)
    print(embedding)

# Run the async function
await get_embedding()
```
and it returned an array of NaN values, so it did not work either. My understanding is therefore that text-embeddings-inference does not support embedding models deployed on Hugging Face Inference Endpoints.
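When a call does return, a quick guard can confirm the vector is usable before it is wired into the rest of the pipeline; this is a generic check, not part of llama-index:

```python
import math

# Reject embeddings that are empty or contain NaN values, which indicate
# the endpoint returned a malformed vector rather than a real embedding.
def is_valid_embedding(vector):
    return len(vector) > 0 and not any(
        isinstance(x, float) and math.isnan(x) for x in vector
    )
```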
Question Validation
Question
We recently created an inference endpoint on huggingface, which we use to run a finetuned embedding model on their GPU service. While the endpoint itself functions as intended, the request format it provides is difficult to integrate with our existing codebase. For a more seamless integration, we attempted to utilize the TextEmbeddingsInference class from llama_index.embeddings.text_embeddings_inference, as it would allow for a more compatible fit with our current implementation. Specifically, we created the object using the following setup:
```python
embed_model = TextEmbeddingsInference(
    model_name=model,
    base_url=API_URL,
    timeout=60,
    auth_token=key,
)
```
However, when we call `embed_model.get_text_embedding("Test")`, we encounter a `KeyError: 0` exception. Has anybody met the same issue, or can anyone provide guidance or suggest corrections to our approach? Any assistance in resolving this would be greatly appreciated, as using `TextEmbeddingsInference` directly would substantially streamline our integration process. Thank you in advance.