Closed · caixiaomao closed this 1 month ago
Thanks so much for reporting this. I've assigned it to the 0.8.0 release.
Hi @caixiaomao, can you please elaborate on this issue?
For example, if you need the all-MiniLM-L6-v2 ONNX model, the latter is already generated and used by default by the TransformersEmbeddingClient.
So you can use it as a Bean:
@Bean
public EmbeddingClient transformersEmbeddingClient() {
return new TransformersEmbeddingClient();
}
Or create it manually, in which case you have to explicitly call afterPropertiesSet():
TransformersEmbeddingClient embeddingClient = new TransformersEmbeddingClient();
embeddingClient.afterPropertiesSet();
(Have a look at the TransformersEmbeddingClientTests.)
In both cases the all-MiniLM-L6-v2 model is loaded.
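For instance, a minimal usage sketch (assuming the 0.8.x EmbeddingClient API, where embed(String) returns a List<Double>; all-MiniLM-L6-v2 produces 384-dimensional vectors):
TransformersEmbeddingClient embeddingClient = new TransformersEmbeddingClient();
embeddingClient.afterPropertiesSet();
// embed a single text; expect a 384-element vector for all-MiniLM-L6-v2
List<Double> embedding = embeddingClient.embed("Hello, Spring AI");
System.out.println(embedding.size());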
If you are building your own ONNX model, the documentation suggests, and I have successfully used, the following command:
optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 onnx-output-folder
I'm not sure what your parameters --framework pt --monolith --task feature-extraction are supposed to do.
@tzolov Thank you for your reply! If the above parameters are not used, the following error is reported during model conversion. Maybe this is because I am using a local model? RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually.
I have reviewed the official documentation and found the following content:
This exports an ONNX graph of the checkpoint defined by the --model argument. As you can see, the task was automatically detected. This was possible because the model was on the Hub.
For local models, providing the --task argument is needed or it will default to the model architecture without any task specific head.
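So for a local model the export command needs the task spelled out, along these lines (a sketch; ./local-model-dir stands for a hypothetical local checkpoint directory):
optimum-cli export onnx --model ./local-model-dir --task feature-extraction onnx-output-folder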
@caixiaomao I'm not that familiar with Hugging Face Optimum, nor does Spring AI depend on it. Perhaps you will find more help on how to use Optimum/ONNX on the Hugging Face Optimum forum?
For the purpose of the TransformersEmbeddingClient, you should be able to use any transformer model exported into the ONNX Runtime format. There are many different ways and tools for ONNX exporting; optimum-cli is one of them that makes it easy to generate ONNX from the models available on Hugging Face's Hub.
Having said this, should you use the Hub model reference --model sentence-transformers/all-MiniLM-L6-v2 rather than pointing to a local copy?
Also, I find Netron useful for exploring ONNX graphs visually.
Thank you very much, I will give it a try.
FWIW, I'm seeing the same error with the 0.8.1-SNAPSHOT
Minimal steps to reproduce:
1) Download the model (to resources):
optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 all-MiniLM-L6-v2
2) Try to create the client:
TransformersEmbeddingClient embeddingClient = new TransformersEmbeddingClient();
embeddingClient.setTokenizerResource("classpath:/all-MiniLM-L6-v2/tokenizer.json");
embeddingClient.setModelResource("classpath:/all-MiniLM-L6-v2/model.onnx");
embeddingClient.afterPropertiesSet();
==> fails with error
Caused by: java.lang.IllegalArgumentException: The generative output names doesn't contain expected: last_hidden_state
at org.springframework.util.Assert.isTrue(Assert.java:111)
at org.springframework.ai.transformers.TransformersEmbeddingClient.afterPropertiesSet(TransformersEmbeddingClient.java:187)
The model is loaded correctly, and outputs are visible in the debugger, but the outputs do not match what is expected (the model has "token_embeddings" and "sentence_embedding" instead of the expected "last_hidden_state").
Using the default model works:
TransformersEmbeddingClient embeddingClient = new TransformersEmbeddingClient();
embeddingClient.afterPropertiesSet();
This seems to load the model from https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-transformers/src/main/resources/onnx/all-MiniLM-L6-v2/model.onnx, which does indeed have the expected output. A quick look in Netron reveals quite a few other differences as well, which may indicate that the root cause is a difference in the optimum version or related environment. Here's what I used:
- `optimum` version: 1.17.1
- `transformers` version: 4.38.1
- Platform: macOS-14.3.1-arm64-arm-64bit
- Python version: 3.9.6
- Huggingface_hub version: 0.20.3
- PyTorch version (GPU?): 1.13.1 (cuda availabe: False)
- Tensorflow version (GPU?): 2.14.0 (cuda availabe: True)
Hi everyone, I encountered the same error while loading a model other than the default. Spring AI version used: 0.8.1.
I exported the sentence-transformers/all-mpnet-base-v2 model into ONNX format using the optimum-cli as specified in the documentation:
optimum-cli export onnx --model sentence-transformers/all-mpnet-base-v2 onnx-output-folder
Create the embedding client:
@Bean("transformersEmbeddingClient")
public EmbeddingClient embeddingClient() {
return new TransformersEmbeddingClient();
}
Add the below properties to the application.properties file; my model files are placed on the classpath:
spring.ai.embedding.transformer.tokenizer.uri=classpath:/embeddingModel/tokenizer.json
spring.ai.embedding.transformer.onnx.modelUri=classpath:/embeddingModel/model.onnx
On startup, I am getting the below error.
Caused by: java.lang.IllegalArgumentException: The generative output names don't contain expected: last_hidden_state
at org.springframework.util.Assert.isTrue(Assert.java:111) ~[spring-core-6.1.5.jar:6.1.5]
at org.springframework.ai.transformers.TransformersEmbeddingClient.afterPropertiesSet(TransformersEmbeddingClient.java:202) ~[spring-ai-transformers-0.8.1.jar:0.8.1]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1833) ~[spring-beans-6.1.5.jar:6.1.5]
at org.springfra
Edit 1: Somehow I managed to get rid of this problem by manually setting the model output name:
@Bean("transformersEmbeddingClient")
public EmbeddingClient embeddingClient() throws Exception {
TransformersEmbeddingClient embeddingClient = new TransformersEmbeddingClient();
embeddingClient.setTokenizerResource("classpath:/embeddingModel/tokenizer.json");
embeddingClient.setModelResource("classpath:/embeddingModel/model.onnx");
embeddingClient.setModelOutputName("sentence_embedding");
embeddingClient.afterPropertiesSet();
return embeddingClient;
}
After this, I was able to run the server but started getting the below error on calling the call() method of EmbeddingClient:
class [[F cannot be cast to class [[[F ([[F and [[[F are in module java.base of loader 'bootstrap')
at org.springframework.ai.transformers.TransformersEmbeddingClient.call(TransformersEmbeddingClient.java:278) ~[spring-ai-transformers-0.8.1.jar:0.8.1]
at org.springframework.ai.transformers.TransformersEmbeddingClient.embed(TransformersEmbeddingClient.java:232) ~[spring-ai-transformers-0.8.1.jar:0.8.1]
at org.springframework.ai.transformers.TransformersEmbeddingClient.embed(TransformersEmbeddingClient.java:212) ~[spring-ai-transformers-0.8.1.jar:0.8.1]
Does it mean the TransformersEmbeddingClient implementation is specific to the default model, i.e. sentence-transformers/all-MiniLM-L6-v2?
Edit 2: After more research, I got to know that the all-mpnet-base-v2 model has two outputs, token_embeddings and sentence_embedding. The actual embeddings with the correct tensors are present in the token_embeddings output: it is a 3D batch × tokens × dimension tensor that the client can pool, while sentence_embedding is 2D (batch × dimension), which presumably explains the float[][] ([[F) vs float[][][] ([[[F) cast error above. So on changing the code as shown below, I can generate the embeddings with the model of my choice.
@Bean("transformersEmbeddingClient")
public EmbeddingClient embeddingClient() throws Exception {
TransformersEmbeddingClient embeddingClient = new TransformersEmbeddingClient();
embeddingClient.setTokenizerResource("classpath:/embeddingModel/tokenizer.json");
embeddingClient.setModelResource("classpath:/embeddingModel/model.onnx");
embeddingClient.setModelOutputName("token_embeddings");
embeddingClient.afterPropertiesSet();
return embeddingClient;
}
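As a quick sanity check that the override works (a hypothetical usage sketch; all-mpnet-base-v2 produces 768-dimensional vectors):
List<Double> embedding = embeddingClient.embed("Hello, world");
System.out.println(embedding.size()); // expect 768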
I think we should put this in the documentation: how to identify the model output name and override it by using the above code.
One way to check the model output names is through the logs of the TransformersEmbeddingClient class:
logger.info("Model output names: " + onnxModelOutputs.stream().collect(Collectors.joining(", ")));
The second way is to load the model in Netron and analyze the model outputs.
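A third way would be to query the session directly through the ONNX Runtime Java API that spring-ai-transformers builds on (a minimal sketch, assuming a local path to the exported model):
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtSession;

public class PrintOutputNames {
    public static void main(String[] args) throws Exception {
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        // open the exported model and list the output names it declares
        try (OrtSession session = env.createSession("embeddingModel/model.onnx", new OrtSession.SessionOptions())) {
            System.out.println("Model output names: " + session.getOutputNames());
        }
    }
}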
Same problem here. This last solution by @kush-298 works for me. Here is my full code: https://github.com/diegopacheco/ai-playground/tree/main/pocs/spring-ai-ONNX-transformers-all-MiniLM-L6-v2/project
@kush-298 did you understand why sentence_embedding is not valid? I want to work with a model that has only a sentence_embedding output. Would you have any leads on how to make it work?
I'm having the same issue
We have documented what to do in this case here: https://docs.spring.io/spring-ai/reference/api/embeddings/onnx.html#_errors_and_special_cases
If there is anything else you would like to see added to the documentation, please reopen the issue.
Bug description
According to the official documentation at https://docs.spring.io/spring-ai/reference/api/embeddings/onnx.html, the following error is reported:
Model conversion command:
Environment
spring-ai.version: 0.8.0-SNAPSHOT
jdk: 21