Closed: guillaume-be closed this issue 2 years ago.
Haha, it looks like a bug in the multiple inputs and multiple symbols corner :)
Could you give me your .onnx model so I can try to find out what's wrong?
Hello,
Thank you for the prompt response. I have uploaded the model at https://drive.google.com/file/d/1-eolYFHieS3v7JAC_dy_n-z1dPH2m0qc/view?usp=sharing (this was converted to ONNX using the weights shared under the Apache 2.0 license at https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Hello, a few words to tell you I have not forgotten this issue. I had good hope I was close to making it work with some fixes done for another model, but I'm afraid it's not so. This BERT does relatively complicated things around shape computation and pushes tract's shape predictions to their limit... I need to think about how to go this extra mile, whether I can push the current code just enough to handle your model (and the ones doing the same kind of things) or if I need a bigger refactoring of the shape prediction... It's been a while, so if you've moved on and don't care anymore, I will not be offended :)
Hello @kali and thank you very much for the update. I understand the difficulty of running transformers models with ONNX. There seems to have been one successful attempt at running these models in Rust, although relying on onnxruntime: https://github.com/haixuanTao/onnxruntime-rs. Maybe this provides some hints as to what may help here.
I am still very much interested in seeing such capabilities in Tract. This library seems to be one of the best maintained for ONNX inference in the Rust ecosystem, and I would like to make it the platform of choice for the ONNX capabilities of the library I am working on. I have seen very promising speed-ups from implementing the post-processing pipeline of NLP models in Rust using PyTorch bindings (see here if interested). I expect the performance benefits offered by ONNX would synergize well with these improvements for high-performance text generation.
Thanks! Happy to see you're not giving up on us. And I'm not giving up, we'll get there... eventually.
I think #689 may be it :)
Hello @kali,
Thank you for looking into this and proposing this fix, and apologies for not getting back to you earlier. I have tested the code given in this issue again and I am still facing the same error - is this an issue on my end?
Damn. Just had a quick look at your code; can you try calling `into_optimized()` before `into_runnable()`? Because I'm pretty sure I checked it was working. If that's not it, I'll set up the test case again.
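For reference, a minimal sketch of the loading pipeline with `into_optimized()` inserted before `into_runnable()`; the model path, the fixed (batch, sequence) shape, and the `with_input_fact` pinning are placeholders/assumptions, not the exact code from the report:

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("distilbert-sst.onnx")? // placeholder path
        // pin both i64 inputs to a concrete shape so the optimizer can
        // resolve the symbolic batch/sequence dimensions
        .with_input_fact(0, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 128)))?
        .with_input_fact(1, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 128)))?
        .into_optimized()?
        .into_runnable()?;
    // ... build input_ids / attention_mask tensors and call model.run(...)
    Ok(())
}
```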
This may be because I generated the model again using updated utilities from Huggingface:
```
python -m transformers.onnx --model=distilbert-base-uncased-finetuned-sst-2-english --feature=sequence-classification distilbert-sst-onnx
```
I then created and tested an optimized version of this model with:

```python
from onnxruntime.transformers import optimizer

optimized_model = optimizer.optimize_model(path_to_model, model_type='bert', num_heads=12, hidden_size=768)
optimized_model.save_model_to_file(path_to_optimized_model)
```
I have uploaded both files for your reference: non-optimized and optimized
For the non-optimized model: I just tried adding `into_optimized()`, and now run into the following issue:

```
Error: Translating node #43 "Slice_6" StridedSlice ToTypedTranslator
```

Note that the optimization seems to be much slower than with the Python onnxruntime library, so I am not sure the operations are equivalent.
For the optimized version: not calling `into_optimized` leads to the same error described in this issue -- does this mean that when using tract I would need to re-optimize at each model load, and won't be able to load a model optimized with onnxruntime? Calling `into_optimized` fails with:

```
Error: Failed analyse for node #95 "EmbedLayerNormalization_0" Unimplemented(EmbedLayerNormalization)
Caused by: Wrong number of outputs. Op says 1, node says 2.
```
Hey, we have an Albert example here that worked a few months ago: https://github.com/sonos/tract/tree/main/examples/pytorch-albert-v2
`EmbedLayerNormalization`? I can't find it in the ONNX operators list (see https://github.com/onnx/onnx/blob/main/docs/Operators.md), so I'm not sure what is happening there.
Hello @kali,
I believe this may be because the optimization is done using onnxruntime (see https://github.com/microsoft/onnxruntime/blob/master/docs/ContribOperators.md#com.microsoft.EmbedLayerNormalization). Changing the optimization level before serializing the model, I now see both the optimized and non-optimized models exhibiting the same behaviour:

```
Error: Translating node #43 "Slice_6" StridedSlice ToTypedTranslator
```

I cannot find this `StridedSlice` operator in either the ONNX operators list or the onnxruntime custom operators.
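For completeness, a minimal sketch (with placeholder file names) of serializing a model with a lower graph optimization level through onnxruntime, which should keep com.microsoft contrib operators out of the graph:

```python
import onnxruntime as ort

sess_options = ort.SessionOptions()
# ORT_ENABLE_BASIC keeps only basic, provider-independent optimizations
# and does not fuse com.microsoft contrib operators into the graph
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_BASIC
# onnxruntime writes the optimized graph to this path
sess_options.optimized_model_filepath = "distilbert-sst-basic-opt.onnx"

# creating the session triggers the optimization pass and saves the file
ort.InferenceSession("distilbert-sst.onnx", sess_options)
```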
Ok, so it looks like Microsoft is doing the old IE CSS trick again, adding stuff to a standard as a way to lock people in. That's just great. And there are quite a bunch of them too. Obviously I cannot support all these com.microsoft operators right away, so please don't optimize with onnxruntime for now.
The `StridedSlice` is basically the `Slice` operator from ONNX (the name comes from TensorFlow). I just reproduced the problem; I'm having a look.
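For anyone following along, ONNX `Slice` (opset 10 and later) follows numpy basic slicing semantics, steps included; a tiny illustration with made-up values:

```python
import numpy as np

x = np.arange(10)

# ONNX Slice(data=x, starts=[1], ends=[8], axes=[0], steps=[2])
# computes the same thing as numpy basic slicing:
print(x[1:8:2])  # [1 3 5 7]
```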
Hello,
I am looking into Tract as an ONNX runtime for language models, with the goal of eventually integrating it into https://github.com/guillaume-be/rust-bert. I have exported a BERT-like model using the new utilities offered by transformers.onnx, and I am able to run the model without issue using onnxruntime in Python. Loading the model with Tract works, but it unfortunately fails when running it on a selected input:

where `attention_mask` is the second input. This is my first try at using the Tract API - am I missing something?
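A minimal sketch of loading and running such a two-input model with tract; the model path, sequence length, and token ids below are made-up placeholders:

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("distilbert-sst.onnx")? // placeholder path
        // input 0: input_ids, input 1: attention_mask, both i64 of shape (batch, seq)
        .with_input_fact(0, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 8)))?
        .with_input_fact(1, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 8)))?
        .into_optimized()?
        .into_runnable()?;

    // made-up token ids for a tokenized sentence, padded to length 8
    let input_ids: Tensor = tract_ndarray::arr2(&[[101i64, 2023, 3185, 2001, 2307, 102, 0, 0]]).into();
    let attention_mask: Tensor = tract_ndarray::arr2(&[[1i64, 1, 1, 1, 1, 1, 0, 0]]).into();

    let outputs = model.run(tvec!(input_ids.into(), attention_mask.into()))?;
    println!("{:?}", outputs[0]); // logits for the two sentiment classes
    Ok(())
}
```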