loretoparisi opened this issue 1 year ago
can you please provide the reproducible code and output you are getting?
does it work with python ort or are you facing the issue only with the js version of ort?
I'm facing this error in transformers-js, which is using the ONNX-converted model here.
> can you please provide the reproducible code and output you are getting?
Yes, I will fork the original repo and apply the changes.
@Ki6an here is my fork where you can try it.
This will install the app and convert flan-t5-small to ONNX:
```bash
git clone https://github.com/loretoparisi/transformers-js.git
cd transformers-js/
pip install transformers
python tools/convert_model.py
```
You will then find the quantized models in the `/models` folder.
To run it:

```bash
make demo
make run
```
It now points to http://localhost:8152/?model_id=google/flan-t5-small
The tokenizer code is located here.
I have converted google/flan-t5-small using the fastT5 `export_and_get_onnx_model` method with quantization enabled by default, getting the quantized ONNX models:
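For reference, a minimal sketch of that conversion step, assuming fastT5 is installed (`pip install fastt5`):

```python
# Minimal sketch of the fastT5 conversion step, assuming `pip install fastt5`.
from fastT5 import export_and_get_onnx_model

# Quantization is enabled by default, so this exports the encoder/decoder ONNX
# graphs plus their quantized variants (by default into a ./models folder).
model = export_and_get_onnx_model("google/flan-t5-small")
```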
Anyway, when loading the model with an ONNX Runtime `ort.InferenceSession`, the generated tokens look strange.
Using the same process for t5-small, it works fine.
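To help narrow down whether the problem is in the exported model itself or only in the JS runtime, here is a rough Python ONNX Runtime sanity check I could run; this is only a sketch, and the quantized encoder file name, its input names, and the `models/` path are assumptions about what fastT5 wrote out:

```python
# Hypothetical sanity check of the exported encoder with the Python ONNX Runtime.
# The file name and input names below are assumptions about the fastT5 export.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
enc = tokenizer("translate English to German: Hello world", return_tensors="np")

session = ort.InferenceSession("models/flan-t5-small-encoder-quantized.onnx")
outputs = session.run(
    None,
    {
        "input_ids": enc["input_ids"].astype(np.int64),
        "attention_mask": enc["attention_mask"].astype(np.int64),
    },
)

# If the encoder hidden states look sane here, the issue is more likely on the JS side.
print(outputs[0].shape)
```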