felladrin opened this issue 1 year ago
I think this is due to a recent update of optimum - which will be fixed in the next release. One problem I am aware of, however, is that trying to convert large models (e.g., 3b param models) will produce an external data file, which is currently not supported by Transformers.js.
Are you able to export with optimum directly (which is what our conversion script uses behind the scenes)? Your command should look something like:
optimum-cli export onnx -m lmsys/fastchat-t5-3b-v1.0 output
Ah, nice! Using that command made it generate some more files. But the command exited with the following exception:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/homebrew/bin/optimum-cli", line 8, in <module>
sys.exit(main())
File "/opt/homebrew/lib/python3.10/site-packages/optimum/commands/optimum_cli.py", line 163, in main
service.run()
File "/opt/homebrew/lib/python3.10/site-packages/optimum/commands/export/onnx.py", line 219, in run
main_export(
File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/__main__.py", line 366, in main_export
raise Exception(
Exception: An error occured during validation, but the model was saved nonetheless at output. Detailed error: [ONNXRuntimeError] : 1 : FAIL : Load model from output/decoder_model_merged.onnx failed:/Users/runner/work/1/s/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto &&, const onnxruntime::PathString &, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList *, const logging::Logger &, const onnxruntime::ModelOptions &) Unsupported model IR version: 9, max supported IR version: 8
List of files generated (including encoder_model.onnx, which was not being generated by the conversion script):
added_tokens.json
config.json
decoder_model.onnx
decoder_model.onnx_data
decoder_model_merged.onnx
decoder_model_merged.onnx_data
decoder_with_past_model.onnx
decoder_with_past_model.onnx_data
encoder_model.onnx
encoder_model.onnx_data
generation_config.json
special_tokens_map.json
spiece.model
tokenizer.json
tokenizer_config.json
Thanks 👍 The first error message you received looks like a bug with optimum. If you'd like, you can raise an issue on their repo.
For now (even if you did get conversion working), the model is currently just slightly too large to run with the current version of Transformers.js, which doesn't support the external data format (.onnx_data). I will hopefully get around to adding support for it in the coming week or so, but I am just prioritizing other things first.
Thanks for your work on this lib!
The pipeline didn't work for lmsys/fastchat-t5-3b-v1.0, although all the files were there. When running Transformers.js in Node, it exits with the following message:
Error: no available backend found. ERR:
Error: Failed to load model with error: /Users/runner/work/1/s/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto &&, const onnxruntime::PathString &, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList *, const logging::Logger &, const onnxruntime::ModelOptions &) Unsupported model IR version: 9, max supported IR version: 8
But I'm happy it's close to working! The other T5 model (LaMini-Flan-T5-783M) is working great, and it's already pretty good for its purpose.
Right, it's failing because the .onnx file does not contain all the model parameters. Those parameters are stored in .onnx_data.
This is due to a limitation with protobuf, which has a 2GB limit. See here for more information.
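For illustration only, the idea behind the external data format can be sketched in plain Python: the large tensor bytes go into a sidecar file, and the main (protobuf-limited) file keeps only name/offset/length references to them. This is a toy model of the mechanism, not the real ONNX implementation:

```python
import json
import os
import tempfile

def save_with_sidecar(tensors: dict[str, bytes], model_path: str) -> None:
    """Toy version of ONNX external data: raw tensor bytes are appended to a
    sidecar '<model>_data' file; the main file stores only references."""
    refs, offset = {}, 0
    with open(model_path + "_data", "wb") as data_file:
        for name, raw in tensors.items():
            data_file.write(raw)
            refs[name] = {"offset": offset, "length": len(raw)}
            offset += len(raw)
    with open(model_path, "w") as model_file:
        json.dump(refs, model_file)  # stays tiny no matter how big the tensors get

def load_tensor(model_path: str, name: str) -> bytes:
    """Resolve a reference from the main file and read the bytes it points at."""
    with open(model_path) as model_file:
        ref = json.load(model_file)[name]
    with open(model_path + "_data", "rb") as data_file:
        data_file.seek(ref["offset"])
        return data_file.read(ref["length"])

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "model.onnx")
    save_with_sidecar({"w1": b"\x00" * 8, "w2": b"\x01" * 4}, path)
    print(load_tensor(path, "w2"))  # -> b'\x01\x01\x01\x01'
```

This is also why a loader has to resolve the sidecar path correctly: the main file alone contains no weights, only pointers into the data file.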
This also means optimum's validation code doesn't load the external data format (which also appears to be a bug).
However, I do intend to add support for the external data format :) Just got a lot on my plate right now haha.
Updated title to be a feature request for the external data file format (which is used for models larger than 2GB).
@xenova Does transformers.js support models larger than 2 GB yet?
Any news on this one? I am trying to load bge-m3 (full, not quantized, dtype='fp32') using the v3 branch (onnxruntime-1.17.x) in nodejs and I am getting the following error:
Error: Deserialize tensor embeddings.word_embeddings.weight failed.GetFileLength for ./model.onnx_data failed:Invalid fd was supplied: -1
(which is weird, since the model.onnx_data file is there).
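When debugging errors like the one above, it can help to rule out the basics first. The `./model.onnx_data` in the message suggests (my assumption, not confirmed) that the runtime may resolve the sidecar path relative to the current working directory rather than the model file's directory. A small stdlib sketch of that sanity check:

```python
from pathlib import Path

def check_external_data(model_path: str) -> list[str]:
    """List problems that could make the sidecar file unloadable: missing
    next to the model, empty, or absent from the current working directory
    (the './model.onnx_data' in the error hints at a CWD-relative lookup)."""
    model = Path(model_path)
    sidecar = model.with_name(model.name + "_data")
    problems = []
    if not sidecar.is_file():
        problems.append(f"{sidecar} does not exist next to the model")
    elif sidecar.stat().st_size == 0:
        problems.append(f"{sidecar} is empty")
    if not Path(sidecar.name).is_file():
        problems.append(f"{sidecar.name} not found in the current working directory")
    return problems
```

Running `check_external_data("path/to/model.onnx")` and fixing anything it reports (e.g. by starting Node from the model's directory) would at least rule out a path-resolution mismatch.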
Pinging for any updates on this too! I am trying to load Xenova/TinyLlama-1.1B-Chat-v1.0 using the v3 branch (onnxruntime-1.17.x) and there is a decoder_model_merged.onnx_data file (tried renaming it to model.onnx_data to no avail) but I still get this.
Error: ERROR_CODE: 1, ERROR_MESSAGE: Deserialize tensor onnx::MatMul_7210 failed.Failed to load external data file "./model.onnx_data", error: Module.MountedFiles is not available.
Not really a bug with Transformers.js, but with the conversion script.
Got an error when trying to convert lmsys/fastchat-t5-3b-v1.0 with the text2text-generation-with-past task. Using the task text2text-generation works fine, though. Am I missing something?
And is there a way to run the model without it being created with -with-past? Currently, when I run
const pipe = await pipeline("text2text-generation", "lmsys/fastchat-t5-3b-v1.0");
it triggers an error.
Files inside models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/ are the following:
How to reproduce
Run:
Expect an output like this:
Expected behavior
Was expecting it to work, the same way
python -m scripts.convert --model_id lmsys/fastchat-t5-3b-v1.0 --from_hub --quantize --task text2text-generation
worked.
Environment