Ki6an / fastT5

⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.

No such file or directory: '/content/encoder.embed_tokens.weight' #65

Open alexfdo opened 1 year ago

alexfdo commented 1 year ago
from fastT5 import export_and_get_onnx_model
from transformers import AutoTokenizer

model_name = 'google/mt5-xl'
saved_model_path = '/content/drive/MyDrive/Colab Notebooks Work/onnx_models/'
model = export_and_get_onnx_model(model_name, custom_output_path=saved_model_path)
Exporting to onnx... |################################| 3/3
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-a31804ad02f8> in <module>
      4 model_name = 'google/mt5-xl'
      5 saved_model_path = '/content/drive/MyDrive/Colab Notebooks Work/onnx_models/'
----> 6 model = export_and_get_onnx_model(model_name,
      7                                   custom_output_path=saved_model_path)

/usr/local/lib/python3.8/dist-packages/fastT5/onnx_models.py in export_and_get_onnx_model(model_or_model_path, custom_output_path, quantized)
    217     if quantized:
    218         # Step 2. (recommended) quantize the converted model for fast inference and to reduce model size.
--> 219         quant_model_paths = quantize(onnx_model_paths)
    220 
    221         # step 3. setup onnx runtime

/usr/local/lib/python3.8/dist-packages/fastT5/onnx_exporter.py in quantize(models_name_or_path)
    278         model_name = model.as_posix()
    279         output_model_name = f"{model_name[:-5]}-quantized.onnx"
--> 280         quantize_dynamic(
    281             model_input=model_name,
    282             model_output=output_model_name,

/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quantize.py in quantize_dynamic(model_input, model_output, op_types_to_quantize, per_channel, reduce_range, activation_type, weight_type, nodes_to_quantize, nodes_to_exclude, optimize_model, use_external_data_format, extra_options)
    306         op_types_to_quantize = list(IntegerOpsRegistry.keys())
    307 
--> 308     model = load_model(Path(model_input), optimize_model)
    309     quantizer = ONNXQuantizer(
    310         model,

/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quantize.py in load_model(model_path, optimize)
     51         return onnx_model.model
     52 
---> 53     return onnx.load(Path(model_path))
     54 
     55 

/usr/local/lib/python3.8/dist-packages/onnx/__init__.py in load_model(f, format, load_external_data)
    138         if model_filepath:
    139             base_dir = os.path.dirname(model_filepath)
--> 140             load_external_data_for_model(model, base_dir)
    141 
    142     return model

/usr/local/lib/python3.8/dist-packages/onnx/external_data_helper.py in load_external_data_for_model(model, base_dir)
     62     for tensor in _get_all_tensors(model):
     63         if uses_external_data(tensor):
---> 64             load_external_data_for_tensor(tensor, base_dir)
     65             # After loading raw_data from external_data, change the state of tensors
     66             tensor.data_location = TensorProto.DEFAULT

/usr/local/lib/python3.8/dist-packages/onnx/external_data_helper.py in load_external_data_for_tensor(tensor, base_dir)
     41     external_data_file_path = os.path.join(base_dir, file_location)
     42 
---> 43     with open(external_data_file_path, "rb") as data_file:
     44 
     45         if info.offset:

FileNotFoundError: [Errno 2] No such file or directory: '/content/encoder.embed_tokens.weight'

JCRPaquin commented 1 year ago

Should be fixed by https://github.com/onnx/onnx/pull/4907. External data wasn't being loaded due to an integration issue between onnx and onnxruntime.
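
For context on the failure mode: the traceback shows `onnx.load` looking for the external weight file `encoder.embed_tokens.weight` under `/content` rather than alongside the exported model on Drive. One way to probe this (a sketch, not a fix inside fastT5; the directory and file name below are placeholders) is to load the exported model yourself with the base directory made explicit, using the same `onnx` helpers that appear in the traceback:

```python
# Minimal sketch (placeholder paths): load an ONNX model whose tensors are
# stored as external-data files, resolving them against an explicit directory
# instead of relying on onnx.load's implicit base-dir inference.
import onnx
from onnx.external_data_helper import load_external_data_for_model

model_dir = "/content/drive/MyDrive/Colab Notebooks Work/onnx_models"  # placeholder
model_path = f"{model_dir}/mt5-xl-encoder.onnx"  # hypothetical file name

model = onnx.load(model_path, load_external_data=False)  # parse the graph only
load_external_data_for_model(model, base_dir=model_dir)  # pull in the weights
```

If the weights load cleanly this way, the problem is the base-directory resolution rather than a missing or corrupt export.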

anjaliai91 commented 1 year ago

I see that the changes have not been merged yet. Is there a specific branch that can be used to get past the above error?

JCRPaquin commented 1 year ago

@anjaliai91 I believe onnxruntime added some code on their side to work around this particular issue, so a newer version might resolve it for you. Try updating the onnxruntime version in `setup.py` to the latest release, then do a local install with `pip install -e .` per the install instructions in the README (see the sketch below).
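
Something like the following, assuming the standard from-source install flow; the onnxruntime version shown is a placeholder, not a tested pin:

```bash
# Sketch of the suggested workaround: install fastT5 from source with a
# newer onnxruntime pin.
git clone https://github.com/Ki6an/fastT5.git
cd fastT5
# Edit setup.py and bump the onnxruntime requirement to the latest release,
# e.g. "onnxruntime >= 1.14.0" (placeholder -- check PyPI for the current one).
pip install -e .
```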