swoook / KoBART

Korean BART

Request a feature to export KoBART for sequence classification to ONNX Runtime (ORT) #1

Closed swoook closed 2 years ago

swoook commented 2 years ago

🚀 Feature request

I'd like to export KoBART for sequence classification to ONNX, but there is a dependency conflict:

  1. SKT-AI/KoBART requires transformers==4.3.3
  2. transformers>=4.9.0 supports exporting BART to ONNX

Motivation

I considered the inference engines below:

  1. TensorRT (TRT)
  2. ONNX Runtime (ORT)
  3. OpenVINO

Comparing TRT and ORT:

| Highlights | Which one is better or worse? | Notes |
| --- | --- | --- |
| performance | TRT ≥ ORT | ORT is sometimes on par with TRT |
| hardware | TRT ≤ ORT | TRT only supports NVIDIA GPUs; ORT supports NVIDIA GPUs and Intel CPUs; I cannot find any document about AMD for ORT 🙈 |
| compatibility | TRT << ORT | TRT performs device-specific optimizations [1, 2]; for example, an execution engine built for an NVIDIA A100 GPU will not work on an NVIDIA T4 GPU 🙃 |
| difficulty | TRT ≥ ORT | |
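As a rough illustration of the hardware row above, ORT selects execution providers per session; a minimal sketch (the model path is a placeholder):

import onnxruntime as ort

# Prefer the NVIDIA GPU provider and fall back to CPU when CUDA is unavailable.
session = ort.InferenceSession(
    'model.onnx',  # placeholder path
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)
print(session.get_providers())  # the providers actually registered for this session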

Your contribution

swoook commented 2 years ago
$ python -m transformers.convert_graph_to_onnx --help
usage: ONNX Converter [-h]
                      [--pipeline {feature-extraction,ner,sentiment-analysis,fill-mask,question-answering,text-generation,translation_en_to_fr,translation_en_to_de,translation_en_to_ro}]
                      --model MODEL [--tokenizer TOKENIZER] [--framework {pt,tf}] [--opset OPSET] [--check-loading] [--use-external-format]
                      [--quantize]
                      output

positional arguments:
  output

optional arguments:
  -h, --help            show this help message and exit
  --pipeline {feature-extraction,ner,sentiment-analysis,fill-mask,question-answering,text-generation,translation_en_to_fr,translation_en_to_de,translation_en_to_ro}
  --model MODEL         Model's id or path (ex: bert-base-cased)
  --tokenizer TOKENIZER
                        Tokenizer's id or path (ex: bert-base-cased)
  --framework {pt,tf}   Framework for loading the model
  --opset OPSET         ONNX opset to use
  --check-loading       Check ONNX is able to load the model
  --use-external-format
                        Allow exporting model >= than 2Gb
  --quantize            Quantize the neural network to be run with int8
  1. --model:

    • Hugging Face saves a model into two files:
    1. config.json, which stores the configuration of the model
    2. pytorch_model.bin, which is the PyTorch checkpoint
    • We can pass the directory in which they exist
    • It also accepts a model's id
    • For example, skt/kobert-base-v1 is a model's id
  2. --framework (see the example below)

    • --framework pt for PyTorch
    • --framework tf for TensorFlow
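For instance, a minimal invocation might look like this (a sketch; bert-base-cased is the id from the help text above, and the output path is a placeholder):

$ python -m transformers.convert_graph_to_onnx --framework pt \
--model bert-base-cased \
onnx/bert-base-cased.onnx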
The snippet below loads the fine-tuned PyTorch Lightning checkpoint and saves the wrapped Hugging Face model so the directory can be passed via --model:

from dmp_kobart import KoBARTClassification

# Placeholder paths: the Lightning checkpoint, its hparams YAML,
# and the output directory for the Hugging Face-format model.
paths = dict()
paths['ckpt'] = $CKPT_PATH
paths['yaml'] = $YAML_PATH
paths['huggingface'] = $OUTPUT_DIR

pytorch_lightning_model_wrapper = KoBARTClassification.load_from_checkpoint(
    checkpoint_path=paths['ckpt'],
    hparams_file=paths['yaml'],
    map_location=None,
)

# .model is the underlying Hugging Face model inside the Lightning module.
pytorch_lightning_model_wrapper.model.save_pretrained(paths['huggingface'])
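To sanity-check the directory written above, we can reload it (a minimal sketch; BartForSequenceClassification is my assumption for the class wrapped by KoBARTClassification):

from transformers import BartForSequenceClassification

# paths['huggingface'] is the directory written by save_pretrained() above.
model = BartForSequenceClassification.from_pretrained(paths['huggingface'])
model.eval()
print(model.config.num_labels)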
swoook commented 2 years ago
$ python -m transformers.convert_graph_to_onnx --framework pt \
--model $MODEL_DIR \
$ONNX_PATH
====== Converting model to ONNX ======
ONNX opset version set to: 11
Loading pipeline (model: $MODEL_DIR, tokenizer: $MODEL_DIR)
Error while converting the model: Can't load tokenizer for '$MODEL_DIR'. Make sure that:

- '$MODEL_DIR' is a correct model identifier listed on 'https://huggingface.co/models'

- or '$MODEL_DIR' is the correct path to a directory containing relevant tokenizer files
...
tokenizer = {
    'url':
    'https://kobert.blob.core.windows.net/models/kobart/kobart_base_tokenizer_cased_cf74400bce.zip',
    'fname': 'kobart_base_tokenizer_cased_cf74400bce.zip',
    'chksum': 'cf74400bce'
}
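This dict is an excerpt from the kobart package, which downloads and caches that archive; per the SKT-AI/KoBART README, the tokenizer is exposed roughly like this (a sketch, not verified here against transformers==4.3.3):

from kobart import get_kobart_tokenizer

# Downloads kobart_base_tokenizer_cased_cf74400bce.zip on first use and caches it.
kobart_tokenizer = get_kobart_tokenizer()
print(kobart_tokenizer.tokenize('안녕하세요.'))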
...
$ python -m transformers.convert_graph_to_onnx --framework pt \
--model $MODEL_DIR \
--tokenizer $TOKENIZER_DIR \
$ONNX_PATH

====== Converting model to ONNX ======
ONNX opset version set to: 11
Loading pipeline (model: $MODEL_DIR, tokenizer: $TOKENIZER_DIR)
Error while converting the model: Can't load tokenizer for '$TOKENIZER_DIR'. Make sure that:

- '$TOKENIZER_DIR' is a correct model identifier listed on 'https://huggingface.co/models'

- or '$TOKENIZER_DIR' is the correct path to a directory containing relevant tokenizer files
  1. added_tokens.json [example]
  2. special_tokens_map.json [example]
  3. tokenizer_config.json [example]
  4. tokenizer.json [example]
...
additional_files_names = {
                    "added_tokens_file": ADDED_TOKENS_FILE,
                    "special_tokens_map_file": SPECIAL_TOKENS_MAP_FILE,
                    "tokenizer_config_file": TOKENIZER_CONFIG_FILE,
                    "tokenizer_file": FULL_TOKENIZER_FILE,
                }
...
  1. model.json

    • It has the keys which also exist in the example of tokenizer.json
    • I.e., it seems model.json is the tokenizer.json
  2. emji_tokenizer-vocab.json

    • It looks like a vocab.json
    • However, model.json also includes the vocab
  The files below are still missing (see the sketch after this list):
  1. added_tokens.json [example]
  2. tokenizer_config.json [example]
  3. special_tokens_map.json [example]
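Since model.json looks like a tokenizer.json, one plausible workaround (a sketch, untested; the input path is assumed from the archive contents) is to load it with PreTrainedTokenizerFast and let save_pretrained() generate the missing files:

from transformers import PreTrainedTokenizerFast

# model.json appears to play the role of tokenizer.json (see above).
tokenizer = PreTrainedTokenizerFast(tokenizer_file='emji_tokenizer/model.json')
# save_pretrained() writes tokenizer.json, tokenizer_config.json,
# special_tokens_map.json, etc. into the target directory.
tokenizer.save_pretrained('TOKENIZER_DIR')  # placeholder output directory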
swoook commented 2 years ago
====== Converting model to ONNX ======
ONNX opset version set to: 11
Loading pipeline (model: $MODEL_DIR, tokenizer: $TOKENIZER_DIR)
Using framework PyTorch: 1.7.1
Found input input_ids with shape: {0: 'batch', 1: 'sequence'}
Found input attention_mask with shape: {0: 'batch', 1: 'sequence'}
Found output output_0 with shape: {0: 'batch', 1: 'sequence'}
Found output output_1 with shape: {0: 'batch', 2: 'sequence'}
Found output output_1 with shape: {0: 'batch', 2: 'sequence'}
Found output output_1 with shape: {0: 'batch', 2: 'sequence'}
Found output output_1 with shape: {0: 'batch', 2: 'sequence'}
Found output output_2 with shape: {0: 'batch', 2: 'sequence'}
Found output output_2 with shape: {0: 'batch', 2: 'sequence'}
Found output output_2 with shape: {0: 'batch', 2: 'sequence'}
Found output output_2 with shape: {0: 'batch', 2: 'sequence'}
Found output output_3 with shape: {0: 'batch', 2: 'sequence'}
Found output output_3 with shape: {0: 'batch', 2: 'sequence'}
Found output output_3 with shape: {0: 'batch', 2: 'sequence'}
Found output output_3 with shape: {0: 'batch', 2: 'sequence'}
Found output output_4 with shape: {0: 'batch', 2: 'sequence'}
Found output output_4 with shape: {0: 'batch', 2: 'sequence'}
Found output output_4 with shape: {0: 'batch', 2: 'sequence'}
Found output output_4 with shape: {0: 'batch', 2: 'sequence'}
Found output output_5 with shape: {0: 'batch', 2: 'sequence'}
Found output output_5 with shape: {0: 'batch', 2: 'sequence'}
Found output output_5 with shape: {0: 'batch', 2: 'sequence'}
Found output output_5 with shape: {0: 'batch', 2: 'sequence'}
Found output output_6 with shape: {0: 'batch', 2: 'sequence'}
Found output output_6 with shape: {0: 'batch', 2: 'sequence'}
Found output output_6 with shape: {0: 'batch', 2: 'sequence'}
Found output output_6 with shape: {0: 'batch', 2: 'sequence'}
Found output output_7 with shape: {0: 'batch', 1: 'sequence'}
Ensuring inputs are in correct order
decoder_input_ids is not present in the generated input list.
Generated inputs order: ['input_ids', 'attention_mask']
/data/swook/miniconda3/envs/transformers/lib/python3.8/site-packages/torch/onnx/utils.py:1111: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input output_1
  warnings.warn('No names were found for specified dynamic axes of provided input.'
Error while converting the model: The type of axis index is expected to be an integer
  Related issues:
  1. #9803 in huggingface/transformers
  2. #11786 in huggingface/transformers

  Recall the conflict:
  1. SKT-AI/KoBART requires transformers==4.3.3
  2. transformers should be >=4.9.0 to export BART to ONNX
swoook commented 2 years ago
  First, I tried exporting these BART models:
  1. facebook/bart-base
  2. ynie/bart-large-snli_mnli_fever_anli_R1_R2_R3-nli
swoook commented 2 years ago
$ python -m transformers.onnx \
--model $MODEL_DIR \
$ONNX_PATH
Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1359, in from_pretrain
ed
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/serialization.py", line 764, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 71, in <module>
    main()
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 51, in main
    model = FeaturesManager.get_model_from_feature(args.feature, args.model)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/features.py", line 125, in get_model_from_
feature
    return FeaturesManager._TASKS_TO_AUTOMODELS[task].from_pretrained(model)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 419, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1364, in from_pretrained
    raise OSError(
OSError: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run `git lfs install` followed by `git lfs pull` in the folder you cloned.
Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 71, in <module>
    main()
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 62, in main
    onnx_inputs, onnx_outputs = export(tokenizer, model, onnx_config, args.opset, args.output)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/convert.py", line 90, in export
    raise AssertionError(f"Unsupported PyTorch version, minimum required is 1.8.0, got: {torch_version}")
AssertionError: Unsupported PyTorch version, minimum required is 1.8.0, got: 1.7.1
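The first traceback means the model directory was cloned without git-lfs (the error itself suggests git lfs install and git lfs pull in the cloned folder); the second only needs a newer PyTorch in this environment:

$ pip install --upgrade "torch>=1.8.0"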
Using framework PyTorch: 1.10.0
Overriding 1 configuration item(s)
        - use_cache -> False
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/onnx/utils.py:90: UserWarning: 'enable_onnx_checker' is deprecated and ignored. It will be removed in the next PyTorch release. To proceed despite ONNX checker failures, catch torch.onnx.ONNXCheckerError.
  warnings.warn("'enable_onnx_checker' is deprecated and ignored. It will be removed in "
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/onnx/utils.py:103: UserWarning: `use_external_data_format' is deprecated and ignored. Will be removed in next PyTorch release. The code will work as it is False if models are not larger than 2GB, Otherwise set to False because of size limits imposed by Protocol Buffers.
  warnings.warn("`use_external_data_format' is deprecated and ignored. Will be removed in next "
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:215: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:221: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:252: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:879: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1:
Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 71, in <module>
    main()
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 64, in main
    validate_model_outputs(onnx_config, tokenizer, model, args.output, onnx_outputs, args.atol)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/convert.py", line 142, in validate_model_outputs
    from onnxruntime import InferenceSession, SessionOptions
ModuleNotFoundError: No module named 'onnxruntime'
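Validation imports onnxruntime, so installing it resolves this; the later logs show CUDAExecutionProvider, which comes with the GPU build:

$ pip install onnxruntime-gpu  # or plain onnxruntime for the CPU-only build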
Using framework PyTorch: 1.10.0
Overriding 1 configuration item(s)
        - use_cache -> False
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/onnx/utils.py:90: UserWarning: 'enable_onnx_checker' is deprecated and ignored. It will be removed in the next PyTorch release. To proceed despite ONNX checker failures, catch torch.onnx.ONNXCheckerError.
  warnings.warn("'enable_onnx_checker' is deprecated and ignored. It will be removed in "
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/onnx/utils.py:103: UserWarning: `use_external_data_format' is deprecated and ignored. Will be removed in next PyTorch release. The code will work as it is False if models are not larger than 2GB, Otherwise set to False because of size limits imposed by Protocol Buffers.
  warnings.warn("`use_external_data_format' is deprecated and ignored. Will be removed in next "
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:215: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:221: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:252: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:879: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1:
Validating ONNX model...
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:350: UserWarning: Deprecation warning. This ORT build has ['CUDAExecutionProvider', 'CPUExecutionProvider'] enabled. The next release (ORT 1.10) will require explicitly setting the providers parameter (as opposed to the current behavior of providers getting set/registered by default based on the build flags) when instantiating InferenceSession.For example, onnxruntime.InferenceSession(..., providers=["CUDAExecutionProvider"], ...)
  warnings.warn("Deprecation warning. This ORT build has {} enabled. ".format(available_providers) +
        -[✓] ONNX model outputs' name match reference model ({'last_hidden_state', 'encoder_last_hidden_state'}
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 8, 768) matches (2, 8, 768)
                -[✓] all values close (atol: 0.0001)
        - Validating ONNX Model output "encoder_last_hidden_state":
                -[✓] (2, 8, 768) matches (2, 8, 768)
                -[✓] all values close (atol: 0.0001)
All good, model saved at: /data/swook/models/huggingface/facebook/bart-base/onnx/model.onnx
  1. 1st warning

    UserWarning: 'enable_onnx_checker' is deprecated and ignored. It will be removed in the next PyTorch release. To proceed despite ONNX checker failures, catch torch.onnx.ONNXCheckerError.
    • It seems we can safely disregard this warning
  2. 2nd warning

    UserWarning: `use_external_data_format' is deprecated and ignored. Will be removed in next PyTorch release. The code will work as it is False if models are not larger than 2GB, Otherwise set to False because of size limits imposed by Protocol Buffers.
    • The flag only matters for models larger than 2GB
    • BART is smaller than 2GB
    • It seems we can safely disregard this warning
  3. 3rd warning

    /data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:215: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
    • A maintainer of huggingface/transformers says we can disregard these warnings [details]
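Since the export itself succeeded, a quick smoke test of the saved file might look like this (a sketch; the model path matches the log above, and output order is taken from session metadata rather than assumed):

from onnxruntime import InferenceSession
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('facebook/bart-base')
session = InferenceSession(
    '/data/swook/models/huggingface/facebook/bart-base/onnx/model.onnx',
    providers=['CPUExecutionProvider'],
)
# The exported graph takes input_ids and attention_mask (see the conversion log).
encodings = tokenizer('Hello, world!', return_tensors='np')
outputs = session.run(None, dict(encodings))
for meta, value in zip(session.get_outputs(), outputs):
    print(meta.name, value.shape)  # e.g. last_hidden_state (1, 6, 768)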
swoook commented 2 years ago
  • However, I found the dependency conflict below:
  1. SKT-AI/KoBART requires transformers==4.3.3
  2. transformers>=4.9.0 supports exporting BART to ONNX

  So I use two separate environments, one per step:
  1. transformers==4.3.3
  2. transformers>=4.12.5
swoook commented 2 years ago
$ python -m transformers.onnx \
> --model=$PYTORCH_MODEL_DIR \
> --feature sequence-classification \
> $ONNX_MODEL_DIR
Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 71, in <module>
    main()
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/__main__.py", line 52, in main
    model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=args.feature)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/features.py", line 153, in check_supported_model_or_raise
    raise ValueError(
ValueError: bart doesn't support feature sequence-classification. Supported values are: ['default']
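Since this transformers version only supports the default feature for bart, one workaround is exporting BartForSequenceClassification with torch.onnx.export directly. A minimal sketch ($PYTORCH_MODEL_DIR is the placeholder from the command above, and the axis names mirror the earlier logs):

import torch
from transformers import BartForSequenceClassification, PreTrainedTokenizerFast

model = BartForSequenceClassification.from_pretrained('$PYTORCH_MODEL_DIR')
tokenizer = PreTrainedTokenizerFast.from_pretrained('$PYTORCH_MODEL_DIR')
model.config.use_cache = False  # mirrors "Overriding 1 configuration item(s)" above
model.eval()

sample = tokenizer('배송이 정말 빨라요.', return_tensors='pt')
torch.onnx.export(
    model,
    (sample['input_ids'], sample['attention_mask']),
    'kobart_sequence_classification.onnx',
    input_names=['input_ids', 'attention_mask'],
    output_names=['logits'],
    dynamic_axes={
        'input_ids': {0: 'batch', 1: 'sequence'},
        'attention_mask': {0: 'batch', 1: 'sequence'},
        'logits': {0: 'batch'},
    },
    opset_version=11,
)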
swoook commented 2 years ago
Using framework PyTorch: 1.10.0
Overriding 1 configuration item(s)
        - use_cache -> False
Traceback (most recent call last):
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/swook/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/swook/draft/kobart/export2onnx.py", line 74, in <module>
    main()
  File "/data/swook/draft/kobart/export2onnx.py", line 65, in main
    onnx_inputs, onnx_outputs = export(tokenizer, model, onnx_config, args.opset, args.output)
  File "/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/onnx/convert.py", line 111, in export
    raise ValueError("Model and config inputs doesn't match")
ValueError: Model and config inputs doesn't match
  The model's forward() expects:
  1. input_ids
  2. attention_mask

  But the tokenizer's encodings contain (see the workaround sketched below):
  1. input_ids
  2. attention_mask
  3. token_type_ids
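A simple workaround (a sketch; model and tokenizer are the objects loaded in the export script) is to drop the extra key before feeding the encodings to the model:

encodings = tokenizer('배송이 정말 빨라요.', return_tensors='pt')
# BART's forward() has no token_type_ids parameter; remove it so the
# model inputs and the ONNX config inputs match.
encodings.pop('token_type_ids', None)
outputs = model(**encodings)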
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'PreTrainedTokenizerFast'. 
The class this function is called from is 'BartTokenizer'.
{...
"tokenizer_class": "PreTrainedTokenizerFast"
...}
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'PreTrainedTokenizerFast'. 
The class this function is called from is 'BartTokenizer'.
Using framework PyTorch: 1.10.0
Overriding 1 configuration item(s)
        - use_cache -> False
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/onnx/utils.py:90: UserWarning: 'enable_onnx_checker' is deprecated and ignored. It will be removed in the next PyTorch release. To proceed despite ONNX checker failures, catch torch.onnx.ONNXCheckerError.
  warnings.warn("'enable_onnx_checker' is deprecated and ignored. It will be removed in "
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/torch/onnx/utils.py:103: UserWarning: `use_external_data_format' is deprecated and ignored. Will be removed in next PyTorch release. The code will work as it is False if models are not larger than 2GB, Otherwise set to False because of size limits imposed by Protocol Buffers.
  warnings.warn("`use_external_data_format' is deprecated and ignored. Will be removed in next "
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:215: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:221: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:252: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py:879: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1:
Validating ONNX model...
/data/swook/miniconda3/envs/transformers-latest/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:350: UserWarning: Deprecation warning. This ORT build has ['CUDAExecutionProvider', 'CPUExecutionProvider'] enabled. The next release (ORT 1.10) will require explicitly setting the providers parameter (as opposed to the current behavior of providers getting set/registered by default based on the build flags) when instantiating InferenceSession.For example, onnxruntime.InferenceSession(..., providers=["CUDAExecutionProvider"], ...)
  warnings.warn("Deprecation warning. This ORT build has {} enabled. ".format(available_providers) +
        -[✓] ONNX model outputs' name match reference model ({'encoder_last_hidden_state', 'last_hidden_state'}
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 8, 768) matches (2, 8, 768)
                -[x] values not close enough (atol: 0.0001)
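If the mismatch is small, validation can be retried with a looser tolerance: the script forwards an atol argument to validate_model_outputs (see the earlier traceback). A hedged sketch, where the variable names are assumed from that script and whether a looser tolerance is acceptable depends on the task:

from transformers.onnx import validate_model_outputs

# onnx_config, tokenizer, model, onnx_path, and onnx_outputs are assumed to be
# the objects from the export script; 1e-3 is a hypothetical tolerance, looser
# than the default 1e-4 shown in the log.
validate_model_outputs(onnx_config, tokenizer, model, onnx_path, onnx_outputs, atol=1e-3)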