huggingface / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
https://huggingface.co/docs/transformers.js
Apache License 2.0

ONNX conversion of gpt2 does not return same directory structure as the official one #418

Closed uahmad235 closed 1 year ago

uahmad235 commented 1 year ago

Description of bug

I am trying to run the gpt2 model after converting it via the command:

python -m scripts.convert --task text-generation-with-past  --model_id gpt2 

Here are the files I get after a successful conversion:

config.json             merges.txt  special_tokens_map.json  tokenizer_config.json
generation_config.json  onnx/model.onnx        tokenizer.json           vocab.json

Please note that I do not get multiple files in the onnx directory, as in the Xenova/gpt2 model.

I tried with different models such as mt5, and they work fine, returning all the desired files in the onnx/ directory. I need to convert the gpt2 model myself because I want to use a different version of gpt2 from the Hugging Face Hub for my particular use case, so I cannot use the one provided here.

Maybe I am missing something here. Any help is appreciated.

Steps to reproduce

Run the conversion command for the gpt2 model:

python -m scripts.convert --task text-generation-with-past  --model_id gpt2 

and check the contents of the onnx/ directory.

Expected behavior

I am expecting multiple files in the onnx/ directory.

Logs/screenshots

Conversion logs:

Framework not specified. Using pt to export to ONNX.
Using the export variant default. Available variants are:
    - default: The default ONNX variant.
Using framework PyTorch: 2.0.1+cu117
Overriding 2 configuration item(s)
    - use_cache -> True
    - pad_token_id -> 0
/home/ubuntu/venv_stt/lib/python3.10/site-packages/transformers/models/gpt2/modeling_gpt2.py:801: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if batch_size <= 0:
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Post-processing the exported models...
Deduplicating shared (tied) weights...
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
    lm_head.weight: {'onnx::MatMul_3751'}
    transformer.wte.weight: {'transformer.wte.weight'}
Removing duplicate initializer onnx::MatMul_3751...
Validating ONNX model models/gpt2/model.onnx...
    -[✓] ONNX model output names match reference model (present.3.value, present.0.value, present.4.value, present.9.key, present.1.value, present.7.key, present.7.value, present.3.key, present.9.value, present.8.key, present.11.key, present.5.key, present.4.key, present.11.value, present.6.key, present.2.key, present.1.key, present.5.value, present.8.value, present.0.key, present.2.value, present.6.value, present.10.key, logits, present.10.value)
    - Validating ONNX Model output "logits":
        -[✓] (2, 16, 50257) matches (2, 16, 50257)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.0.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.0.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.1.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.1.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.2.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.2.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.3.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.3.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.4.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 4.291534423828125e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.4.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.5.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 2.47955322265625e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.5.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.6.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 3.719329833984375e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.6.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.7.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 2.47955322265625e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.7.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.8.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 1.4573335647583008e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.8.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 1.621246337890625e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.9.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 1.3589859008789062e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.9.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.10.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.10.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 1.2874603271484375e-05 (atol: 1e-05)
    - Validating ONNX Model output "present.11.key":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[✓] all values close (atol: 1e-05)
    - Validating ONNX Model output "present.11.value":
        -[✓] (2, 12, 32, 64) matches (2, 12, 32, 64)
        -[x] values not close enough, max diff: 1.6242265701293945e-05 (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
- present.4.key: max diff = 4.291534423828125e-05
- present.5.key: max diff = 2.47955322265625e-05
- present.6.key: max diff = 3.719329833984375e-05
- present.7.key: max diff = 2.47955322265625e-05
- present.8.key: max diff = 1.4573335647583008e-05
- present.8.value: max diff = 1.621246337890625e-05
- present.9.key: max diff = 1.3589859008789062e-05
- present.10.value: max diff = 1.2874603271484375e-05
- present.11.value: max diff = 1.6242265701293945e-05.
 The exported model was saved at: models/gpt2

Environment

xenova commented 1 year ago

Hi there 👋 Yes, this is something we're aware of; it is due to a recent improvement to Optimum, where we no longer need to export multiple decoders. While we get this sorted, you can just rename model.onnx to decoder_model_merged.onnx, and model_quantized.onnx to decoder_model_merged_quantized.onnx.
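The rename step can be sketched as follows. This is a minimal sketch: empty placeholder files stand in for the real exports, the models/gpt2/onnx layout is taken from the conversion logs above, and model_quantized.onnx is assumed to exist only if a quantized export was also produced.

```shell
# Recreate the output layout from the conversion logs with placeholder files
# (in practice these already exist after running scripts.convert).
mkdir -p models/gpt2/onnx
touch models/gpt2/onnx/model.onnx models/gpt2/onnx/model_quantized.onnx

# Rename to the file names transformers.js expects for a merged decoder.
mv models/gpt2/onnx/model.onnx models/gpt2/onnx/decoder_model_merged.onnx
mv models/gpt2/onnx/model_quantized.onnx models/gpt2/onnx/decoder_model_merged_quantized.onnx

# Verify the final contents of the onnx/ directory.
ls models/gpt2/onnx
```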

uahmad235 commented 1 year ago

Thank you for the clarification @xenova. I actually suspected that already and renamed the file, but here's the error I got after that:

Uncaught (in promise) Error: An error occurred during model execution: "Missing the following inputs: position_ids.

I thought it was caused by a conversion issue.

Here's my code for loading the model:

import { pipeline } from '@xenova/transformers';

const textgen = await pipeline(
    'text-generation',
    'gpt2',
    {
        quantized: true,
    },
);
xenova commented 1 year ago

That should have been fixed by a recent update. Are you sure you are using the latest version (2.9.0)?
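One way to make sure a project pulls in the fixed release is to require at least 2.9.0 in package.json. A minimal sketch, assuming the package is installed under the `@xenova/transformers` name that transformers.js v2 was published with:

```json
{
  "dependencies": {
    "@xenova/transformers": "^2.9.0"
  }
}
```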

uahmad235 commented 1 year ago

Oh sorry, I was using an older version (2.7.0). Thank you for helping with this. Closing now.