I figured out the reason for the above. Besides the ONNX config being wrong, I was explicitly using the GPU, which led to the segmentation fault. The correct config is:
from collections import OrderedDict
from typing import Mapping

from transformers.onnx import OnnxConfig


class GPTJOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # axis 0 is the batch dimension, axis 1 the sequence length
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
            ]
        )

    @property
    def outputs(self) -> Mapping[str, Mapping[int, str]]:
        return OrderedDict(
            [
                ("last_hidden_state", {0: "batch", 1: "sequence"}),
            ]
        )
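For reference, a minimal sketch of how this config can be driven through the export helper on CPU, using the GPTJOnnxConfig class above (the opset value and the use of transformers.onnx.export are assumptions, not the exact script used here):

from pathlib import Path

from transformers import AutoModel, AutoTokenizer
from transformers.onnx import export

model_name = "NovelAI/genji-jp"  # the GPT-J checkpoint from this issue
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)  # kept on CPU

onnx_config = GPTJOnnxConfig(model.config)

# opset 13 is an assumption; anything >= 12 covers Einsum
export(tokenizer, model, onnx_config, 13, Path("outputs/gpt-j.onnx"))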
After this modification, and running the export on CPU, I get past the error, but many files are generated... How can I fix this?
root@8909446e3559:/mashim# ls outputs/
14164 14316 14443 14595 14722 14874 15001 transformer.h.12.ln_1.weight transformer.h.19.ln_1.bias transformer.h.24.mlp.fc_out.bias transformer.h.6.mlp.fc_in.bias
14165 14317 14444 14596 14723 14875 15002 transformer.h.12.mlp.fc_in.bias transformer.h.19.ln_1.weight transformer.h.25.ln_1.bias transformer.h.6.mlp.fc_out.bias
14166 14318 14445 14597 14724 14876 15003 transformer.h.12.mlp.fc_out.bias transformer.h.19.mlp.fc_in.bias transformer.h.25.ln_1.weight transformer.h.7.ln_1.bias
14192 14319 14471 14598 14750 14877 15029 transformer.h.13.ln_1.bias transformer.h.19.mlp.fc_out.bias transformer.h.25.mlp.fc_in.bias transformer.h.7.ln_1.weight
14193 14320 14472 14599 14751 14878 15030 transformer.h.13.ln_1.weight transformer.h.2.ln_1.bias transformer.h.25.mlp.fc_out.bias transformer.h.7.mlp.fc_in.bias
14194 14321 14473 14600 14752 14879 15031 transformer.h.13.mlp.fc_in.bias transformer.h.2.ln_1.weight transformer.h.26.ln_1.bias transformer.h.7.mlp.fc_out.bias
14195 14347 14474 14626 14753 14905 15032 transformer.h.13.mlp.fc_out.bias transformer.h.2.mlp.fc_in.bias transformer.h.26.ln_1.weight transformer.h.8.ln_1.bias
14196 14348 14475 14627 14754 14906 gpt-j.onnx transformer.h.14.ln_1.bias transformer.h.2.mlp.fc_out.bias transformer.h.26.mlp.fc_in.bias transformer.h.8.ln_1.weight
14197 14349 14476 14628 14755 14907 lm_head.bias transformer.h.14.ln_1.weight transformer.h.20.ln_1.bias transformer.h.26.mlp.fc_out.bias transformer.h.8.mlp.fc_in.bias
14223 14350 14502 14629 14781 14908 transformer.h.0.attn.bias transformer.h.14.mlp.fc_in.bias transformer.h.20.ln_1.weight transformer.h.27.ln_1.bias transformer.h.8.mlp.fc_out.bias
14224 14351 14503 14630 14782 14909 transformer.h.0.ln_1.bias transformer.h.14.mlp.fc_out.bias transformer.h.20.mlp.fc_in.bias transformer.h.27.ln_1.weight transformer.h.9.ln_1.bias
14225 14352 14504 14631 14783 14910 transformer.h.0.ln_1.weight transformer.h.15.ln_1.bias transformer.h.20.mlp.fc_out.bias transformer.h.27.mlp.fc_in.bias transformer.h.9.ln_1.weight
14226 14378 14505 14657 14784 14936 transformer.h.0.mlp.fc_in.bias transformer.h.15.ln_1.weight transformer.h.21.ln_1.bias transformer.h.27.mlp.fc_out.bias transformer.h.9.mlp.fc_in.bias
14227 14379 14506 14658 14785 14937 transformer.h.0.mlp.fc_out.bias transformer.h.15.mlp.fc_in.bias transformer.h.21.ln_1.weight transformer.h.3.ln_1.bias transformer.h.9.mlp.fc_out.bias
14228 14380 14507 14659 14786 14938 transformer.h.1.ln_1.bias transformer.h.15.mlp.fc_out.bias transformer.h.21.mlp.fc_in.bias transformer.h.3.ln_1.weight transformer.ln_f.bias
14254 14381 14533 14660 14812 14939 transformer.h.1.ln_1.weight transformer.h.16.ln_1.bias transformer.h.21.mlp.fc_out.bias transformer.h.3.mlp.fc_in.bias transformer.ln_f.weight
14255 14382 14534 14661 14813 14940 transformer.h.1.mlp.fc_in.bias transformer.h.16.ln_1.weight transformer.h.22.ln_1.bias transformer.h.3.mlp.fc_out.bias transformer.wte.weight
14256 14383 14535 14662 14814 14941 transformer.h.1.mlp.fc_out.bias transformer.h.16.mlp.fc_in.bias transformer.h.22.ln_1.weight transformer.h.4.ln_1.bias
14257 14409 14536 14688 14815 14967 transformer.h.10.ln_1.bias transformer.h.16.mlp.fc_out.bias transformer.h.22.mlp.fc_in.bias transformer.h.4.ln_1.weight
14258 14410 14537 14689 14816 14968 transformer.h.10.ln_1.weight transformer.h.17.ln_1.bias transformer.h.22.mlp.fc_out.bias transformer.h.4.mlp.fc_in.bias
14259 14411 14538 14690 14817 14969 transformer.h.10.mlp.fc_in.bias transformer.h.17.ln_1.weight transformer.h.23.ln_1.bias transformer.h.4.mlp.fc_out.bias
14285 14412 14564 14691 14843 14970 transformer.h.10.mlp.fc_out.bias transformer.h.17.mlp.fc_in.bias transformer.h.23.ln_1.weight transformer.h.5.ln_1.bias
14286 14413 14565 14692 14844 14971 transformer.h.11.ln_1.bias transformer.h.17.mlp.fc_out.bias transformer.h.23.mlp.fc_in.bias transformer.h.5.ln_1.weight
14287 14414 14566 14693 14845 14972 transformer.h.11.ln_1.weight transformer.h.18.ln_1.bias transformer.h.23.mlp.fc_out.bias transformer.h.5.mlp.fc_in.bias
14288 14440 14567 14719 14846 14998 transformer.h.11.mlp.fc_in.bias transformer.h.18.ln_1.weight transformer.h.24.ln_1.bias transformer.h.5.mlp.fc_out.bias
14289 14441 14568 14720 14847 14999 transformer.h.11.mlp.fc_out.bias transformer.h.18.mlp.fc_in.bias transformer.h.24.ln_1.weight transformer.h.6.ln_1.bias
14290 14442 14569 14721 14848 15000 transformer.h.12.ln_1.bias transformer.h.18.mlp.fc_out.bias transformer.h.24.mlp.fc_in.bias transformer.h.6.ln_1.weight
Thanks for raising this! cc @michaelbenayoun and @lewtun, who know this area well.
I was able to reproduce this behaviour but am not entirely sure (yet) what is causing the ONNX export to generate so many files (there should only be a single .onnx file in the output).
My current best guess is that it is something peculiar with the torch.onnx.export function that we call internally in transformers.onnx. One possibility is that the sheer size of the model is causing a problem with protocol buffers (until recently it was only possible to export models smaller than 2GB). Some more investigation is needed to figure this out, and I'll report back here when I have a better insight.
Incidentally, @shimoshida were you able to use your gpt-j.onnx model in ONNX Runtime? I'm curious whether the extra files are harmless or signal a deeper problem with the export.
@lewtun Thank you for your reply.
I also tried calling the torch.onnx.export function directly, but the result is the same as above. The script is here.
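In outline, the direct call looks like the following sketch (the dummy input, opset, and flags here are assumptions rather than the exact script):

import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("NovelAI/genji-jp")
tokenizer = AutoTokenizer.from_pretrained("NovelAI/genji-jp")
dummy = tokenizer("Hello world", return_tensors="pt")

torch.onnx.export(
    model,
    (dummy["input_ids"],),
    "outputs/gpt-j.onnx",
    input_names=["input_ids"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "last_hidden_state": {0: "batch", 1: "sequence"},
    },
    opset_version=13,
    # needed once the serialized weights exceed the 2GB protobuf limit
    use_external_data_format=True,
)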
I have also tested loading gpt-j.onnx with ONNX Runtime, and I get the following error:
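(For context, the test script boils down to creating an InferenceSession; the snippet below is a minimal sketch reconstructed from the traceback, not the full runtime_test.py.)

import onnxruntime as ort

# Creating the session is enough to trigger the load failure below.
ort_sess = ort.InferenceSession('outputs/gpt-j.onnx')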
Traceback (most recent call last):
File "runtime_test.py", line 5, in <module>
ort_sess = ort.InferenceSession('outputs/gpt-j.onnx')
File "/opt/conda/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/opt/conda/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 368, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from outputs/gpt-j.onnx failed:Type Error: Type parameter (T) of Optype (Einsum) bound to different types (tensor(int64) and tensor(float) in node (Einsum_110).
> One possibility is that the sheer size of the model is causing a problem with protocol buffers (until recently it was only possible to export models smaller than 2GB)

Oh, I didn't know about that limitation until now... If so, I should raise this issue in the torch repository.
Thanks for testing the model with ONNX Runtime @shimoshida!
> Oh, I didn't know about that limitation until now... If so, I should raise this issue in the torch repository.

I think the limitation is actually on the onnx side (which is used by torch.onnx). For example, here's an issue where someone tries to export a >2GB model.
I tracked down the onnx PR where support for large models was introduced, and one can see the potentially relevant comment:

> We need a method for optionally storing tensor data in separate files, which can be loaded on demand.
So my current understanding is that the multi-file export is expected for models like GPT-J, but that raises the question of how this data should be ingested in ONNX Runtime. I'll take another look at this and report back!
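As a side note on ingesting the extra files: onnx resolves them relative to the .onnx file, so loading from the export directory should pick them up. A minimal sketch (paths are assumptions):

import onnx
from onnx.external_data_helper import load_external_data_for_model

# Loading by path resolves the external tensor files automatically,
# as long as they sit next to the .onnx file.
model = onnx.load("outputs/gpt-j.onnx")

# Alternatively, load the graph first and pull in the weights explicitly
# from the directory that contains the external tensor files.
model = onnx.load("outputs/gpt-j.onnx", load_external_data=False)
load_external_data_for_model(model, "outputs/")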
Hi @shimoshida, here's a summary of what I think is going on:
If you look at the torch.onnx.export() function (docs), you can see there's a use_external_data_format argument. This argument is True for GPT-J when using the transformers.onnx package, as you can see here.
On the ONNX side, I'm able to load the model and also check that it was exported correctly via
import onnx
# Check we can load the model
onnx_model = onnx.load('model.onnx')
# Check the model
onnx.checker.check_model('model.onnx', full_check=True)
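For completeness, the size heuristic that flips this flag can also be queried directly; a sketch, assuming the OnnxConfig.use_external_data_format helper (it returns True when the serialized weights would exceed the 2GB protobuf limit):

from transformers import AutoModel
from transformers.onnx import OnnxConfig

model = AutoModel.from_pretrained("NovelAI/genji-jp")
# True for GPT-J, which is what triggers the multi-file export.
print(OnnxConfig.use_external_data_format(model.num_parameters()))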
The ONNX Runtime error doesn't look like an opset problem, since Einsum has been available since opset=12, so I suggest opening an issue on the ONNX Runtime repo to see whether they can provide some further advice.
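If it helps with the ONNX Runtime report, one way to locate the offending node is to walk the exported graph and print the Einsum inputs with their declared types (a sketch; value_info is only populated for values the exporter annotated):

import onnx

model = onnx.load("outputs/gpt-j.onnx")

# Map value names to element types so we can see which Einsum inputs
# disagree (e.g. tensor(int64) vs tensor(float)).
elem_types = {
    vi.name: vi.type.tensor_type.elem_type
    for vi in list(model.graph.value_info) + list(model.graph.input)
}

for node in model.graph.node:
    if node.op_type == "Einsum":
        print(node.name, [(i, elem_types.get(i, "?")) for i in node.input])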
@lewtun Thank you for sharing the information!
> I suggest opening an issue on the ONNX Runtime repo to see whether they can provide some further advice.
Sure. I've asked a question and will wait for an answer. https://github.com/microsoft/onnxruntime/discussions/10121
Hi @shimoshida, it seems that the root cause of the problem was a mismatch in the einsum types: https://github.com/microsoft/onnxruntime/discussions/10121#discussioncomment-1948951
Does that proposal solve the issue for you?
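(For anyone reading along: the generic fix for this class of error is to promote both einsum operands to a common dtype before the op, so the traced Einsum node is bound to a single type. A sketch, not the exact patch proposed in the linked discussion:)

import torch

def einsum_same_dtype(equation, a, b):
    # Promote e.g. int64 position ids and float weights to one dtype so the
    # exported Einsum node sees a single type parameter T.
    common = torch.promote_types(a.dtype, b.dtype)
    return torch.einsum(equation, a.to(common), b.to(common))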
@lewtun I'm sorry for the late reply. I tested the proposal, but encountered the following problem: https://github.com/microsoft/onnxruntime/discussions/10121#discussioncomment-1987845
However, since the problem does not seem related to transformers, I have closed this issue. Thank you for your help!
Thank you for the reply @shimoshida! It looks like a mismatch between the ops in the original and traced models at runtime, but you're right that the ONNX export itself seems to be OK.
Environment info
transformers version: 4.14.1
PyTorch version:
Information
I want to convert the GPT-J model (https://huggingface.co/NovelAI/genji-jp) to an ONNX file, but I have trouble with the conversion using the following scripts.
To reproduce
Steps to reproduce the behavior:
Use nvcr.io/nvidia/pytorch:21.11-py3 as the Docker image.
The log is as follows:
Expected behavior
The ONNX file is generated successfully.