daquexian / onnx-simplifier

Simplify your onnx model
Apache License 2.0

[BUG] Segmentation Fault #187

Open EthanZhangYi opened 2 years ago

EthanZhangYi commented 2 years ago

Describe the bug: A Segmentation fault (core dumped) error is raised when using onnxsim.

Environment

Model: A test script is provided.

import onnx
import torch
from onnxsim import simplify
from torch.nn import Transformer

def main():
    # ----------- Model -----------
    model = Transformer(
        d_model=256,
        nhead=8,
        num_encoder_layers=3,
        num_decoder_layers=3,
        dim_feedforward=1024,
        dropout=0.1)
    model.to('cpu').to(torch.float32)
    model.eval()

    # ----------- In/Out -----------
    input_names = ['src', 'tgt']
    input_data = [
        torch.rand((100, 2, 256), dtype=torch.float32),
        torch.rand((15, 2, 256), dtype=torch.float32),
    ]
    output_names = ['out']
    out = 'transformer.onnx'
    out_sim = 'transformer.sim.onnx'

    # ----------- Export -----------
    with torch.no_grad():
        torch.onnx.export(
            model,
            tuple(input_data),
            out,
            input_names=input_names,
            output_names=output_names,
            opset_version=11,
            do_constant_folding=False)

    # ----------- Simplify -----------
    onnx_model = onnx.load(out)
    onnx_model_sim, check = simplify(
        onnx_model,
        skipped_optimizers=['extract_constant_to_initializer'],
    )
    assert check, f"Simplified {out} failed."
    onnx.save(onnx_model_sim, out_sim)

if __name__ == '__main__':
    main()

If we pass skipped_optimizers=['extract_constant_to_initializer'], a segmentation fault is raised. If we comment out this line, everything is OK. However, in my project, this line is needed.

EthanZhangYi commented 2 years ago

@daquexian Could you please give some help?

EthanZhangYi commented 2 years ago

When I downgrade the pkg version to

No error is raised.

SolomidHero commented 2 years ago

same problem

onnx==1.12.0
onnx-simplifier==0.4.0
onnxconverter-common @ git+https://github.com/microsoft/onnxconverter-common@0a401de9ee410bf3f65fb3dd3d13d4eab7e91a10
onnxmltools==1.11.1
onnxruntime==1.11.1
onnxsim-no-ort==0.4.0
skl2onnx==1.11.2
tf2onnx==1.11.1

ahirner commented 2 years ago

In my case, MemoryError: std::bad_alloc and segmentation faults depend on the opset: version 11 faults while version 10 does not. Opset version 11 also introduced dynamic shape capabilities.

onnx==1.11.0
onnx-simplifier==0.3.10
onnxoptimizer==0.2.7
onnxruntime==1.11.1
onnxsim==0.4.7
onnxsim-no-ort==0.4.0

travisjayday commented 2 years ago

Thanks @ahirner, I'm having the same issue. Setting opset=10 does not crash, but opset=11 crashes with bad_alloc.

bas-aarts commented 1 year ago

Same observation here: bad_alloc with opset >= 11 when using the Python API (onnxsim.simplify). The strange thing is that when I use the command line tool (onnxsim), the model optimizes just fine.
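
Since the command line tool succeeds where the Python API crashes, one possible workaround is to call the tool from Python in a child process. This is only a minimal sketch: it assumes the onnxsim executable is on PATH and takes the input and output model paths as its two positional arguments, and the helper names build_onnxsim_command and simplify_via_cli are hypothetical.

```python
import subprocess
from typing import List


def build_onnxsim_command(src: str, dst: str) -> List[str]:
    """Build the argument list for the CLI (assumed form: onnxsim <input> <output>)."""
    return ["onnxsim", src, dst]


def simplify_via_cli(src: str, dst: str) -> bool:
    """Run the onnxsim CLI in a child process; return True if it exits with code 0."""
    try:
        # A crash in the child (for example a segfault) surfaces as a nonzero
        # return code instead of taking down the calling interpreter.
        result = subprocess.run(build_onnxsim_command(src, dst))
    except FileNotFoundError:
        # The onnxsim executable is not installed or not on PATH.
        return False
    return result.returncode == 0
```

With the file names from the test script above, this would be called as simplify_via_cli('transformer.onnx', 'transformer.sim.onnx'). Whether the CLI honours the same skipped optimizers as the Python call is not something this sketch addresses.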

bas-aarts commented 1 year ago

I also observed that the error is nondeterministic. Looking at onnx2tf, it seems like they came to the same conclusion: https://github.com/PINTO0309/onnx2tf/blob/794bd8699b55ea74c09411fd1d077448e02954aa/onnx2tf/onnx2tf.py#L505-L536
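
Because the failure is nondeterministic and a segfault cannot be caught with try/except inside the same process, one way to cope is to retry the work in a fresh child process. The sketch below is an illustration under that assumption, not the onnx2tf code; run_with_retries is a hypothetical helper and the default of three attempts is arbitrary.

```python
import subprocess
import sys
from typing import Sequence


def run_with_retries(cmd: Sequence[str], attempts: int = 3) -> bool:
    """Run cmd in a child process up to `attempts` times; True on first success."""
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd)
        except OSError:
            # The executable could not be started at all; retrying will not help.
            return False
        if result.returncode == 0:
            return True
        # On POSIX, a negative return code means the child was killed by a
        # signal (for example -11 for SIGSEGV on Linux).
        print(f"attempt {attempt} failed with return code {result.returncode}",
              file=sys.stderr)
    return False
```

For example, run_with_retries(['onnxsim', 'transformer.onnx', 'transformer.sim.onnx']) would retry the command line tool, assuming it is on PATH.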

This looks like a serious issue, @daquexian.

SolomidHero commented 1 year ago

See the small example I provided in #206.