kraiskil / onnx2c

Open Neural Network Exchange to C compiler.

node operation torch_nn_modules_conv_Conv2d_conv1_1 #46

satabios opened this issue 2 months ago

satabios commented 2 months ago

I'm trying to convert a simple model with onnx2c. I have attached the architecture here: [image]

Upon compilation, I get [Fatal] (createNode) Unimplemented: node operation torch_nn_modules_conv_Conv2d_conv1_1

What am I doing wrong here? Any suggestions would be appreciated, thanks in advance.

kraiskil commented 2 months ago

This sounds strange. That error should appear when onnx2c doesn't find a requested node/operand type. But here it prints out a layer's name, not its type.

Could the .onnx file be badly formed? Can you try reading it with a few of the other tools out there (other than netron, which seems to work)? Or paste the file here?

null77 commented 1 month ago

I'm getting the same error using the latest pytorch+onnx tutorials, for all my simple models. I'm trying to determine whether something in the libraries or in the export has changed, but it's slow going because of my unfamiliarity with the formats. Internally I see there's a "function" definition that has the funky name, plus the real type inside of it:

functions {
  name: "torch_nn_modules_linear_Linear_fc_0_1"
  input: "l_x_"
  input: "fc.0.weight"
  input: "fc.0.bias"
  output: "addmm"
  node {
    input: "fc.0.weight"
    output: "t"
    name: "aten_t_0"
    op_type: "aten_t"
    domain: "pkg.onnxscript.torch_lib"
    doc_string: ""
  }
  node {
    input: "fc.0.bias"
    input: "l_x_"
    input: "t"
    output: "addmm"
    name: "aten_addmm_1"
    op_type: "aten_addmm"
    domain: "pkg.onnxscript.torch_lib"
    attribute {
      name: "alpha"
      type: FLOAT
      f: 1
    }
    attribute {
      name: "beta"
      type: FLOAT
      f: 1
    }
    doc_string: ""
  }
  opset_import {
    version: 18
  }
  opset_import {
    domain: "pkg.onnxscript.torch_lib"
    version: 1
  }
  domain: "pkg.torch.2.3.0+cu118"
}

And below, the graph's node has the funky name, and a similar but slightly different op_type:

  node {
    input: "l_x_"
    input: "fc.0.weight"
    input: "fc.0.bias"
    output: "fc_0_1"
    name: "torch_nn_modules_linear_Linear_fc_0_1_0"
    op_type: "torch_nn_modules_linear_Linear_fc_0_1"
    domain: "pkg.torch.2.3.0+cu118"
    doc_string: ""
  }
null77 commented 1 month ago

hello_world.onnx.gz — attached is the example I was looking at.

kraiskil commented 1 month ago

This really sounds like the pytorch onnx export is buggy.

@null77: the listing you printed - is that the pytorch model, or the exported onnx file? Could you list the exact steps you took to create the model and print it? Maybe that would give a hint about what is happening here.

null77 commented 1 month ago

@kraiskil it could be, I don't have the expertise to tell you if it's a bug in the export. I was following the pytorch example here:

https://pytorch.org/tutorials/beginner/onnx/export_simple_model_to_onnx_tutorial.html

You can follow the same steps as their simple model, produce the exported onnx file and attempt to process it with onnx2c. Happy to give you the full repro if you need it tomorrow when I'm back at my PC.

kraiskil commented 1 month ago

Thanks for the link. Following those steps, I guess you are using the new pytorch exporter?

torch.onnx.dynamo_export is the newest (still in beta) exporter based on the TorchDynamo technology released with PyTorch 2.0

Seems this uses a feature of onnx that hasn't popped up before - functions: https://onnx.ai/onnx/intro/concepts.html#functions

These functions are re-usable onnx graph snippets, and the TorchDynamo exporter is very happy to use them. Both examples above define everything as a Function, even when the function itself contains just a single operator.

As a quick fix, try using the old exporter mentioned in the tutorial, or maybe run the network through an onnx simplifier.

The proper fix is to handle these kinds of Functions in onnx2c, but that fix is probably not as quick...

null77 commented 1 month ago

Awesome, thanks for the update & suggested workaround.