For op fusion (such as fusing Conv and BatchNormalization), we have implemented a "small onnxruntime" in tensor.h. It increases the workload (the more fusions we want to do, the more ops we need to implement) and has caused many problems (https://github.com/onnx/optimizer/issues/6, https://github.com/onnx/optimizer/issues/8, https://github.com/onnx/onnx/issues/2677). However, as we know, onnx itself is not designed to run onnx ops. It is unwise to spend effort maintaining an "embedded runtime" when onnxruntime already exists.
In my opinion, we should drop the "embedded runtime". Instead, the optimizer should only rewrite the graph, and then call the onnxruntime library to fold the constants. That way, we will no longer need tensor.h or any other tensor library in the optimizer.
For example, to fuse Add(Add(x, 1), 2), instead of computing the result of Add(1, 2) in onnx-optimizer itself, we can simply rewrite the graph to Add(x, Add(1, 2)) and then call onnxruntime to fold Add(1, 2) into 3.
This is also how TensorFlow's built-in graph optimizer (Grappler) handles constant folding.