microsoft / onnxjs

ONNX.js: run ONNX models using JavaScript

Tix/concat packed #272

Closed · Tixxx closed this 3 years ago

Tixxx commented 3 years ago

packed version of concat

xzhu1900 commented 3 years ago

Shall we also add this packed concat kernel into onnxjs\lib\backends\webgl\op-resolve-rules.ts, so that models containing the concat op can be resolved with this kernel?
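For illustration, a minimal sketch of what such an entry could look like, assuming the rule table uses (op type, domain, opset range, factory) tuples like the existing rules; `WebGLPackedConcat` is a hypothetical name for the new kernel, not necessarily the class in this PR:

```ts
// Hypothetical resolve-rule entry, following the assumed
// [opType, domain, opsetRange, factory] tuple pattern.
type ResolveRule = [string, string, string, () => object];

declare class WebGLPackedConcat {}

const packedConcatRule: ResolveRule =
    ['Concat', '', '4+', () => new WebGLPackedConcat()];
```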

xzhu1900 commented 3 years ago

Consider adding an op test or model test, which creates an ONNX model on the fly with a specific op (in this case Concat). The benefit is that we would then have an end-to-end test in which the chained pack-concat-unpack sequence is executed.
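As a sketch of what such a test could exercise, here is a possible shape for a Concat test case; the field names are assumptions about the test harness's schema rather than its exact format, and the values are chosen so the expected output is easy to verify by hand:

```ts
// Hypothetical op-test case for Concat along axis 1; field names
// are illustrative, not necessarily the harness's exact schema.
const concatCase = {
  operator: 'Concat',
  attributes: [{name: 'axis', data: 1, type: 'int'}],
  inputs: [
    {data: [1, 2, 3, 4], dims: [2, 2], type: 'float32'},
    {data: [5, 6, 7, 8], dims: [2, 2], type: 'float32'},
  ],
  // Concatenating along axis 1: each row of the first input is
  // followed by the corresponding row of the second.
  outputs: [{data: [1, 2, 5, 6, 3, 4, 7, 8], dims: [2, 4], type: 'float32'}],
};
```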

Tixxx commented 3 years ago

> Shall we also add this packed concat kernel into onnxjs\lib\backends\webgl\op-resolve-rules.ts, so that models containing the concat op can be resolved with this kernel?

I was hesitant to do so since we no longer have a 1-to-1 mapping from an ONNX op to a WebGL kernel. I changed the old kernel API to be a wrapper around two different implementations, so that we can dynamically switch between packed and unpacked depending on the inputs, the pack mode, etc. Since it now keeps two versions of the op implementation, the memory overhead is unknown; I will need to check that.
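A minimal sketch of that wrapper pattern, with illustrative names (`ConcatImpl`, `ConcatWrapper`) rather than the actual onnxjs classes:

```ts
// Hypothetical sketch of the dispatching wrapper described above:
// one class owns both implementations and picks one per run based
// on runtime information. Names are illustrative.
interface ConcatImpl {
  run(inputs: Float32Array[], inputDims: number[][], axis: number): Float32Array;
}

class ConcatWrapper {
  // Keeping both implementations alive is the memory overhead
  // mentioned above; the objects hold no tensor data, only metadata.
  constructor(
      private packed: ConcatImpl,
      private unpacked: ConcatImpl,
      private packMode: boolean) {}

  run(inputs: Float32Array[], inputDims: number[][], axis: number): Float32Array {
    // Runtime dispatch: input shapes are only known here, not at
    // model-resolve time.
    const usePacked = this.packMode && inputDims[0].length > 1;
    const impl = usePacked ? this.packed : this.unpacked;
    return impl.run(inputs, inputDims, axis);
  }
}
```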

duli2012 commented 3 years ago

>> Shall we also add this packed concat kernel into onnxjs\lib\backends\webgl\op-resolve-rules.ts, so that models containing the concat op can be resolved with this kernel?
>
> I was hesitant to do so since we no longer have a 1-to-1 mapping from an ONNX op to a WebGL kernel. I changed the old kernel API to be a wrapper around two different implementations, so that we can dynamically switch between packed and unpacked depending on the inputs, the pack mode, etc. Since it now keeps two versions of the op implementation, the memory overhead is unknown; I will need to check that.

The dynamic approach offers more flexibility at the expense of runtime overhead (which should be negligible in most cases). The key question is whether there are scenarios in which runtime information, such as the inputs, is required to decide which kernel to pick, or whether static information such as kernel attributes is enough. In the latter case, the kernel selection can be done at model loading time, e.g. in op-resolve-rules.ts.
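A hypothetical sketch of the static alternative, assuming a session-level pack flag were the only input to the decision; names are illustrative, not the actual onnxjs API:

```ts
// Hypothetical: move kernel selection to model-load time, using
// only static information, so each node gets exactly one kernel.
interface Operator {
  run(inputs: Float32Array[]): Float32Array[];
}

declare class WebGLConcat implements Operator {
  run(inputs: Float32Array[]): Float32Array[];
}
declare class WebGLPackedConcat implements Operator {
  run(inputs: Float32Array[]): Float32Array[];
}

// A resolve-rule factory can branch on information known before
// inference starts (session flags, node attributes).
const createConcat = (packMode: boolean): Operator =>
    packMode ? new WebGLPackedConcat() : new WebGLConcat();
```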

Tixxx commented 3 years ago

>>> Shall we also add this packed concat kernel into onnxjs\lib\backends\webgl\op-resolve-rules.ts, so that models containing the concat op can be resolved with this kernel?
>>
>> I was hesitant to do so since we no longer have a 1-to-1 mapping from an ONNX op to a WebGL kernel. I changed the old kernel API to be a wrapper around two different implementations, so that we can dynamically switch between packed and unpacked depending on the inputs, the pack mode, etc. Since it now keeps two versions of the op implementation, the memory overhead is unknown; I will need to check that.
>
> The dynamic approach offers more flexibility at the expense of runtime overhead (which should be negligible in most cases). The key question is whether there are scenarios in which runtime information, such as the inputs, is required to decide which kernel to pick, or whether static information such as kernel attributes is enough. In the latter case, the kernel selection can be done at model loading time, e.g. in op-resolve-rules.ts.

Yeah, the memory overhead is very small in this case since the Operator class is very lightweight, and there's no difference in runtime perf. I think it's inevitable to have some differentiation based on input size, though I haven't measured it in depth myself. For instance, TFJS doesn't use the packed concat implementation for 1-D tensors, so I'm using similar logic here for concat.
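A sketch of that heuristic, under the assumption that, as in TFJS, packed textures arrange values in 2x2 blocks over the last two dimensions, which buys nothing for 1-D tensors:

```ts
// Hypothetical guard mirroring the TFJS heuristic described above:
// fall back to the unpacked kernel when any input is 1-D.
function shouldUsePackedConcat(inputDims: readonly number[][]): boolean {
  return inputDims.every(dims => dims.length > 1);
}
```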