apache / tvm

Open deep learning compiler stack for CPU, GPU, and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

[Frontend][TFLite] Batch_MatMul op support #11881

Open rafzi opened 2 years ago

rafzi commented 2 years ago

I was working on adding support for the TFLite op Batch_MatMul. When looking at the tests, I was a bit puzzled as to why there was already a test for that operator and why it passes.

https://github.com/apache/tvm/blob/main/tests/python/frontend/tflite/test_forward.py#L726

When I'm trying a model ( https://github.com/huggingface/tflite-android-transformers/blob/master/models_generation/gpt2.py using distilgpt2 ) with the op, I get the expected error:

  File "tvm/python/tvm/relay/frontend/tflite.py", line 279, in check_unsupported_ops
    raise tvm.error.OpNotImplemented(raise_msg)
tvm.error.OpNotImplemented: The following operators are not supported in frontend TFLite: 'BATCH_MATMUL'

The structure of the whole TFLite frontend test file is not quite clear to me. How would I then test that my new operator is converted correctly? It would be great if someone could clarify or add documentation.
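For reference, the existing tests in test_forward.py appear to follow a pattern roughly like this sketch: build a TF graph, then hand it to a helper that converts it to TFLite, runs both the TFLite interpreter and the TVM-compiled module, and compares the outputs. This is a hedged sketch of that pattern, not the verbatim test; compare_tflite_with_tvm is the helper defined in test_forward.py and its exact signature may differ:

    import numpy as np
    import tensorflow as tf
    from tensorflow.python.ops import array_ops, math_ops

    def _test_batch_matmul(a_shape, b_shape, dtype="float32"):
        with tf.Graph().as_default():
            a = array_ops.placeholder(shape=a_shape, dtype=dtype, name="A")
            b = array_ops.placeholder(shape=b_shape, dtype=dtype, name="B")
            result = math_ops.matmul(a, b, name="batchmatmul")
            a_np = np.random.uniform(high=5.0, size=a_shape).astype(dtype)
            b_np = np.random.uniform(high=5.0, size=b_shape).astype(dtype)
            # compare_tflite_with_tvm (from test_forward.py) converts the
            # graph to TFLite, feeds the inputs through both the TFLite
            # interpreter and the TVM TFLite frontend, and asserts that
            # the outputs match.
            compare_tflite_with_tvm([a_np, b_np], ["A:0", "B:0"], [a, b], [result])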

@FrozenGene @u99127 @siju-samuel

u99127 commented 2 years ago

I'd suggest saving / writing out the TFLite model to inspect what actually gets generated in this particular case from the use of math_ops.matmul in the framework. I suspect it produces a MatMul operator rather than a BatchMatMul operator. However, producing the model from the test might be the only way of seeing what is actually being consumed by the frontend.

That certainly looks puzzling.
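A minimal sketch of such an inspection, assuming the tflite pip package (the same flatbuffer bindings the TVM frontend uses) is installed; opcode2name is a helper from that package, and older package versions expose the model class as tflite.Model.Model instead:

    import tflite

    with open("model.tflite", "rb") as f:
        buf = f.read()

    # On older versions of the tflite package this may be
    # tflite.Model.Model.GetRootAsModel(buf, 0).
    model = tflite.Model.GetRootAsModel(buf, 0)
    subgraph = model.Subgraphs(0)
    for i in range(subgraph.OperatorsLength()):
        op = subgraph.Operators(i)
        opcode = model.OperatorCodes(op.OpcodeIndex())
        # Print the builtin operator name, e.g. "FULLY_CONNECTED"
        # or "BATCH_MATMUL".
        print(tflite.opcode2name(opcode.BuiltinCode()))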

u99127 commented 2 years ago

Ah, it came in with https://github.com/apache/tvm/pull/5510. Perhaps that helps with the archaeology and with identifying the possibly misnamed tests.

rafzi commented 2 years ago

Yes, you are right. The models produced by the test do not contain the operator.

As far as I can tell, math_ops.matmul should produce a batch_matmul for certain parameters. I'll try to figure out the specifics. Thank you!

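A quick way to check this at the TF graph level, as a sketch: in TF 2.x, tf.matmul on operands of rank greater than 2 should lower to a BatchMatMulV2 node, which can be confirmed by listing the ops of the traced graph:

    import tensorflow as tf

    @tf.function(input_signature=[
        tf.TensorSpec([2, 3, 4], tf.float32),
        tf.TensorSpec([2, 4, 5], tf.float32),
    ])
    def f(a, b):
        # Rank-3 operands: tf.matmul dispatches to a batched matmul op.
        return tf.matmul(a, b)

    graph = f.get_concrete_function().graph
    print([op.type for op in graph.get_operations()])
    # Expected to include 'BatchMatMulV2' among the graph ops.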

rafzi commented 2 years ago

The following transformation seems to unroll all BatchMatMul ops to MatMul:

https://github.com/tensorflow/tensorflow/blob/f72dafde88e0c32bc64144cdacc45b7b46d3c914/tensorflow/lite/toco/graph_transformations/unroll_batch_matmul.cc#L136
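In other words, the converter decomposes a batched matmul into per-batch 2-D matmuls whose results are stacked back together, roughly like this numpy sketch of the idea:

    import numpy as np

    def unrolled_batch_matmul(a, b):
        # [B, M, K] x [B, K, N]: slice out each batch, do a plain 2-D
        # matmul, then stack the results back into [B, M, N].
        return np.stack([a[i] @ b[i] for i in range(a.shape[0])])

    a = np.random.rand(4, 3, 5)
    b = np.random.rand(4, 5, 2)
    assert np.allclose(unrolled_batch_matmul(a, b), a @ b)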

Using the MLIR converter switches from Slice to Split, but it still does the unrolling:

[screenshot: converted graph, still unrolled, using Split ops]

So, as far as I can tell, it does not seem possible to generate the BatchMatMul op with the current converter. Maybe the model I referred to was created with an earlier converter version that did not apply this transformation?
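For anyone trying to reproduce this, a hedged sketch of explicitly selecting the MLIR-based converter in TF 2.x (the experimental_new_converter attribute exists on TFLiteConverter and defaults to True since TF 2.2):

    import tensorflow as tf

    @tf.function(input_signature=[
        tf.TensorSpec([2, 3, 4], tf.float32),
        tf.TensorSpec([2, 4, 5], tf.float32),
    ])
    def batch_mm(a, b):
        return tf.matmul(a, b)

    converter = tf.lite.TFLiteConverter.from_concrete_functions(
        [batch_mm.get_concrete_function()])
    converter.experimental_new_converter = True  # MLIR converter path
    with open("batch_mm.tflite", "wb") as f:
        f.write(converter.convert())
    # Inspect batch_mm.tflite (e.g. with the flatbuffer sketch above)
    # to see which ops the converter actually emitted.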