Hi @Rellek72, can you provide the code and model files?
Of course, example below.
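A minimal sketch of the failing pattern (illustrative names and shapes, not the exact model): a Keras model whose matmul consumes two activation tensors.

```python
import tensorflow as tf

# Illustrative sketch (not the original model): a matmul whose
# operands are both dynamic activation tensors, as in attention.
a = tf.keras.Input(shape=(16, 8), name="a")
b = tf.keras.Input(shape=(8, 16), name="b")
y = tf.linalg.matmul(a, b)  # both operands are activations
model = tf.keras.Model(inputs=[a, b], outputs=y)
model.summary()
```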
I've also been discussing this issue on the Xilinx forums here.
Hi @Rellek72, the root cause of the issue is that xcompiler does not support deploying tf.linalg.matmul to the DPU.
@zhenzhen-AMD Could you please update the documentation accordingly? Matrix multiplies are listed as supported here, and the user guide explicitly refers to matmul as a supported operation for TensorFlow.
Hi @Rellek72,
Thank you for sharing your model with us. After reviewing it, it seems that your model doesn't satisfy the necessary criteria for converting a matmul operation into conv2d.
For a successful conversion from matmul to conv2d, one operand of the matmul must be an activation tensor (inputs) and the other must be a weight tensor (weights). In the model you've provided, however, both operands of the matmul are activation tensors. As a result, it cannot be deployed to the DPU.
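To illustrate the distinction, a hedged sketch (shapes are arbitrary):

```python
import numpy as np
import tensorflow as tf

x = tf.keras.Input(shape=(64,))

# Convertible: one operand is a constant weight tensor, so the
# matmul can be folded into a conv2d/dense node for the DPU.
w = tf.constant(np.random.rand(64, 32).astype(np.float32))
y_ok = tf.linalg.matmul(x, w)

# Not convertible: both operands are activation tensors, so there
# is no weight tensor to fold and the op cannot map to conv2d.
p = tf.keras.Input(shape=(64, 32))
q = tf.keras.Input(shape=(32, 16))
y_bad = tf.linalg.matmul(p, q)
```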
This is aligned with the constraints outlined in the documentation.
Closing this for now. Please re-open if there are any other concerns. Thanks.
I greatly appreciate your help with this and your assistance finding workarounds on the Xilinx forums. However, I would highly recommend amending the documentation to make it explicitly clear that matmuls with two dynamic inputs aren't supported, as that pattern is central to modern model architectures like the Transformer.
Thank you for your feedback and strong suggestion. I've relayed your feedback to the relevant team.
Although `tf.linalg.matmul` can now be quantized, addressing this issue, it is not properly identified during compilation. There is a dedicated xnode class for matmuls that cites `tf.linalg.matmul` as a function it translates (XModelNodeMatMul); however, the compiler consistently fails to link it and instead sets it as an unknown node, as you can see from the error below (the final error is on line 7090, inside the XModelNodeUnknown class). Dense/Linear layers still quantize and compile as expected.

I have tried all iterations and implementations of matmul inside TF2, and all of them either fail to quantize or fail to compile as shown above. I get a similar error when attempting to Inspect the model before quantization.
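For reference, the variants I tried look roughly like this (a sketch; names and shapes are placeholders):

```python
import tensorflow as tf

q = tf.keras.Input(shape=(64, 32), name="q")
k = tf.keras.Input(shape=(64, 32), name="k")

# 1. Direct op call (Keras records this as a TFOpLambda layer).
scores_a = tf.linalg.matmul(q, k, transpose_b=True)

# 2. Python @ operator (also lowers to tf.linalg.matmul).
scores_b = q @ tf.transpose(k, perm=[0, 2, 1])

# 3. Keras Dot layer over the feature axes.
scores_c = tf.keras.layers.Dot(axes=(2, 2))([q, k])

model = tf.keras.Model([q, k], scores_a)
```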
I noticed that the Hugging Face BERT transformer example instead uses QuantTFOpLambda, which I attempted to use but had difficulty implementing.
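For context, the quantization flow I was attempting follows the standard Vitis AI TF2 API (a sketch; `model` and `calib_ds` are placeholders, not my exact script):

```python
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# `model` contains the tf.linalg.matmul call; Keras records it as a
# TFOpLambda layer, which the quantizer is expected to wrap as
# QuantTFOpLambda. `calib_ds` is a placeholder calibration dataset.
quantizer = vitis_quantize.VitisQuantizer(model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_ds)
quantized_model.save("quantized.h5")
```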
If anyone has any advice on how to implement a simple matrix multiply that the quantizer and compiler can detect and handle, it would be greatly appreciated.