Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

TF2 MatMul unknown in inspect & compile #1276

Closed tafk7 closed 1 year ago

tafk7 commented 1 year ago

Although tf.linalg.matmul can now be quantized (addressing this issue), it is not properly identified during compilation. There is a dedicated xnode class for matmuls, XModelNodeMatMul, which cites tf.linalg.matmul as a function it translates; however, the compiler consistently fails to link it and instead marks the op as an unknown node, as you can see from the error below (the final assertion is raised at line 7090, inside the XModelNodeUnknown class). Dense/Linear layers still quantize and compile as expected.

vitis-ai-user@weave:/workspace/matmul_example/tf$ vai_c_tensorflow2 -m models/t1_Q.h5 -a /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU104/arch.json  -o models -n Model
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[INFO] Namespace(batchsize=1, inputs_shape=None, layout='NHWC', model_files=['models/t1_Q.h5'], model_type='tensorflow2', named_inputs_shape=None, out_filename='/tmp/Model_DPUCZDX8G_ISA1_B4096_org.xmodel', proto=None)
[INFO] tensorflow2 model: /workspace/matmul_example/tf/models/t1_Q.h5
[INFO] keras version: 2.12.0
[INFO] Tensorflow Keras model type: functional
[INFO] parse raw model     :100%|█████████████████████████████████████████████████████| 7/7 [00:00<00:00, 23677.52it/s]                 
[INFO] infer shape (NHWC)  : 36%|███████████████████▎                                 | 4/11 [00:00<00:00, 42581.77it/s]                
Traceback (most recent call last):
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/bin/xnnc-run", line 33, in <module>
    sys.exit(load_entry_point('xnnc==3.5.0', 'console_scripts', 'xnnc-run')())
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/__main__.py", line 49, in main
    runner.normal_run(args)
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/runner.py", line 116, in normal_run
    XConverter.run(
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/xconverter.py", line 144, in run
    xmodel = CORE.make_xmodel(
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/core.py", line 118, in make_xmodel
    xmodel = translator.to_xmodel(
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/translator/tensorflow_translator.py", line 103, in to_xmodel
    xmodel = cls.create_xmodel(
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/translator/tensorflow_translator.py", line 179, in create_xmodel
    xmodel = cls.__create_xmodel_from_tf2(
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/translator/tensorflow_translator.py", line 2432, in __create_xmodel_from_tf2
    if not xmodel.infer_shape(Layout.NHWC):
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/ir/xmodel.py", line 507, in infer_shape
    ok, error = self.__do_shape_inference(Layout[self.layout])
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/ir/xmodel.py", line 532, in __do_shape_inference
    xnode.infer_shape(layout)
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/xnnc/ir/xnode.py", line 7090, in infer_shape
    assert 'shape' in self.tmp_params, f"'{self.op_name}' is an unknown op that requires a specific shape. Please provide shape information."
AssertionError: 'tf.linalg.matmul' is an unknown op that requires a specific shape. Please provide shape information

I have tried every iteration and implementation of matmul inside of TF2, and all of them either fail to quantize or fail to compile as shown above. I get a similar error when attempting to inspect the model before quantization:

Traceback (most recent call last):
  File "test.py", line 102, in <module>
    main()
  File "test.py", line 79, in main
    tb.inspect(model, in_shape, model_path, target) 
  File "/workspace/matmul_example/tf/tb_tf.py", line 128, in inspect
    inspector.inspect_model(model, 
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/tensorflow_model_optimization/python/core/quantization/keras/vitis/vitis_inspect.py", line 649, in inspect_model
    inspect_results = self._extract_inspect_results(float_model,
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/tensorflow_model_optimization/python/core/quantization/keras/vitis/vitis_inspect.py", line 263, in _extract_inspect_results
    logger.error('Cannot find layer {}\'s final layer.'.format(layer.name))
  File "/opt/vitis_ai/conda/envs/vitis-ai-tensorflow2/lib/python3.8/site-packages/tensorflow_model_optimization/python/core/quantization/keras/vitis/utils/common_utils.py", line 75, in error
    raise err_type('[VAI ERROR] ' + msg)
ValueError: [VAI ERROR] Cannot find layer tf.linalg.matmul's final layer.

I noticed that the Hugging Face BERT Transformer example instead uses QuantTFOpLambda, which I attempted to use but had difficulty implementing.

If anyone has any advice on how to implement a simple matrix multiply that the quantizer and compiler can detect and handle, it would be greatly appreciated.
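For reference, here is a minimal sketch of the pattern I'm trying to compile (shapes and names are illustrative, not the exact model):

import tensorflow as tf

# Both matmul operands are model inputs (dynamic tensors); this is the
# pattern that fails to compile.
a = tf.keras.Input(shape=(8, 16), name="a")
b = tf.keras.Input(shape=(16, 4), name="b")
y = tf.linalg.matmul(a, b)  # appears in the graph as 'tf.linalg.matmul'
model = tf.keras.Model(inputs=[a, b], outputs=y)
model.summary()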

zhenzhen-AMD commented 1 year ago

Hi @Rellek72, can you provide the code and model files?

tafk7 commented 1 year ago

Of course, example below.

matmul_example.zip

I've also been discussing this issue on the Xilinx forums here.

zhenzhen-AMD commented 1 year ago

Hi @Rellek72, the root cause of the issue is that xcompiler does not support deploying tf.linalg.matmul to the DPU.

tafk7 commented 1 year ago

@zhenzhen-AMD Could you please update the documentation accordingly? Matrix multiplies are listed as supported here, and the user guide explicitly refers to matmul as a supported operation for TensorFlow.

zhenzhen-AMD commented 1 year ago

Hi @Rellek72,

Thank you for sharing your model with us. After reviewing it, it seems that your model doesn't satisfy the necessary criteria for converting a matmul operation into a conv2d.

For a successful conversion from matmul to conv2d, one of the inputs to the matmul operation must be the input tensor (inputs), while the other must be the weight tensor (weights). In the model you've provided, however, both inputs to the matmul operation are input tensors, so it cannot be deployed to the DPU.

This is aligned with what's outlined in the documentation: [screenshot of the relevant documentation section]
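To illustrate the distinction, here is a minimal sketch (shapes are illustrative):

import tensorflow as tf

# Not deployable: both matmul operands are activations (input tensors),
# so xcompiler has no fixed weight tensor to fold the op into a conv2d.
a = tf.keras.Input(shape=(8, 16))
b = tf.keras.Input(shape=(16, 4))
not_deployable = tf.linalg.matmul(a, b)

# Deployable pattern: one operand is a fixed weight tensor. A Dense
# layer expresses exactly inputs @ weights and, as noted earlier in
# this thread, quantizes and compiles as expected.
x = tf.keras.Input(shape=(8, 16))
deployable = tf.keras.layers.Dense(4, use_bias=False)(x)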

zhenzhen-AMD commented 1 year ago

Closing this for now. Please re-open if there are any other concerns. Thanks.

tafk7 commented 1 year ago

I greatly appreciate your help with this and your assistance finding workarounds on the Xilinx forums. However, I would strongly recommend amending the documentation to make it explicitly clear that matmuls with two dynamic inputs aren't supported, as that is a very significant use case in modern model architectures like the Transformer.
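For example, the attention-score computation at the core of a Transformer is exactly such a matmul, since both operands are activations derived from the input (illustrative sketch):

import tensorflow as tf

# Scaled dot-product attention scores: Q and K are both activations, so
# Q @ K^T is a matmul with two dynamic inputs, i.e. the unsupported case.
def attention_scores(q, k):
    d_k = tf.cast(tf.shape(k)[-1], q.dtype)
    return tf.linalg.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)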

zhenzhen-AMD commented 1 year ago

Thank you for your feedback and strong suggestion. I've relayed it to the relevant team.