Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Model Inspection Vitis AI V3.0 #1310

Open IkrameBeggar opened 11 months ago

IkrameBeggar commented 11 months ago

I just ran model inspection in Vitis AI v3.0 as described in the user guide https://docs.xilinx.com/r/en-US/ug1414-vitis-ai/Inspecting-the-Float-Model. However, I am having difficulty understanding the report. Can anyone help with that? Thank you in advance.

This is what I got in the report:

nndct_strided_slice can't be assigned to DPU
nndct_leaky_relu can't be assigned to DPU

Screenshot from 2023-08-10 16-08-11
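For reference, the inspection was run roughly along these lines, following the PyTorch inspector flow in UG1414 (a minimal sketch; the target name DPUCZDX8G_ISA1_B4096, the placeholder model, and the input shape are illustrative, not my exact setup):

```python
import torch
import torch.nn as nn
from pytorch_nndct.apis import Inspector

# Placeholder target and network; substitute your own DPU target and model.
inspector = Inspector("DPUCZDX8G_ISA1_B4096")
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.LeakyReLU(0.01)).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Writes the partition report and an annotated graph image into ./inspect,
# marking which nodes can be assigned to the DPU and which fall back to the CPU.
inspector.inspect(model, (dummy_input,), device=torch.device("cpu"),
                  output_dir="inspect", image_format="png")
```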

mounirouadi commented 11 months ago

The messages you're encountering, specifically "nndct_strided_slice can't be assigned to DPU" and "nndct_leaky_relu can't be assigned to DPU," relate to deploying your neural network model on the DPU (Deep Learning Processing Unit) with the Vitis AI tools. They indicate that certain operations in your model are not supported by the DPU and cannot be assigned to it, so the compiler will partition them into subgraphs that run on the CPU instead.

Here's what each message means and potential steps to address the issue:

nndct_strided_slice can't be assigned to DPU:
The "nndct_strided_slice" layer might refer to a specific operation in your neural network that involves strided slicing of data. Strided slicing might not have a direct equivalent operation that can be efficiently mapped onto the DPU.

Solution: You could consider using alternative techniques to achieve the same result as the strided slice operation. If this operation is crucial to your model and can't be easily replaced, you might need to modify your model architecture or investigate whether the DPU supports custom layer extensions.
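For instance, if the strided slice is a simple spatial downsampling such as x[..., ::2, ::2], one option is to express it as a fixed 1x1, stride-2 depthwise convolution, which the DPU can execute. This is only a sketch for that specific pattern; whether it applies depends on where the strided slice actually occurs in your graph:

```python
import torch
import torch.nn as nn

class StridedSlice2x(nn.Module):
    """Reproduces x[..., ::2, ::2] with a fixed 1x1, stride-2 depthwise convolution,
    so the downsampling can be assigned to the DPU instead of a strided slice."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1, stride=2,
                              groups=channels, bias=False)
        nn.init.ones_(self.proj.weight)         # identity weight per channel
        self.proj.weight.requires_grad_(False)  # keep the weights frozen

    def forward(self, x):
        return self.proj(x)

# Sanity check: the conv-based module matches the strided slice exactly.
x = torch.randn(1, 16, 64, 64)
assert torch.allclose(StridedSlice2x(16)(x), x[..., ::2, ::2])
```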

nndct_leaky_relu can't be assigned to DPU:
The "nndct_leaky_relu" layer seems to be a leaky ReLU activation function. Certain activation functions might not have direct hardware support on the DPU.

Solution: Consider using a different activation function that is supported by the DPU. Leaky ReLU is typically used to avoid the dying-ReLU problem (zero gradients for negative inputs); a practical substitute is plain ReLU or ReLU6, which are broadly supported on DPU targets, whereas sigmoid and tanh are usually not DPU-accelerated either.
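A minimal sketch of such a swap in PyTorch, assuming the activations are registered as nn.LeakyReLU modules (activations called through torch.nn.functional would have to be edited in the model's forward instead); some fine-tuning is usually needed afterwards to recover accuracy:

```python
import torch.nn as nn

def replace_leaky_relu(module: nn.Module) -> None:
    """Recursively replace nn.LeakyReLU activations with nn.ReLU
    so the activation can be fused and mapped onto the DPU."""
    for name, child in module.named_children():
        if isinstance(child, nn.LeakyReLU):
            setattr(module, name, nn.ReLU(inplace=child.inplace))
        else:
            replace_leaky_relu(child)

# Usage: replace_leaky_relu(model), then fine-tune and re-run the inspector.
```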

In general, when you encounter these types of errors during model deployment on specialized hardware like the DPU, you have a few options:

Layer Replacement: Replace unsupported layers or operations with equivalent supported operations. This might require retraining your model or modifying its architecture.

Custom Layer Extensions: If the unsupported layers are critical for your model and cannot be replaced easily, you could explore creating custom layer extensions that provide hardware-accelerated implementations for these operations on the DPU. This might involve writing custom code and integrating it with the Vitis AI framework.

Quantization and Optimization: Sometimes, model quantization and optimization techniques can help make your model more compatible with hardware accelerators like the DPU. These techniques can reduce the complexity of certain operations and improve hardware compatibility.
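If you have not run it yet, here is a minimal sketch of the Vitis AI PyTorch post-training quantization flow (the tiny placeholder network and the input shape are only for illustration; calibration should use real data from your dataset):

```python
import torch
import torch.nn as nn
from pytorch_nndct.apis import torch_quantizer

# Placeholder network and input; substitute your own model and calibration data.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10)).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# 1) Calibration: run representative data through the quantized wrapper model.
quantizer = torch_quantizer("calib", model, (dummy_input,))
quant_model = quantizer.quant_model
quant_model(dummy_input)              # replace with a loop over a calibration set
quantizer.export_quant_config()

# 2) Test/export: re-create the quantizer and export the .xmodel for compilation.
quantizer = torch_quantizer("test", model, (dummy_input,))
quant_model = quantizer.quant_model
quant_model(dummy_input)
quantizer.export_xmodel(deploy_check=False)
```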

Check Hardware Constraints: Ensure that your model architecture and layer choices are in line with the hardware constraints and capabilities of the DPU. The DPU has specific limitations in terms of supported operations, data types, and memory usage.

Consult Documentation and Community: The Vitis AI documentation and community forums can be valuable resources for understanding these deployment issues and finding solutions. You might find guidance from developers who have faced similar challenges.

cchalou98 commented 11 months ago

Depending on the operations causing the issues, you may need to modify your model architecture. This could involve removing or replacing the unsupported operations with equivalent operations that are supported by the DPU.

IkrameBeggar commented 11 months ago

> Depending on the operations causing the issues, you may need to modify your model architecture. This could involve removing or replacing the unsupported operations with equivalent operations that are supported by the DPU.

I tried, but I'm still getting the same error.