fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

[Vitis hls] Cannot apply array transformation pragma/directive because of full array load/store. #805

Closed · Duchstf closed this issue 1 year ago

Duchstf commented 1 year ago

Quick summary

I was trying to convert a quantized Conv1D model using Vitis HLS, and was seeing this error:

ERROR: [HLS 214-384] in function 'void nnet::conv_1d_latency_cl<ap_fixed<9, 3, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>, config2>(ap_fixed<9, 3, (ap_q_mode)5, (ap_o_mode)3, 0>*, ap_fixed<16, 6, (ap_q_mode)5, (ap_o_mode)3, 0>*, config2::weight_t*, config2::bias_t*) (.304)': Cannot apply array transformation pragma/directive because of full array load/store. (firmware/nnet_utils/nnet_conv1d_latency.h:81:11)
ERROR: [HLS 200-1715] Encountered problem during source synthesis

Details

I'm using the current main branch of hls4ml, where the latest commit is:

commit 2e71ff451cc36ce8e7319a92b65a5d9be8bff427 (HEAD -> main, origin/main, origin/HEAD)
Merge: eca1ea37 9d2e6640
Author: Jovan Mitrevski <jmitrevs@fnal.gov>
Date:   Tue May 16 13:55:19 2023 -0500

    Merge pull request #796 from fastmachinelearning/pre-commit-ci-update-config

Steps to Reproduce

I included the .h5 file of the model here.

And a script to convert the model here.

You can run the conversion with:

python hls_9bit_model.py
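
For context, a minimal sketch of what a conversion flow along these lines typically looks like in hls4ml (this is not the actual hls_9bit_model.py; the model file name, FPGA part, and output directory below are placeholders):

# Sketch of an hls4ml conversion for a quantized (QKeras) Conv1D model.
# Assumed placeholders: 'model_9bit.h5', the output directory, and the part number.
from qkeras.utils import load_qmodel
import hls4ml

# Load the quantized Keras model (load_qmodel registers the QKeras custom layers)
model = load_qmodel('model_9bit.h5')

# Build a per-layer hls4ml configuration
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Convert with the Vitis backend, which is where the error appears
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='Vitis',
    output_dir='hls_9bit_prj',
    part='xcvu13p-flga2577-2-e',
)

# C synthesis is the step that raises the 'full array load/store' error
hls_model.compile()
hls_model.build(csim=False, synth=True)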

I'm not really sure what's happening here ...

jmduarte commented 1 year ago

@Duchstf which Vitis version are you using?

@vloncar may have some guidance on whether the current unrolled conv1d latency implementation is expected to work in Vitis.

I can try to take a look.

Duchstf commented 1 year ago

@jmduarte I was using the 2022.2 Vivado release. I was looking at the code and wondering the same thing, i.e. whether the conv1d latency implementation is expected to work there.

If not, I'll be happy to contribute. This is important for both the b-tagging and the LLP models we are developing in the L1 trigger.

Duchstf commented 1 year ago

I tested out #815 and it fixes this error. Here are the latency/resource reports in case anyone is interested:

Latency:

[screenshot: latency report]

Resources:

[screenshot: resource usage report]

The II is still a bit higher than I expected, but that is subject to further optimization. I will also try out what Javier proposes in #811 for the conv layers.

vloncar commented 1 year ago

The II is as expected; you need to increase the ParallelizationFactor of the two Conv layers to bring it down.
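
For reference, a minimal sketch of how ParallelizationFactor can be set per layer in the hls4ml configuration (the layer names 'conv1d_1'/'conv1d_2', the factor of 4, and the model file name are placeholders, not taken from the actual model):

# Sketch: raising ParallelizationFactor on the Conv1D layers to lower the II.
from qkeras.utils import load_qmodel
import hls4ml

model = load_qmodel('model_9bit.h5')  # placeholder file name
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# A larger ParallelizationFactor unrolls more of the convolution,
# trading extra resources for a lower initiation interval (II).
config['LayerName']['conv1d_1']['ParallelizationFactor'] = 4
config['LayerName']['conv1d_2']['ParallelizationFactor'] = 4

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, backend='Vitis', output_dir='hls_prj'
)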