fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0
1.24k stars 402 forks source link

Conv layer changes other layers to Resource unless specifically set #759

Open jmitrevs opened 1 year ago

jmitrevs commented 1 year ago

Quick summary

If you have a conv layer in your network, setting the Strategy at the top level is overwritten after the first Conv layer. If you want to have all the nodes be implemented with the Latency strategy,

hls_config["Model"]["Strategy"] = "Latency"

is not enough. You have to have:

for layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][layer]['Strategy'] = "Latency"

Details

I believe this is because Conv requires the dataflow style at the top level. The Conv layer changes the setup to dataflow but inadvertently changes all the layers not explicitly configured to "Resource". The cleanest solution may be to split the dataflow style from layer Resource/Latency strategies.

Steps to Reproduce

A simple setup is to look at test_binary_cnn.py (from #749) and change the strategy to Latency. You can see that only the first CNN is Latency, the others are Resource, in the parameters.h.

Optional

Possible fix

The cleanest solution may be to split the dataflow style from layer Resource/Latency strategies.

vandenBergArthur commented 1 year ago

Hi, I have a quick question related to this. When I create my HLS model with hls4ml.converters.keras_to_hls, I get the following warning: WARNING: Cannot use "Latency" model strategy for conv2d layer. Switching to "Resource" strategy.

Is this a bug? If not: why can't I use Latency strategy?

jmitrevs commented 1 year ago

You can get rid of that warning by having

hls_config["Model"]["Strategy"] = "Resource"
for layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][layer]['Strategy'] = "Latency"

Then you can use Latency for the layers but there needs to be a dataflow construct at the top level. Dataflow has traditionally been associated with Resource. We will make this less confusing in the future. It is a bug that you cannot just say hls_config["Model"]["Strategy"] = "Latency"

vandenBergArthur commented 1 year ago

Thanks for the reply @jmitrevs! The warning is gone. However, when compiling my HLS model I'm having weird errors that I've not seen before.

In file included from firmware/myproject.cpp:22:
firmware/parameters.h:27:44: error: ‘Latency’ is not a member of ‘nnet’; did you mean ‘latency’?
   27 |     static const unsigned strategy = nnet::Latency;
      |                                            ^~~~~~~
      |                                            latency
firmware/parameters.h:58:44: error: ‘Latency’ is not a member of ‘nnet’; did you mean ‘latency’?
   58 |     static const unsigned strategy = nnet::Latency;
      |                                            ^~~~~~~
      |                                            latency
firmware/myproject.cpp: In function ‘void myproject(hls::stream<nnet::array<ap_fixed<16, 6>, 25> >&, hls::stream<nnet::array<ap_fixed<16, 6>, 25> >&)’:
firmware/myproject.cpp:53:11: error: ‘pointwise_conv_2d_cf’ is not a member of ‘nnet’; did you mean ‘pointwise_conv_2d_cl’?
   53 |     nnet::pointwise_conv_2d_cf<input_t, layer8_t, config8>(input1, layer8_out, w8, b8); // conv1
      |           ^~~~~~~~~~~~~~~~~~~~
      |           pointwise_conv_2d_cl
firmware/myproject.cpp:53:39: error: expected primary-expression before ‘,’ token
   53 |     nnet::pointwise_conv_2d_cf<input_t, layer8_t, config8>(input1, layer8_out, w8, b8); // conv1
      |                                       ^
firmware/myproject.cpp:53:49: error: expected primary-expression before ‘,’ token
   53 |     nnet::pointwise_conv_2d_cf<input_t, layer8_t, config8>(input1, layer8_out, w8, b8); // conv1
      |                                                 ^
firmware/myproject.cpp:53:58: error: expected primary-expression before ‘>’ token
   53 |     nnet::pointwise_conv_2d_cf<input_t, layer8_t, config8>(input1, layer8_out, w8, b8); // conv1
      |                                                          ^
g++: error: myproject.o: No such file or directory

Is dataformat = channels_first no longer supported in the conv2d layer? I noticed that the nnet files have been changed recently. (I am using main branch.)

vandenBergArthur commented 1 year ago

Hi, I'm having the same issue with Dense layers. WARNING: Cannot use "Latency" model strategy for dense1 layer. Switching to "Resource" strategy. So this bug might not be limited to convolution layers.

jmitrevs commented 1 year ago

Generally channels-first is not recommended. With pytorch parsing via onnx (currently in a PR that will be revised before merging, but still usable), we convert to channels-last as a part of the parsing step, and I believe direct pytorch in the future (also in a PR) will also do the same. (The pytorch and onnx parsers in main are fairly old.) Nevertheless, I will look into the capitalization issue.