fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Error changing bit width precision for an RNN #825

Closed: olaiya closed this issue 1 year ago

olaiya commented 1 year ago

Hello

I'm using hls4ml from the main branch. Checking hls4ml.__version__ I get 0.8.0.dev11+gabaea98a.

I'm looking at converting an RNN into VHDL using hls4ml. It works, but when I try to change the bit width of the RNN precision I get errors. I have put an example on GitHub to demonstrate the problem.

The example can be found here:

https://github.com/olaiya/rnn_hls4ml/blob/master/rnn_hls4ml.py

If I uncomment line 106, I get errors (error snippet attached at the end).

Is this a problem, or am I setting the precision incorrectly?
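For reference, the flow in the script is roughly the following (a simplified, hypothetical sketch, not the exact code from the repo; the layer sizes and output directory are placeholders):

```python
import tensorflow as tf
import hls4ml

# Small GRU model standing in for the one in the example script
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8, 1)),
    tf.keras.layers.GRU(20),
])

# Generate an hls4ml config with per-layer ('name') granularity
hls_config_reg = hls4ml.utils.config_from_keras_model(model, granularity='name')

# The kind of override that line 106 of the script performs
# (the line that triggers the error when uncommented):
hls_config_reg['Model']['Precision'] = 'ap_fixed<48,24>'

# Convert; the C++ error quoted below appears when the generated code is compiled
hls_model = hls4ml.convert_from_keras_model(
    model,
    hls_config=hls_config_reg,
    io_type='io_stream',      # streaming I/O, as in the error trace
    output_dir='rnn_prj',     # placeholder output directory
)
hls_model.compile()
```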

You can run the example with:

git clone https://github.com/olaiya/rnn_hls4ml.git
cd rnn_hls4ml/
python3 rnn_hls4ml.py

Thanks

Manny

In file included from firmware/parameters.h:15,
                 from firmware/myproject.cpp:4:
firmware/nnet_utils/nnet_recurrent.h: In instantiation of ‘void nnet::gru_static(bool, data_T, res_T, typename CONFIG_T::weight_t, typename CONFIG_T::weight_t, typename CONFIG_T::bias_t, typename CONFIG_T::bias_t) [with data_T = ap_fixed<16, 6>; res_T = ap_fixed<16, 6>; CONFIG_T = config2; typename CONFIG_T::weight_t = ap_fixed<16, 6>; typename CONFIG_T::bias_t = ap_fixed<16, 6>]’:
firmware/nnet_utils/nnet_recurrent.h:539:96:   required from ‘void nnet::gru_stack(hls::stream&, hls::stream&, typename CONFIG_T::weight_t, typename CONFIG_T::weight_t, typename CONFIG_T::bias_t, typename CONFIG_T::bias_t) [with data_T = nnet::array<ap_fixed<16, 6>, 1>; res_T = nnet::array<ap_fixed<16, 6>, 20>; CONFIG_T = config2; typename CONFIG_T::weight_t = ap_fixed<16, 6>; typename CONFIG_T::bias_t = ap_fixed<16, 6>]’
firmware/myproject.cpp:41:88:   required from here
firmware/nnet_utils/nnet_recurrent.h:440:100: error: cannot convert ‘config2::accum_t [40]’ {aka ‘ap_fixed<48, 24> [40]’} to ‘ap_fixed<16, 6>*’
  440 |     typename CONFIG_T::ACT_CONFIG_GRU>::activation(inputacc_zr, tmpres_zr);
      |                                                                 ^~~~~

......

bo3z commented 1 year ago

To change the precision, set it in the config_from_keras_model call, for example: hls_config_reg = hls4ml.utils.config_from_keras_model(model, granularity='name', default_precision='ap_fixed<48,24>')

The commented-out line hls_config_reg['Model']['Precision'] = 'ap_fixed<48,24>' can then be left out; the call above has the same effect. However, keep in mind that this is quite a high precision, higher than standard 32-bit floating point.
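For concreteness, a minimal sketch of that call, assuming model is the Keras model from the example script:

```python
import hls4ml

# Set the default precision at config-generation time instead of
# overriding ['Model']['Precision'] afterwards
hls_config_reg = hls4ml.utils.config_from_keras_model(
    model,
    granularity='name',
    default_precision='ap_fixed<48,24>',
)
```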

If you want to tune individual layers afterwards, it can be done as hls_config_reg['LayerName'][some_layer_name]['Precision'][some_variable] = ....
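For example (the layer name and precision key below are placeholders; print the config to see which keys exist for your model):

```python
# Inspect the generated config to find the layer names and precision variables
print(hls_config_reg['LayerName'].keys())

# 'gru' and 'weight' are hypothetical keys used for illustration only
hls_config_reg['LayerName']['gru']['Precision']['weight'] = 'ap_fixed<32,16>'
```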

Two more hints for hls4ml: to my understanding, the parallelization factor only applies to parallel CNNs, so it would make no difference here. Streaming I/O is also mainly useful for CNNs; for RNNs it is supported, but I think io_parallel is the preferred I/O type. I am not sure about the exact RTL differences, but I don't think streaming saves many resources for RNNs, so it is worth trying both.
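If you do want to compare the two I/O types, the choice is made when converting the model (a sketch; the output directories are placeholders and hls_config_reg is the config generated above):

```python
# io_parallel, the suggested default for RNNs
hls_model_parallel = hls4ml.convert_from_keras_model(
    model,
    hls_config=hls_config_reg,
    io_type='io_parallel',
    output_dir='rnn_prj_parallel',
)

# io_stream, for comparing resource usage
hls_model_stream = hls4ml.convert_from_keras_model(
    model,
    hls_config=hls_config_reg,
    io_type='io_stream',
    output_dir='rnn_prj_stream',
)
```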

olaiya commented 1 year ago

@bo3z

Thank you very much for your quick reply. Using hls4ml.utils.config_from_keras_model to set the precision fixed my problem. Also thanks for the RNN configuration suggestions.

Manny