fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Problem Synthesizing Regression CNN #495

Open olaiya opened 2 years ago

olaiya commented 2 years ago

Hello

I am struggling to get good performance when I use hls4ml to synthesize the CNN I use for regression. To demonstrate the problem, I took a reasonably simple CNN regression example from the web and tried to convert it with hls4ml, without much success: the MSE of the hls4ml model is far worse than that of the Keras model. For my Keras CNN, all I do is a little pruning, with no quantisation to keep it simple, and I still see problems.

You can see my code here:

https://github.com/olaiya/Bhousing

It should run straight out of the box. Any help would be very much appreciated.

cheers

Manny

thesps commented 2 years ago

Hi, is the problem really the synthesis, or getting good agreement between hls4ml and Keras so far?

For a model trained with floating point weights and activations, one needs to do some 'profiling' to choose appropriate fixed point types. The main tools are hls4ml.model.profiling.numerical and hls4ml.model.profiling.compare which produce some plots you can use to tune the configuration. Part 2 of the tutorial shows a bit of detail.

I'd also add that doing some quantisation of the model at training time can be helpful. If you quantise a weight or activation with QKeras, it's one less thing that you need to think about in the numerical profiling step, since hls4ml can pick up the data types from the model.

olaiya commented 2 years ago

Thanks for the reply. I believe the problem is synthesis. After synthesis, the hls4ml-generated model bears no resemblance to the Keras model. Hopefully running my example will demonstrate what I mean.

thesps commented 2 years ago

From your example (thanks for providing something easy to run), I made some small tweaks:

This pair of plots suggests that the default precision of ap_fixed<16,6> doesn't work throughout this model:

The left hand plot is the distribution of the output of each layer of the Keras model, while the grey box is the range of values that are representable by the precision config of the HLSModel. It's okay if the box and whisker isn't totally contained to the left (small values), but you'll see the type of bad matching you report if they don't overlap to the right (large values). You can more or less read off the exponent on the x axis to choose the number of integer bits.

The right hand plot is the distribution of the output of each layer of the HLSModel, evaluated with the fixed point precision. By definition the box and whiskers will be contained within the grey boxes, but you can see that the two plots don't match.
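To make the "read off the exponent" step concrete, here is a minimal sketch in plain Python (no hls4ml required). It assumes the usual signed `ap_fixed<W,I>` layout, where `W` is the total width and `I` counts the integer bits including the sign bit. The helper names `ap_fixed_range` and `min_integer_bits` are hypothetical, made up for this illustration.

```python
def ap_fixed_range(width, integer):
    """Return (min, max, step) representable by a signed ap_fixed<width, integer>."""
    frac = width - integer            # number of fractional bits
    step = 2.0 ** -frac               # smallest representable increment
    lo = -(2.0 ** (integer - 1))      # most negative value
    hi = 2.0 ** (integer - 1) - step  # most positive value
    return lo, hi, step


def min_integer_bits(max_abs):
    """Smallest integer-bit count (sign bit included) with 2**(I-1) > max_abs."""
    bits = 1
    while 2.0 ** (bits - 1) <= max_abs:
        bits += 1
    return bits


print(ap_fixed_range(16, 6))    # (-32.0, 31.9990234375, 0.0009765625)
print(ap_fixed_range(16, 10))   # (-512.0, 511.984375, 0.015625)
print(min_integer_bits(500.0))  # 10
```

So a layer whose outputs reach a few hundred needs about 10 integer bits, and at a fixed width of 16 that leaves only 6 fractional bits, which is the trade-off to keep in mind when widening the integer part.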

You can start to change settings like:

hls_config['LayerName']['conv1d_input']['Precision']['result'] = 'ap_fixed<16,10>'
hls_config['LayerName']['conv1d']['Precision']['result'] = 'ap_fixed<16,10>'

With that you get:

It's also worth noting that the input layer isn't included yet in these plots, so you'd want to change the precision of that (as shown just above) to match the scale of the dataset, since I think it also can't be contained in ap_fixed<16,6>.
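As a rough illustration of why an input that doesn't fit gives output with no resemblance to Keras, the sketch below emulates storing a value in `ap_fixed<W,I>`. It assumes the default `ap_fixed` modes (truncation and wrap-around on overflow); a type declared with other rounding or saturation modes would behave differently. The function `to_ap_fixed` and the sample values are hypothetical and not taken from the dataset in the repo.

```python
import math


def to_ap_fixed(x, width=16, integer=6):
    """Emulate storing x in a signed ap_fixed<width, integer> (truncate, wrap)."""
    frac = width - integer
    raw = math.floor(x * 2 ** frac)  # truncate toward minus infinity
    half = 2 ** (width - 1)
    raw = (raw + half) % 2 ** width - half  # two's-complement wrap-around
    return raw / 2 ** frac


print(to_ap_fixed(0.5, 16, 6))     # 0.5   (fits, unchanged)
print(to_ap_fixed(100.0, 16, 6))   # -28.0 (overflows and wraps)
print(to_ap_fixed(100.0, 16, 10))  # 100.0 (fits once there are 10 integer bits)
```

A wrapped value is not just slightly off, it can change sign, which is why an unscaled feature going into a too-narrow input type can derail every layer after it.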

Some other things you can do to make the model more likely to work with hls4ml's default precision settings:

olaiya commented 2 years ago

Thanks a lot for your quick replies. This is really helpful. I will try this.

olaiya commented 2 years ago

Sorry for the delay. I have only just been able to look at this. I have increased the bit widths of the variables describing my CNN. I must still be missing something, as my synthesized CNN is a poor representation of my TensorFlow CNN. I've updated my code to demonstrate the issue. You should be able to check out and run the following repo:

https://github.com/olaiya/Bhousing

Any help would be really appreciated.

cheers

Manny

olaiya commented 2 years ago

Hello

Any help with this issue would be really appreciated

cheers

Manny

vloncar commented 2 years ago

Hi Manny,

You were given a few more tips to try. Did you try them? I don't see them in the code you shared. Also, you are profiling the base model but converting a trained, pruned one. Bad idea.

Cheers, Vladimir

olaiya commented 2 years ago

Hi Vladimir

Thanks for the reply. I tried expanding the default bit precision to cover the range of the Keras model, as I thought the mismatch in range was my problem. The hls model still doesn't replicate the Keras model. I also thought that scaling the dataset, regularisation and quantisation helped the default precision encompass the range of the Keras model, and were not a requirement?

Thanks for pointing out the mistake in my example. I have removed all pruning. Now I just have a simple CNN for regression: I expand the weight and bias precision for my hls model and then try to generate an hls model based on the Keras model. I can't get decent accuracy from my hls model (basically there is no resemblance between the two). The version now on GitHub reflects these changes.

cheers

Manny