fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Save model weights in quantized fixed point representation instead of floating point #808

Open rfforelli opened 1 year ago

rfforelli commented 1 year ago

Details

Currently, hls4ml writes model weights to C++ header files that are included in the compiled HLS project. These weights are stored in floating-point representation. Without delving into the source code it is difficult to tell exactly, but the weights appear to be rounded to a number of decimal places equal to the word length of the post-training quantization (PTQ) fixed-point precisions selected in hls4ml (see below).

[fig1: generated weights header showing floating-point weight values rounded to the selected word length]
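For illustration, the current output looks roughly like the sketch below; the type name, array name, and values are hypothetical, and the exact layout emitted by the Vivado writer may differ.

```cpp
// Illustrative only: type/array names and values are hypothetical, and the
// real layout emitted by the Vivado writer may differ.
#include "ap_fixed.h"

typedef ap_fixed<16, 6> weight2_t;  // precision selected via the hls4ml config

// Current behaviour: weights written as floating-point literals, rounded to
// a number of decimal places related to the word length.
weight2_t w2[4] = {0.4321098765432109, -1.2500000000000000,
                   0.0617283950617284,  2.1035156250000000};
```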

We can run a C simulation (csim) to save the "true" quantized model parameters (see below, upper right). In our work with NI FPGA hardware programmed via LabVIEW, we encountered issues where LabVIEW coerces the floating-point weights (see below, left) to fixed-point decimal representations (see below, bottom right) that differ from those produced by Vivado HLS. When we load these LabVIEW-parsed weights, the model outputs are incorrect compared to csim. Loading the quantized fixed-point weights produced by Vivado HLS instead yields correct predictions.

[fig3: floating-point weights (left), csim-quantized weights (upper right), and LabVIEW-coerced fixed-point weights (bottom right)]
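A minimal csim-style sketch of this check, assuming a hypothetical ap_fixed<16,6> precision and made-up weight values: casting the floating-point weights through the target ap_fixed type prints the values Vivado HLS actually uses.

```cpp
// Hypothetical csim check: push the floating-point weights through the target
// ap_fixed type and print the quantized values Vivado HLS will actually use.
#include <cstdio>
#include "ap_fixed.h"

typedef ap_fixed<16, 6> weight_t;  // assumed PTQ precision

int main() {
    const double w_float[4] = {0.4321098765432109, -1.2500000000000000,
                               0.0617283950617284,  2.1035156250000000};
    for (int i = 0; i < 4; i++) {
        weight_t w_fixed = w_float[i];  // quantization to 16-bit fixed point happens here
        std::printf("w[%d] = %.10f\n", i, w_fixed.to_double());
    }
    return 0;
}
```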

New behavior

Saving the model weights in their quantized decimal or integer representation, instead of floating point, would benefit future external-weight functionality and anyone who wants to view the "true" quantized weights associated with their model.
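A minimal sketch of what the proposed output could look like, again assuming a hypothetical ap_fixed<16,6> precision; the names and values are illustrative, not actual writer output.

```cpp
// Sketch of the proposed header contents (names/values are illustrative):
// emit either the exact quantized decimal values or the raw integer words.
#include "ap_fixed.h"
#include "ap_int.h"

typedef ap_fixed<16, 6> weight2_t;

// Option A: quantized decimals, exactly representable in weight2_t
weight2_t w2[4] = {0.4375, -1.25, 0.0625, 2.103515625};

// Option B: raw 16-bit words holding the same fixed-point bit patterns
const ap_uint<16> w2_raw[4] = {0x01C0, 0xFB00, 0x0040, 0x086A};
```

Either form would let external tools such as LabVIEW load bit-exact weights without having to reproduce Vivado HLS's fixed-point coercion.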

Motivation

As hls4ml expands in reach and functionality, more compatibility issues like this one between Vivado HLS and other platforms such as LabVIEW may arise. Writing out the true quantized weights could prevent many of them.

Parts of hls4ml being affected

Vivado writer.