fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml

KeyError when calling profiling.compare #736

Open JochiSt opened 1 year ago

JochiSt commented 1 year ago


Quick summary

When I call hls4ml.model.profiling.compare(model, hls_model, y_test) with the norm_diff plot option on a very simple fully connected network, I get the following error:

Traceback (most recent call last):
  File "traceHLSmodel.py", line 75, in <module>
    compareHLSmodel(model)
  File "traceHLSmodel.py", line 60, in compareHLSmodel
    compare_fig = hls4ml.model.profiling.compare(model, hls_model,
  File "/home/fpga_ai/venvs/HLS4ML3.8/src/hls4ml/hls4ml/model/profiling.py", line 690, in compare
    f = _norm_diff(ymodel, ysim)
  File "/home/fpga_ai/venvs/HLS4ML3.8/src/hls4ml/hls4ml/model/profiling.py", line 601, in _norm_diff
    diff[key] = np.linalg.norm(ysim[key]-ymodel[key])
KeyError: 'layer_0_relu'

I get a similar error when using the dist_diff plot option.
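
For diagnosis, the two per-layer output dictionaries that compare diffs internally can be dumped directly. This is a sketch using hls4ml's profiling helpers, not part of the original report; it assumes model, hls_model, and y_test are set up as in the conversion sketch further below, with tracing enabled:

    import numpy as np
    import hls4ml

    # {layer_name: output array} for the Keras model and the HLS model
    ymodel = hls4ml.model.profiling.get_ymodel_keras(model, y_test)
    _, ysim = hls_model.trace(np.ascontiguousarray(y_test))

    # The KeyError means a name exists in one trace but not the other,
    # e.g. 'layer_0_relu' on one side vs 'layer_0_function' on the other
    print("only in Keras trace:", sorted(set(ymodel) - set(ysim)))
    print("only in HLS trace:  ", sorted(set(ysim) - set(ymodel)))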

Details

The network I used:

    from tensorflow import keras

    inputs = keras.Input(shape=(120,), name="waveform_input")

    layer_cnt = 0
    x = keras.layers.Dense(8,
                           activation="relu",
                           # kernel_regularizer=keras.regularizers.l1(0.00001),
                           name="layer_%d" % layer_cnt)(inputs)
    layer_cnt += 1

    x = keras.layers.Dense(8,
                           activation="relu",
                           # kernel_regularizer=keras.regularizers.l1(0.00001),
                           name="layer_%d" % layer_cnt)(x)
    layer_cnt += 1

    x = keras.layers.Dense(6,
                           activation="relu",
                           # kernel_regularizer=keras.regularizers.l1(0.00001),
                           name="layer_%d" % layer_cnt)(x)
    layer_cnt += 1

    # final layer for regression
    outputs = keras.layers.Dense(2, name="regression")(x)

    # build the functional model (added so the snippet is runnable;
    # the issue text implies the model was built and trained)
    model = keras.Model(inputs=inputs, outputs=outputs)

I trained the network, saved it to disk, reloaded it, and passed it into the compare function to see where the differences between the HLS and the Keras model are.

Steps to Reproduce


  1. Clone the hls4ml repository.
  2. Check out the master branch at commit hash a4b0e0c34a84d252559cac4a9f2f98e699964674.
  3. Run the conversion on a model built with the code above (a sketch of the conversion and compare calls follows this list).

  System setup:
    GCC 9.3.1 20200408 (Red Hat 9.3.1-2)
    NumPy 1.22.3
    TensorFlow 2.8.0
    Keras 2.8.0
    QKeras 0.9.0
    hls4ml 0.6.0.dev217+ga4b0e0c
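
The conversion and comparison were done along these lines. This is a sketch rather than the exact script: the model path and output directory are illustrative, and passing y_test as the input data mirrors the call in the traceback above:

    import hls4ml
    from tensorflow import keras

    model = keras.models.load_model('model.h5')  # illustrative path

    # Per-layer ('name') granularity with tracing enabled, so that
    # per-layer outputs can be captured for profiling
    config = hls4ml.utils.config_from_keras_model(model, granularity='name')
    for layer in config['LayerName']:
        config['LayerName'][layer]['Trace'] = True

    hls_model = hls4ml.converters.convert_from_keras_model(
        model, hls_config=config, output_dir='hls_prj')
    hls_model.compile()

    compare_fig = hls4ml.model.profiling.compare(
        model, hls_model, y_test, plot_type='norm_diff')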

Expected behavior

I would expect a plot of the per-layer differences between the Keras and the HLS model.

Actual behavior

The call throws a Python KeyError instead.

Optional

Possible fix

It seems that the Keras layer name for the activation part does not always end with _relu. For my regression network, the last layer ends with _linear and the others end with _function. But I have no clue why my layer names differ from the ones assumed in hls4ml.
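
Until the naming mismatch is fixed, a defensive variant of _norm_diff that only compares layer names present in both traces would at least avoid the crash. This is a minimal sketch, not the hls4ml implementation; the name _norm_diff_safe is made up:

    import numpy as np

    def _norm_diff_safe(ymodel, ysim):
        """Sketch of a tolerant _norm_diff: compare only layers whose
        names appear in both the Keras trace (ymodel) and the HLS
        trace (ysim)."""
        diff = {}
        for key in sorted(set(ymodel) & set(ysim)):
            diff[key] = np.linalg.norm(ysim[key] - ymodel[key])
        # report layers skipped because of a naming mismatch,
        # e.g. 'layer_0_relu' vs 'layer_0_function'
        skipped = sorted(set(ymodel) ^ set(ysim))
        if skipped:
            print("Skipped layers with mismatched names:", skipped)
        return diff

The proper fix is presumably to make both traces use the same naming scheme for the split-off activation layers, but that requires knowing where the _relu/_function/_linear suffixes are generated.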