fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

`transpose_3d` for `hls_stream` missing #712

Open AnouarITI opened 1 year ago

AnouarITI commented 1 year ago

I have the following model:

import tensorflow as tf
from tensorflow.keras import models
from tensorflow.keras.layers import (Input, Permute, Conv2D, ReLU,
                                     AveragePooling2D, Flatten, Dense, Softmax)

x_in = Input(shape=(1024,1,2))
x = Permute((3,1,2))(x_in)

x = Conv2D(8 , (7,1), padding='same', name='C1')(x)
x = ReLU(name='C1_relu')(x)
x = Conv2D(16, (7,1), padding='same', name='C2')(x)
x = ReLU(name='C2_relu')(x)
x = Conv2D(32, (7,1), padding='same', name='C3')(x)
x = ReLU(name='C3_relu')(x)
x = Conv2D(64, (7,1), padding='same', name='C4')(x)
x = ReLU(name='C4_relu')(x)

x = AveragePooling2D((1,256), (1,256), padding='same', name='AVG_pool_1')(x)
x = AveragePooling2D((1,4), (1,4), padding='same', name='AVG_pool_2')(x)

x = Flatten()(x)

x = Dense(256, name='D1')(x)
x = ReLU(name='D1_relu')(x)
x = Dense(22, name='D2')(x)
x_out = Softmax(name='D2_softmax')(x)

model = models.Model(x_in, x_out)

And I am trying to convert it to an hls_model. Everything went fine until I hit an error which I think is caused by the Permute layer. Here is the rest of the code to replicate the behavior:

LOSS        = tf.keras.losses.CategoricalCrossentropy()
OPTIMIZER   = tf.keras.optimizers.Adam(learning_rate=1E-3)

model.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=["accuracy"])
import hls4ml

hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(layers=['Activation'])
hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(rounding_mode='AP_RND')
hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(saturation_mode='AP_SAT')

hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')

Precision = 'ap_fixed<16,6>'
Reuse_Factor = 1

hls_config['Model']['Precision'] = Precision
hls_config['Model']['ReuseFactor'] = Reuse_Factor

for Layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][Layer]['Precision'] = Precision
    hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
    hls_config['LayerName'][Layer]['ReuseFactor'] = Reuse_Factor

hls_config['LayerName']['D2_softmax']['Strategy'] = 'Stable'

cfg = hls4ml.converters.create_config(backend='Vivado')
cfg['IOType']     = 'io_stream' # Must set this if using CNNs!
cfg['HLSConfig']  = hls_config
cfg['KerasModel'] = model
cfg['OutputDir']  = 'model_16b_rf1/'
cfg['XilinxPart'] = 'xcvu9p-flga2104-2L-e'

hls_model = hls4ml.converters.keras_to_hls(cfg)
hls_model.compile()

And this is the error:

firmware/myproject.cpp: In function ‘void myproject(hls::stream<nnet::array<ap_fixed<16, 6>, 2> >&, hls::stream<nnet::array<ap_fixed<16, 6>, 22> >&)’:
firmware/myproject.cpp:61:52: error: cannot convert ‘hls::stream<nnet::array<ap_fixed<16, 6>, 2> >’ to ‘nnet::array<ap_fixed<16, 6>, 2>*’
   61 |     nnet::transpose_3d<input_t, layer2_t, config2>(input_1, layer2_out); // permute
      |                                                    ^~~~~~~
      |                                                    |
      |                                                    hls::stream<nnet::array<ap_fixed<16, 6>, 2> >
In file included from firmware/parameters.h:12,
                 from firmware/myproject.cpp:22:
firmware/nnet_utils/nnet_array.h:31:12: note:   initializing argument 1 of ‘void nnet::transpose_3d(data_T*, res_T*) [with data_T = nnet::array<ap_fixed<16, 6>, 2>; res_T = nnet::array<ap_fixed<16, 6>, 1>; CONFIG_T = config2]’
   31 |     data_T data[CONFIG_T::depth * CONFIG_T::height * CONFIG_T::width],
      |     ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
firmware/nnet_utils/nnet_array.h: In instantiation of ‘void nnet::transpose_3d(data_T*, res_T*) [with data_T = nnet::array<ap_fixed<16, 6>, 2>; res_T = nnet::array<ap_fixed<16, 6>, 1>; CONFIG_T = config2]’:
firmware/myproject.cpp:61:71:   required from here
firmware/nnet_utils/nnet_array.h:48:92: error: no match for ‘operator=’ (operand types are ‘nnet::array<ap_fixed<16, 6>, 1>’ and ‘nnet::array<ap_fixed<16, 6>, 2>’)
   48 |                 data_t[idx_t[0] * dims_t[1] * dims_t[2] + idx_t[1] * dims_t[2] + idx_t[2]] = data[idx[0] * dims[1] * dims[2] + idx[1] * dims[2] + idx[2]];
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
In file included from firmware/defines.h:6,
                 from firmware/myproject.h:27,
                 from firmware/myproject.cpp:21:
firmware/nnet_utils/nnet_types.h:26:12: note: candidate: ‘nnet::array<T, N>& nnet::array<T, N>::operator=(const nnet::array<T, N>&) [with T = ap_fixed<16, 6>; unsigned int N = 1]’
   26 |     array& operator=(const array &other) {
      |            ^~~~~~~~
firmware/nnet_utils/nnet_types.h:26:35: note:   no known conversion for argument 1 from ‘nnet::array<ap_fixed<16, 6>, 2>’ to ‘const nnet::array<ap_fixed<16, 6>, 1>&’
   26 |     array& operator=(const array &other) {
      |                      ~~~~~~~~~~~~~^~~~~
g++: error: myproject.o: No such file or directory

I thought the Permute layer was already supported by hls4ml, or am I wrong? How can I fix this error?

jmduarte commented 1 year ago

Hi @AnouarITI, a couple of questions just to better understand the setup:

  • Can you explain what the input shape (1024, 1, 2) signifies?
  • The shape after the Permute layer is then (2, 1024, 1). Does this mean you have a 2x1024 image with 1 channel?

Regardless, Permute may require nonnegligible hardware resources to implement, so we would definitely recommend doing this manipulation as data preprocessing instead (if possible).

AnouarITI commented 1 year ago

The input shape (1024, 1, 2) represents a radio signal (I and Q components).

Thank you for the answer. I will keep the pre-processing out of the model.

jmduarte commented 1 year ago

@AnouarITI even though I don't think you necessarily need it, I think the issue is that we currently don't have an implementation of transpose_3d for hls::stream inputs like we do for standard arrays.

I'll leave this issue open for now, in case others need it (and want to submit a pull request to add it).
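For reference, here is a rough sketch of what an hls::stream overload of nnet::transpose_3d might look like, modeled on the pointer version in nnet_utils/nnet_array.h. The CONFIG_T fields (depth, height, width, perm) are borrowed from the transpose config in that file; the sketch further assumes nnet::array exposes its usual value_type and size members and that the total element count divides evenly into both stream word sizes. It is untested and not part of hls4ml.

#include "hls_stream.h"
#include "nnet_types.h"

namespace nnet {

// Sketch only: buffers the whole tensor, since a transpose cannot be
// computed on the fly from a stream.
template <class data_T, class res_T, typename CONFIG_T>
void transpose_3d(hls::stream<data_T> &data, hls::stream<res_T> &res) {
    unsigned dims[3] = {CONFIG_T::depth, CONFIG_T::height, CONFIG_T::width};
    unsigned dims_t[3];
    dims_t[0] = dims[CONFIG_T::perm[0]];
    dims_t[1] = dims[CONFIG_T::perm[1]];
    dims_t[2] = dims[CONFIG_T::perm[2]];

    typename data_T::value_type buffer[CONFIG_T::depth * CONFIG_T::height * CONFIG_T::width];

    // Unpack the incoming stream words into a flat row-major buffer.
ReadInput:
    for (unsigned i = 0; i < CONFIG_T::depth * CONFIG_T::height * CONFIG_T::width / data_T::size; i++) {
        #pragma HLS PIPELINE
        data_T in_pack = data.read();
        for (unsigned j = 0; j < data_T::size; j++) {
            #pragma HLS UNROLL
            buffer[i * data_T::size + j] = in_pack[j];
        }
    }

    // Walk the output tensor in order, fetch the matching input element,
    // and repack res_T::size elements per output stream word.
    res_T out_pack;
    unsigned n = 0;
WriteOutput:
    for (unsigned i0 = 0; i0 < dims_t[0]; i0++) {
        for (unsigned i1 = 0; i1 < dims_t[1]; i1++) {
            for (unsigned i2 = 0; i2 < dims_t[2]; i2++) {
                #pragma HLS PIPELINE
                // Output index (i0, i1, i2) reads input dim perm[k] at i_k,
                // matching the idx_t/dims_t convention of the array version.
                unsigned idx[3];
                idx[CONFIG_T::perm[0]] = i0;
                idx[CONFIG_T::perm[1]] = i1;
                idx[CONFIG_T::perm[2]] = i2;
                out_pack[n++] = buffer[idx[0] * dims[1] * dims[2] + idx[1] * dims[2] + idx[2]];
                if (n == res_T::size) {
                    res.write(out_pack);
                    n = 0;
                }
            }
        }
    }
}

} // namespace nnet

The repacking in the second loop is the part the pointer version does not need: as the error messages above show, the input and output streams can pack a different number of elements per word (2 in and 1 out in the original report), so output words have to be rebuilt element by element.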

sei-jgwohlbier commented 4 months ago

Hi, I hit this error, as seen below, when trying to convert a network with only a Conv2d coming from PyTorch, as a first step toward a larger PyTorch model. Do you think it is worth trying to work up a PR?

Output layers:  ['Transpose_0']
Input shape: [1024, 1, 2]
Topology:
Layer name: Conv_0, layer type: Conv, current shape: [[1, 1024, 1, 2], [16, 3, 1, 2]]
Layer name: Transpose_0, layer type: Transpose, current shape: [[1, 1024, 1, 16]]
Interpreting Model ...
Output layers:  ['Transpose_0']
Input shape: [1024, 1, 2]
Topology:
Layer name: Conv_0, layer type: Conv, current shape: [[1, 1024, 1, 2], [16, 3, 1, 2]]
Layer name: Transpose_0, layer type: Transpose, current shape: [[1, 1024, 1, 16]]
Creating HLS model
WARNING: Layer Conv2D_Conv_0 requires "dataflow" pipeline style. Switching to "dataflow" pipeline style.
Writing HLS project
Done
firmware/myproject.cpp: In function ‘void myproject(hls::stream<nnet::array<ap_fixed<32, 6>, 2> >&, hls::stream<nnet::array<ap_fixed<68, 16>, 1> >&)’:
firmware/myproject.cpp:35:89: error: no matching function for call to ‘transpose_3d<Conv2D_Conv_0_result_t, result_t, config4>(hls::stream<nnet::array<ap_fixed<68, 16>, 16> >&, hls::stream<nnet::array<ap_fixed<68, 16>, 1> >&)’
 3d<Conv2D_Conv_0_result_t, result_t, config4>(layer5_out, layer4_out); // Transpose_0
                                                                     ^
In file included from firmware/parameters.h:10:0,
                 from firmware/myproject.cpp:4:
firmware/nnet_utils/nnet_array.h:27:6: note: candidate: template<class data_T, class res_T, class CONFIG_T> void nnet::transpose_3d(data_T*, res_T*)
 void transpose_3d(data_T data[CONFIG_T::depth * CONFIG_T::height * CONFIG_T::width],
      ^~~~~~~~~~~~
firmware/nnet_utils/nnet_array.h:27:6: note:   template argument deduction/substitution failed:
firmware/myproject.cpp:35:89: note:   cannot convert ‘layer5_out’ (type ‘hls::stream<nnet::array<ap_fixed<68, 16>, 16> >’) to type ‘nnet::array<ap_fixed<68, 16>, 16>*’
 3d<Conv2D_Conv_0_result_t, result_t, config4>(layer5_out, layer4_out); // Transpose_0
                                                                     ^
g++: error: myproject.o: No such file or directory
Traceback (most recent call last):
  File "/home/hls4ml-user/work/ewstapp_research/isolate/NETWORK/test_hls4ml.py", line 49, in <module>
    hls_model.compile()
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/graph.py", line 678, in compile
    self._compile()
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/site-packages/hls4ml/model/graph.py", line 697, in _compile
    self._top_function_lib = ctypes.cdll.LoadLibrary(lib_name)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/home/hls4ml-user/miniconda3/envs/hls4ml/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: ./hls4mlprj_qonnx_Vitis/firmware/myproject-bdB67738.so: cannot open shared object file: No such file or directory