fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Issue with predict and HLS Compilation #530

Closed LordScarface closed 1 year ago

LordScarface commented 2 years ago

Hi,

I'm trying to implement this model using hls4ml. The model conversion seems to work, but when I call model.predict() with the test data I get:

python3: firmware/nnet_utils/nnet_conv2d_stream.h:73: void nnet::conv_2d_buffer_cl(hls::stream<srcType>&, 
hls::stream<dstType>&, typename CONFIG_T::weight_t*, typename CONFIG_T::bias_t*) 
[with data_T = nnet::array<ap_fixed<16, 6>, 2>; res_T = nnet::array<ap_fixed<16, 6>, 32>; CONFIG_T = config2; typename CONFIG_T::weight_t = ap_fixed<16, 6>; 
typename CONFIG_T::bias_t = ap_fixed<16, 6>]: 
Assertion `CONFIG_T::pad_top == 0 && CONFIG_T::pad_bottom == 0 && CONFIG_T::pad_left == 0 && CONFIG_T::pad_right == 0' failed.
Aborted (core dumped)
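
For reference, one way to see which converted layer carries the non-zero padding this assertion checks for is to dump the pad attributes from the hls4ml model graph (using the hls_model object from the conversion script below). This is only a diagnostic sketch; the attribute names are assumed from the hls4ml Conv2D converter rather than confirmed:

```python
# Diagnostic sketch (attribute names are assumptions): list the padding
# attributes of the converted layers, since the io_stream conv2d kernel
# asserts that all pad_* values are zero.
for layer in hls_model.get_layers():
    pads = {k: layer.get_attr(k) for k in ('pad_top', 'pad_bottom', 'pad_left', 'pad_right')}
    if any(pads.values()):
        print(layer.name, layer.class_name, pads)
```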

When I open the generated HLS project it looks fine, and with backend='VivadoAccelerator' the HLS compilation actually completes without any errors. However, it only takes a minute, resource usage is around 0%, and the summary shows ? for latency and interval. I assume maybe only the myproject_axi() wrapper was processed?

When I change the backend to backend='Vivado' and run the HLS compilation, it stops with the error:

ERROR: [HLS 200-474] Empty dataflow region in myproject (it may have been optimized away due to the absence of outputs)
ERROR: [HLS 200-70] Failed building synthesis data model.
command 'ap_source' returned error code
    while executing
"source /home/lukas/Documents/HiWi/tmpp/myproject_prj/solution1/csynth.tcl"
    invoked from within
"hls::main /home/lukas/Documents/HiWi/tmpp/myproject_prj/solution1/csynth.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 hls::main {*}$args"
    (procedure "hls_proc" line 5)
    invoked from within
"hls_proc $argv"
Finished C synthesis.

So I assume there is something wrong with the code generated by hls4ml?

I'm using version 0.6.0 of hls4ml, but I get the same issue with the master branch and Version 0.5.0.

I hope someone can help me out or point me in the right direction. Thank you in advance, best regards.

My code used for generating the HLS project:

```python
from yaml import load_all
import tensorflow as tf
from tensorflow import keras
from tensorflow import Tensor

print("Tensorflow version is ", tf.__version__)
print('Keras version : ', keras.__version__)

import numpy as np
import os, sys
from tensorflow.keras.models import Model, load_model
import hls4ml
from qkeras.utils import _add_supported_quantized_objects
from sklearn.metrics import accuracy_score

co = {}
_add_supported_quantized_objects(co)
model = load_model('./fp_model/resnet_fp_model.h5', custom_objects=co)

with open('./Dataset/X_train.npy', 'rb') as f:
    X_train = np.load(f)
with open('./Dataset/X_test.npy', 'rb') as f:
    X_test = np.load(f)
with open('./Dataset/Y_test.npy', 'rb') as f:
    Y_test = np.load(f)
with open('./Dataset/Y_train.npy', 'rb') as f:
    Y_train = np.load(f)

hls4ml.model.optimizer.OutputRoundingSaturationMode.layers = ['Activation']
hls4ml.model.optimizer.OutputRoundingSaturationMode.rounding_mode = 'AP_RND'
hls4ml.model.optimizer.OutputRoundingSaturationMode.saturation_mode = 'AP_SAT'

hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')
hls_config['Model']['Precision'] = 'ap_fixed<16,6>'
hls_config['Model']['ReuseFactor'] = 1

for Layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
    hls_config['LayerName'][Layer]['ReuseFactor'] = 1
# Use the stable softmax implementation for best numerical performance on
# high-accuracy models; the default Latency implementation is faster but
# numerically less stable.
hls_config['LayerName']['softmax']['Strategy'] = 'Stable'

hls_model = hls4ml.converters.convert_from_keras_model(model,
                                                       hls_config=hls_config,
                                                       io_type='io_stream',
                                                       backend='Vivado',
                                                       output_dir='tmpp/',
                                                       part='xczu7ev-ffvc1156-2-e')
hls_model.compile()

os.environ['PATH'] = '/tools/Xilinx/Vivado/2019.1/bin:' + os.environ['PATH']
hls_model.build(csim=False, synth=False, vsynth=False)

Y_pred = hls_model.predict(np.ascontiguousarray(X_test))
y_pred = np.argmax(Y_pred, axis=1)
y_actual = np.argmax(Y_test, axis=1)
accuracy = accuracy_score(y_actual, y_pred)
print("Accuracy: ", accuracy)
```
The output generated ``` Tensorflow version is 2.8.0 Keras version : 2.8.0 Interpreting Model Topology: Layer name: rf_input, layer type: Input Layer name: conv2d, layer type: Conv2D -> Activation (linear), layer name: conv2d Layer name: batch_normalization, layer type: BatchNormalization Layer name: activation, layer type: Activation Layer name: conv2d_1, layer type: Conv2D -> Activation (linear), layer name: conv2d_1 Layer name: batch_normalization_1, layer type: BatchNormalization Layer name: activation_1, layer type: Activation Layer name: conv2d_2, layer type: Conv2D -> Activation (linear), layer name: conv2d_2 Layer name: batch_normalization_2, layer type: BatchNormalization Layer name: activation_2, layer type: Activation Layer name: conv2d_3, layer type: Conv2D -> Activation (linear), layer name: conv2d_3 Layer name: batch_normalization_3, layer type: BatchNormalization Layer name: add, layer type: Add Layer name: activation_3, layer type: Activation Layer name: conv2d_4, layer type: Conv2D -> Activation (linear), layer name: conv2d_4 Layer name: batch_normalization_4, layer type: BatchNormalization Layer name: activation_4, layer type: Activation Layer name: conv2d_5, layer type: Conv2D -> Activation (linear), layer name: conv2d_5 Layer name: batch_normalization_5, layer type: BatchNormalization Layer name: add_1, layer type: Add Layer name: activation_5, layer type: Activation Layer name: max_pooling2d, layer type: MaxPooling2D Layer name: conv2d_6, layer type: Conv2D -> Activation (linear), layer name: conv2d_6 Layer name: batch_normalization_6, layer type: BatchNormalization Layer name: activation_6, layer type: Activation Layer name: conv2d_7, layer type: Conv2D -> Activation (linear), layer name: conv2d_7 Layer name: batch_normalization_7, layer type: BatchNormalization Layer name: activation_7, layer type: Activation Layer name: conv2d_8, layer type: Conv2D -> Activation (linear), layer name: conv2d_8 Layer name: batch_normalization_8, layer type: BatchNormalization Layer name: add_2, layer type: Add Layer name: activation_8, layer type: Activation Layer name: conv2d_9, layer type: Conv2D -> Activation (linear), layer name: conv2d_9 Layer name: batch_normalization_9, layer type: BatchNormalization Layer name: activation_9, layer type: Activation Layer name: conv2d_10, layer type: Conv2D -> Activation (linear), layer name: conv2d_10 Layer name: batch_normalization_10, layer type: BatchNormalization Layer name: add_3, layer type: Add Layer name: activation_10, layer type: Activation Layer name: max_pooling2d_1, layer type: MaxPooling2D Layer name: conv2d_11, layer type: Conv2D -> Activation (linear), layer name: conv2d_11 Layer name: batch_normalization_11, layer type: BatchNormalization Layer name: activation_11, layer type: Activation Layer name: conv2d_12, layer type: Conv2D -> Activation (linear), layer name: conv2d_12 Layer name: batch_normalization_12, layer type: BatchNormalization Layer name: activation_12, layer type: Activation Layer name: conv2d_13, layer type: Conv2D -> Activation (linear), layer name: conv2d_13 Layer name: batch_normalization_13, layer type: BatchNormalization Layer name: add_4, layer type: Add Layer name: activation_13, layer type: Activation Layer name: conv2d_14, layer type: Conv2D -> Activation (linear), layer name: conv2d_14 Layer name: batch_normalization_14, layer type: BatchNormalization Layer name: activation_14, layer type: Activation Layer name: conv2d_15, layer type: Conv2D -> Activation (linear), layer name: conv2d_15 Layer name: 
batch_normalization_15, layer type: BatchNormalization Layer name: add_5, layer type: Add Layer name: activation_15, layer type: Activation Layer name: max_pooling2d_2, layer type: MaxPooling2D Layer name: conv2d_16, layer type: Conv2D -> Activation (linear), layer name: conv2d_16 Layer name: batch_normalization_16, layer type: BatchNormalization Layer name: activation_16, layer type: Activation Layer name: conv2d_17, layer type: Conv2D -> Activation (linear), layer name: conv2d_17 Layer name: batch_normalization_17, layer type: BatchNormalization Layer name: activation_17, layer type: Activation Layer name: conv2d_18, layer type: Conv2D -> Activation (linear), layer name: conv2d_18 Layer name: batch_normalization_18, layer type: BatchNormalization Layer name: add_6, layer type: Add Layer name: activation_18, layer type: Activation Layer name: conv2d_19, layer type: Conv2D -> Activation (linear), layer name: conv2d_19 Layer name: batch_normalization_19, layer type: BatchNormalization Layer name: activation_19, layer type: Activation Layer name: conv2d_20, layer type: Conv2D -> Activation (linear), layer name: conv2d_20 Layer name: batch_normalization_20, layer type: BatchNormalization Layer name: add_7, layer type: Add Layer name: activation_20, layer type: Activation Layer name: max_pooling2d_3, layer type: MaxPooling2D Layer name: conv2d_21, layer type: Conv2D -> Activation (linear), layer name: conv2d_21 Layer name: batch_normalization_21, layer type: BatchNormalization Layer name: activation_21, layer type: Activation Layer name: global_average_pooling2d, layer type: GlobalAveragePooling2D Layer name: dense, layer type: Dense -> Activation (relu), layer name: dense Layer name: dense_1, layer type: Dense -> Activation (relu), layer name: dense_1 Layer name: dense_2, layer type: Dense -> Activation (linear), layer name: dense_2 Layer name: softmax, layer type: Activation Interpreting Model Topology: Layer name: rf_input, layer type: InputLayer, input shapes: [[None, 1024, 1, 2]], output shape: [None, 1024, 1, 2] Layer name: conv2d, layer type: Conv2D, input shapes: [[None, 1024, 1, 2]], output shape: [None, 1024, 1, 32] Layer name: batch_normalization, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: activation, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: conv2d_1, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: batch_normalization_1, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: activation_1, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: conv2d_2, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: batch_normalization_2, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: activation_2, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: conv2d_3, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: batch_normalization_3, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: add, layer type: Merge, input shapes: [[None, 1024, 1, 32], [None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: 
activation_3, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: conv2d_4, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: batch_normalization_4, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: activation_4, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: conv2d_5, layer type: Conv2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: batch_normalization_5, layer type: BatchNormalization, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: add_1, layer type: Merge, input shapes: [[None, 1024, 1, 32], [None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: activation_5, layer type: Activation, input shapes: [[None, 1024, 1, 32]], output shape: [None, 1024, 1, 32] Layer name: max_pooling2d, layer type: MaxPooling2D, input shapes: [[None, 1024, 1, 32]], output shape: [None, 512, 1, 32] Layer name: conv2d_6, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: batch_normalization_6, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: activation_6, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: conv2d_7, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: batch_normalization_7, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: activation_7, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: conv2d_8, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: batch_normalization_8, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: add_2, layer type: Merge, input shapes: [[None, 512, 1, 32], [None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: activation_8, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: conv2d_9, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: batch_normalization_9, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: activation_9, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: conv2d_10, layer type: Conv2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: batch_normalization_10, layer type: BatchNormalization, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: add_3, layer type: Merge, input shapes: [[None, 512, 1, 32], [None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: activation_10, layer type: Activation, input shapes: [[None, 512, 1, 32]], output shape: [None, 512, 1, 32] Layer name: max_pooling2d_1, layer type: MaxPooling2D, input shapes: [[None, 512, 1, 32]], output shape: [None, 256, 1, 32] Layer name: conv2d_11, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: batch_normalization_11, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 
256, 1, 32] Layer name: activation_11, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: conv2d_12, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: batch_normalization_12, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: activation_12, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: conv2d_13, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: batch_normalization_13, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: add_4, layer type: Merge, input shapes: [[None, 256, 1, 32], [None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: activation_13, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: conv2d_14, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: batch_normalization_14, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: activation_14, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: conv2d_15, layer type: Conv2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: batch_normalization_15, layer type: BatchNormalization, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: add_5, layer type: Merge, input shapes: [[None, 256, 1, 32], [None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: activation_15, layer type: Activation, input shapes: [[None, 256, 1, 32]], output shape: [None, 256, 1, 32] Layer name: max_pooling2d_2, layer type: MaxPooling2D, input shapes: [[None, 256, 1, 32]], output shape: [None, 128, 1, 32] Layer name: conv2d_16, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: batch_normalization_16, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: activation_16, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: conv2d_17, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: batch_normalization_17, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: activation_17, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: conv2d_18, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: batch_normalization_18, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: add_6, layer type: Merge, input shapes: [[None, 128, 1, 32], [None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: activation_18, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: conv2d_19, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: batch_normalization_19, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: activation_19, layer type: Activation, input shapes: [[None, 128, 1, 
32]], output shape: [None, 128, 1, 32] Layer name: conv2d_20, layer type: Conv2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: batch_normalization_20, layer type: BatchNormalization, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: add_7, layer type: Merge, input shapes: [[None, 128, 1, 32], [None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: activation_20, layer type: Activation, input shapes: [[None, 128, 1, 32]], output shape: [None, 128, 1, 32] Layer name: max_pooling2d_3, layer type: MaxPooling2D, input shapes: [[None, 128, 1, 32]], output shape: [None, 64, 1, 32] Layer name: conv2d_21, layer type: Conv2D, input shapes: [[None, 64, 1, 32]], output shape: [None, 64, 1, 32] Layer name: batch_normalization_21, layer type: BatchNormalization, input shapes: [[None, 64, 1, 32]], output shape: [None, 64, 1, 32] Layer name: activation_21, layer type: Activation, input shapes: [[None, 64, 1, 32]], output shape: [None, 64, 1, 32] Layer name: reshape, layer type: Reshape, input shapes: [[None, 64, 1, 32]], output shape: [None, 8, 8, 32] Layer name: global_average_pooling2d, layer type: GlobalAveragePooling2D, input shapes: [[None, 8, 8, 32]], output shape: [None, 32] Layer name: dense, layer type: Dense, input shapes: [[None, 32]], output shape: [None, 256] Layer name: dense_1, layer type: Dense, input shapes: [[None, 256]], output shape: [None, 128] Layer name: dense_2, layer type: Dense, input shapes: [[None, 128]], output shape: [None, 22] Layer name: softmax, layer type: Softmax, input shapes: [[None, 22]], output shape: [None, 22] Creating HLS model WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_1" (PointwiseConv2D) WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_1" (PointwiseConv2D) WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_1" (PointwiseConv2D) WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_6" (PointwiseConv2D) WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_6" (PointwiseConv2D) WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_6" (PointwiseConv2D) WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_11" (PointwiseConv2D) WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_11" (PointwiseConv2D) WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_11" (PointwiseConv2D) WARNING: Config parameter "Precision" overwrites an existing attribute in layer "conv2d_16" (PointwiseConv2D) WARNING: Config parameter "ReuseFactor" overwrites an existing attribute in layer "conv2d_16" (PointwiseConv2D) WARNING: Config parameter "Strategy" overwrites an existing attribute in layer "conv2d_16" (PointwiseConv2D) Writing HLS project Done ****** Vivado(TM) HLS - High-Level Synthesis from C, C++ and SystemC v2019.1 (64-bit) **** SW Build 2552052 on Fri May 24 14:47:09 MDT 2019 **** IP Build 2548770 on Fri May 24 18:01:18 MDT 2019 ** Copyright 1986-2019 Xilinx, Inc. All Rights Reserved. 
source /tools/Xilinx/Vivado/2019.1/scripts/vivado_hls/hls.tcl -notrace INFO: [HLS 200-10] Running '/tools/Xilinx/Vivado/2019.1/bin/unwrapped/lnx64.o/vivado_hls' INFO: [HLS 200-10] For user 'lukas' on host 'ubuntu' (Linux_x86_64 version 5.13.0-40-generic) on Mon Apr 25 11:03:05 CEST 2022 INFO: [HLS 200-10] On os Ubuntu 20.04.2 LTS INFO: [HLS 200-10] In directory '/home/lukas/Documents/HiWi/tmppp' Sourcing Tcl script 'build_prj.tcl' INFO: [HLS 200-10] Creating and opening project '/home/lukas/Documents/HiWi/tmppp/myproject_prj'. INFO: [HLS 200-10] Adding design file 'firmware/myproject.cpp' to the project INFO: [HLS 200-10] Adding test bench file 'myproject_test.cpp' to the project INFO: [HLS 200-10] Adding test bench file 'firmware/weights' to the project INFO: [HLS 200-10] Adding test bench file 'tb_data' to the project INFO: [HLS 200-10] Creating and opening solution '/home/lukas/Documents/HiWi/tmppp/myproject_prj/solution1'. INFO: [XFORM 203-101] Allowed max sub elements number after partition is 4096. INFO: [XFORM 203-1161] The maximum of name length is set into 60. INFO: [HLS 200-10] Setting target device to 'xczu7ev-ffvc1156-2-e' INFO: [SYN 201-201] Setting up clock 'default' with a period of 5ns. INFO: [Common 17-206] Exiting vivado_hls at Mon Apr 25 11:03:05 2022... Synthesis report not found. python3: firmware/nnet_utils/nnet_conv2d_stream.h:73: void nnet::conv_2d_buffer_cl(hls::stream&, hls::stream&, typename CONFIG_T::weight_t*, typename CONFIG_T::bias_t*) [with data_T = nnet::array, 2>; res_T = nnet::array, 32>; CONFIG_T = config2; typename CONFIG_T::weight_t = ap_fixed<16, 6>; typename CONFIG_T::bias_t = ap_fixed<16, 6>]: Assertion `CONFIG_T::pad_top == 0 && CONFIG_T::pad_bottom == 0 && CONFIG_T::pad_left == 0 && CONFIG_T::pad_right == 0' failed. Aborted (core dumped) ```
lloo099 commented 2 years ago

Hi @LordScarface, thanks for your post. Do you have padding in the conv2d layers of your ResNet model? Would you mind also providing your model file? Thanks.

LordScarface commented 2 years ago

Hi and thank you for the reply!

Yes, the Conv2D layers have padding set to 'same'.
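
For context, here is a rough sketch (my own illustration, using the standard Keras/TF convention rather than hls4ml code) of how 'same' padding turns into explicit pad amounts; for the (5, 1) kernels in this model it gives non-zero top/bottom padding, which is exactly what the io_stream assertion rejects:

```python
# Illustration only: standard 'same' padding arithmetic for a strided window.
def same_padding(kernel, stride, in_size):
    out_size = -(-in_size // stride)  # ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + kernel - in_size, 0)
    return pad_total // 2, pad_total - pad_total // 2  # (before, after)

print(same_padding(5, 1, 1024))  # (2, 2): pad_top = pad_bottom = 2 for the 5x1 convolutions
print(same_padding(1, 1, 1024))  # (0, 0): the 1x1 convolutions need no padding
```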

Here is the code used for generating the model:

```python
from tensorflow import keras
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, Activation, Add,
                                     MaxPooling2D, Reshape, GlobalAveragePooling2D,
                                     Dense, Dropout)
from tensorflow.keras.optimizers import Adam

input_shp = (1024, 1, 2)  # input shape, taken from the model summary below
num_classes = 22          # number of output classes, taken from the model summary below


def resnet_block(input_data, filters, conv_size):
    x = Conv2D(filters, 1, activation=None, padding='same')(input_data)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(filters, conv_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(filters, conv_size, activation=None, padding='same')(x)
    x = BatchNormalization()(x)
    x = Add()([x, input_data])
    x = Activation('relu')(x)
    y = Conv2D(filters, conv_size, activation=None, padding='same')(x)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)
    y = Conv2D(filters, conv_size, activation=None, padding='same')(y)
    y = BatchNormalization()(y)
    y = Add()([y, x])
    y = Activation('relu')(y)
    z = MaxPooling2D(2, strides=(2, 1), padding='same')(y)
    return z


num_resnet_blocks = 4
num_filters = 32
kernel_size = 5, 1

rf_input = Input(shape=input_shp, name='rf_input')

x = Conv2D(num_filters, kernel_size, activation=None, padding='same')(rf_input)
x = BatchNormalization()(x)
x = Activation('relu')(x)

for i in range(num_resnet_blocks):
    x = resnet_block(x, num_filters, kernel_size)

x = Conv2D(num_filters, kernel_size, activation=None, padding='same')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)

# use if number of resnet blocks = 6
# x = Reshape((4, 4, num_filters), input_shape=(16, 1, num_filters))(x)
# use if number of resnet blocks = 4
x = Reshape((8, 8, num_filters), input_shape=(32, 1, num_filters))(x)

x = GlobalAveragePooling2D()(x)
dense_1 = Dense(256, activation='relu')(x)
dropout_1 = Dropout(0.5)(dense_1)
dense_2 = Dense(128, activation='relu')(dropout_1)
dropout_2 = Dropout(0.5)(dense_2)
dense_3 = Dense(num_classes)(dropout_2)
softmax = Activation('softmax', name='softmax')(dense_3)

optimizer = Adam(learning_rate=0.00050)
model = keras.Model(rf_input, softmax)
model.compile(loss='categorical_crossentropy', metrics=["accuracy"])
```
The Model Summary ``` Model: "model" __________________________________________________________________________________________________ Layer (type) Output Shape Param # Connected to ================================================================================================== rf_input (InputLayer) [(None, 1024, 1, 2) 0 [] ] conv2d (Conv2D) (None, 1024, 1, 32) 352 ['rf_input[0][0]'] batch_normalization (BatchNorm (None, 1024, 1, 32) 128 ['conv2d[0][0]'] alization) activation (Activation) (None, 1024, 1, 32) 0 ['batch_normalization[0][0]'] conv2d_1 (Conv2D) (None, 1024, 1, 32) 1056 ['activation[0][0]'] batch_normalization_1 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_1[0][0]'] rmalization) activation_1 (Activation) (None, 1024, 1, 32) 0 ['batch_normalization_1[0][0]'] conv2d_2 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_1[0][0]'] batch_normalization_2 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_2[0][0]'] rmalization) activation_2 (Activation) (None, 1024, 1, 32) 0 ['batch_normalization_2[0][0]'] conv2d_3 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_2[0][0]'] batch_normalization_3 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_3[0][0]'] rmalization) add (Add) (None, 1024, 1, 32) 0 ['batch_normalization_3[0][0]', 'activation[0][0]'] activation_3 (Activation) (None, 1024, 1, 32) 0 ['add[0][0]'] conv2d_4 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_3[0][0]'] batch_normalization_4 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_4[0][0]'] rmalization) activation_4 (Activation) (None, 1024, 1, 32) 0 ['batch_normalization_4[0][0]'] conv2d_5 (Conv2D) (None, 1024, 1, 32) 5152 ['activation_4[0][0]'] batch_normalization_5 (BatchNo (None, 1024, 1, 32) 128 ['conv2d_5[0][0]'] rmalization) add_1 (Add) (None, 1024, 1, 32) 0 ['batch_normalization_5[0][0]', 'activation_3[0][0]'] activation_5 (Activation) (None, 1024, 1, 32) 0 ['add_1[0][0]'] max_pooling2d (MaxPooling2D) (None, 512, 1, 32) 0 ['activation_5[0][0]'] conv2d_6 (Conv2D) (None, 512, 1, 32) 1056 ['max_pooling2d[0][0]'] batch_normalization_6 (BatchNo (None, 512, 1, 32) 128 ['conv2d_6[0][0]'] rmalization) activation_6 (Activation) (None, 512, 1, 32) 0 ['batch_normalization_6[0][0]'] conv2d_7 (Conv2D) (None, 512, 1, 32) 5152 ['activation_6[0][0]'] batch_normalization_7 (BatchNo (None, 512, 1, 32) 128 ['conv2d_7[0][0]'] rmalization) activation_7 (Activation) (None, 512, 1, 32) 0 ['batch_normalization_7[0][0]'] conv2d_8 (Conv2D) (None, 512, 1, 32) 5152 ['activation_7[0][0]'] batch_normalization_8 (BatchNo (None, 512, 1, 32) 128 ['conv2d_8[0][0]'] rmalization) add_2 (Add) (None, 512, 1, 32) 0 ['batch_normalization_8[0][0]', 'max_pooling2d[0][0]'] activation_8 (Activation) (None, 512, 1, 32) 0 ['add_2[0][0]'] conv2d_9 (Conv2D) (None, 512, 1, 32) 5152 ['activation_8[0][0]'] batch_normalization_9 (BatchNo (None, 512, 1, 32) 128 ['conv2d_9[0][0]'] rmalization) activation_9 (Activation) (None, 512, 1, 32) 0 ['batch_normalization_9[0][0]'] conv2d_10 (Conv2D) (None, 512, 1, 32) 5152 ['activation_9[0][0]'] batch_normalization_10 (BatchN (None, 512, 1, 32) 128 ['conv2d_10[0][0]'] ormalization) add_3 (Add) (None, 512, 1, 32) 0 ['batch_normalization_10[0][0]', 'activation_8[0][0]'] activation_10 (Activation) (None, 512, 1, 32) 0 ['add_3[0][0]'] max_pooling2d_1 (MaxPooling2D) (None, 256, 1, 32) 0 ['activation_10[0][0]'] conv2d_11 (Conv2D) (None, 256, 1, 32) 1056 ['max_pooling2d_1[0][0]'] batch_normalization_11 (BatchN (None, 256, 1, 32) 128 ['conv2d_11[0][0]'] ormalization) activation_11 (Activation) (None, 256, 1, 32) 0 ['batch_normalization_11[0][0]'] conv2d_12 (Conv2D) 
(None, 256, 1, 32) 5152 ['activation_11[0][0]'] batch_normalization_12 (BatchN (None, 256, 1, 32) 128 ['conv2d_12[0][0]'] ormalization) activation_12 (Activation) (None, 256, 1, 32) 0 ['batch_normalization_12[0][0]'] conv2d_13 (Conv2D) (None, 256, 1, 32) 5152 ['activation_12[0][0]'] batch_normalization_13 (BatchN (None, 256, 1, 32) 128 ['conv2d_13[0][0]'] ormalization) add_4 (Add) (None, 256, 1, 32) 0 ['batch_normalization_13[0][0]', 'max_pooling2d_1[0][0]'] activation_13 (Activation) (None, 256, 1, 32) 0 ['add_4[0][0]'] conv2d_14 (Conv2D) (None, 256, 1, 32) 5152 ['activation_13[0][0]'] batch_normalization_14 (BatchN (None, 256, 1, 32) 128 ['conv2d_14[0][0]'] ormalization) activation_14 (Activation) (None, 256, 1, 32) 0 ['batch_normalization_14[0][0]'] conv2d_15 (Conv2D) (None, 256, 1, 32) 5152 ['activation_14[0][0]'] batch_normalization_15 (BatchN (None, 256, 1, 32) 128 ['conv2d_15[0][0]'] ormalization) add_5 (Add) (None, 256, 1, 32) 0 ['batch_normalization_15[0][0]', 'activation_13[0][0]'] activation_15 (Activation) (None, 256, 1, 32) 0 ['add_5[0][0]'] max_pooling2d_2 (MaxPooling2D) (None, 128, 1, 32) 0 ['activation_15[0][0]'] conv2d_16 (Conv2D) (None, 128, 1, 32) 1056 ['max_pooling2d_2[0][0]'] batch_normalization_16 (BatchN (None, 128, 1, 32) 128 ['conv2d_16[0][0]'] ormalization) activation_16 (Activation) (None, 128, 1, 32) 0 ['batch_normalization_16[0][0]'] conv2d_17 (Conv2D) (None, 128, 1, 32) 5152 ['activation_16[0][0]'] batch_normalization_17 (BatchN (None, 128, 1, 32) 128 ['conv2d_17[0][0]'] ormalization) activation_17 (Activation) (None, 128, 1, 32) 0 ['batch_normalization_17[0][0]'] conv2d_18 (Conv2D) (None, 128, 1, 32) 5152 ['activation_17[0][0]'] batch_normalization_18 (BatchN (None, 128, 1, 32) 128 ['conv2d_18[0][0]'] ormalization) add_6 (Add) (None, 128, 1, 32) 0 ['batch_normalization_18[0][0]', 'max_pooling2d_2[0][0]'] activation_18 (Activation) (None, 128, 1, 32) 0 ['add_6[0][0]'] conv2d_19 (Conv2D) (None, 128, 1, 32) 5152 ['activation_18[0][0]'] batch_normalization_19 (BatchN (None, 128, 1, 32) 128 ['conv2d_19[0][0]'] ormalization) activation_19 (Activation) (None, 128, 1, 32) 0 ['batch_normalization_19[0][0]'] conv2d_20 (Conv2D) (None, 128, 1, 32) 5152 ['activation_19[0][0]'] batch_normalization_20 (BatchN (None, 128, 1, 32) 128 ['conv2d_20[0][0]'] ormalization) add_7 (Add) (None, 128, 1, 32) 0 ['batch_normalization_20[0][0]', 'activation_18[0][0]'] activation_20 (Activation) (None, 128, 1, 32) 0 ['add_7[0][0]'] max_pooling2d_3 (MaxPooling2D) (None, 64, 1, 32) 0 ['activation_20[0][0]'] conv2d_21 (Conv2D) (None, 64, 1, 32) 5152 ['max_pooling2d_3[0][0]'] batch_normalization_21 (BatchN (None, 64, 1, 32) 128 ['conv2d_21[0][0]'] ormalization) activation_21 (Activation) (None, 64, 1, 32) 0 ['batch_normalization_21[0][0]'] reshape (Reshape) (None, 8, 8, 32) 0 ['activation_21[0][0]'] global_average_pooling2d (Glob (None, 32) 0 ['reshape[0][0]'] alAveragePooling2D) dense (Dense) (None, 256) 8448 ['global_average_pooling2d[0][0]' ] dropout (Dropout) (None, 256) 0 ['dense[0][0]'] dense_1 (Dense) (None, 128) 32896 ['dropout[0][0]'] dropout_1 (Dropout) (None, 128) 0 ['dense_1[0][0]'] dense_2 (Dense) (None, 22) 2838 ['dropout_1[0][0]'] softmax (Activation) (None, 22) 0 ['dense_2[0][0]'] ================================================================================================== Total params: 139,158 Trainable params: 137,750 Non-trainable params: 1,408 __________________________________________________________________________________________________ None ```

This is the trained model (trained only on the first 200k samples of the dataset).

lloo099 commented 2 years ago

Thanks. I am not sure that 'same' padding translates to zero padding in your model. Normally, the conv2d implementation in hls4ml uses zero padding, but it can be configured. I suggest you test a small model first.
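
As a concrete way to do that, here is a minimal reproduction sketch (the environment and part number are assumed to be the same as in the script above): a single 'same'-padded Conv2D converted with io_stream. If predict() hits the same assertion, the padding alone is the culprit.

```python
# Minimal reproduction sketch (assumes the same hls4ml / TensorFlow setup as above).
import numpy as np
import tensorflow as tf
import hls4ml

inp = tf.keras.Input(shape=(64, 1, 2), name='rf_input')
out = tf.keras.layers.Conv2D(8, (5, 1), padding='same')(inp)
small_model = tf.keras.Model(inp, out)

cfg = hls4ml.utils.config_from_keras_model(small_model, granularity='name')
small_hls = hls4ml.converters.convert_from_keras_model(
    small_model, hls_config=cfg, io_type='io_stream', backend='Vivado',
    output_dir='small_prj', part='xczu7ev-ffvc1156-2-e')
small_hls.compile()
small_hls.predict(np.ascontiguousarray(np.random.rand(16, 64, 1, 2).astype(np.float32)))
```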

LordScarface commented 2 years ago

Thank you for the input. I switched the model to use Conv1D instead of Conv2D, but the issue remained. I was able to fix it by removing the 'same' padding from the MaxPooling2D (now MaxPooling1D) layers of the ResNet blocks. Maybe there is an issue with 'same' padding for the pooling layers?
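
Sketched out, the change that made the difference looks roughly like this (my reconstruction of the Conv1D port; only the pooling call at the end of each ResNet block is shown):

```python
from tensorflow.keras.layers import MaxPooling1D

def block_tail(y):
    # was: z = MaxPooling2D(2, strides=(2, 1), padding='same')(y)
    z = MaxPooling1D(pool_size=2, strides=2)(y)  # no 'same' padding on the pooling layer
    return z
```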

Anyway, the model now passes synthesis, but resource usage seems high (target: ZCU104):

[Screenshot: Vivado HLS resource utilization estimates for the synthesized model]

Strategy is Resource, and ReuseFactor has been increased to 16 for the large Dense layer with 32,896 parameters.
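
As a rough sanity check (my own back-of-the-envelope arithmetic, assuming the multiplications of a Dense layer are shared across multipliers according to the ReuseFactor):

```python
# Back-of-the-envelope multiplier count for the largest Dense layer (256 -> 128).
n_in, n_out, reuse_factor = 256, 128, 16
mults = n_in * n_out             # 32,768 multiplications per inference (plus 128 biases)
print(mults // reuse_factor)     # ~2,048 parallel multipliers with ReuseFactor = 16
```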

Is this to be expected or is it too high?

lloo099 commented 2 years ago

As far as I know, the Resource strategy is not available at the moment; mainly the Latency strategy is supported. If you want a good balance between latency and resources, I suggest improving the code at the hardware level. hls4ml is more friendly to lightweight models.

LordScarface commented 2 years ago

The Resource strategy has worked for me in the past. I saw that in #534 you were able to get past synthesis with the VGG-16 model; when choosing the Latency strategy I get issues with layers containing more than 4096 parameters. How did you address that?

My goal right now is to implement the model fully in parallel without any optimizations, and then see how much things can be improved by quantization, pruning, etc.
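
For what it's worth, one way to handle the layers that exceed the 4096-element array-partition limit from the Python side is to raise their ReuseFactor in the config instead of editing the generated pragmas; a sketch, assuming the layer names in hls_config['LayerName'] match the Keras layer names (which they do with granularity='name'):

```python
import numpy as np

# Sketch: find layers whose weight arrays exceed Vivado HLS's default
# array-partition limit of 4096 elements and give them a larger ReuseFactor.
for keras_layer in model.layers:
    n_weights = sum(int(np.prod(w.shape)) for w in keras_layer.get_weights())
    if n_weights > 4096 and keras_layer.name in hls_config['LayerName']:
        hls_config['LayerName'][keras_layer.name]['ReuseFactor'] = 16
        print(f'{keras_layer.name}: {n_weights} weights -> ReuseFactor 16')
```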

lloo099 commented 2 years ago

> The Resource strategy has worked for me in the past. I saw that in #534 you were able to get past synthesis with the VGG-16 model; when choosing the Latency strategy I get issues with layers containing more than 4096 parameters. How did you address that?
>
> My goal right now is to implement the model fully in parallel without any optimizations, and then see how much things can be improved by quantization, pruning, etc.

Nice, and could you share your Resource-strategy configuration for the ResNet if possible? There is too much array partitioning here; you may consider modifying that pragma or just commenting it out.

LordScarface commented 2 years ago

For the Resource Strategy I just replaced this:

```python
hls_config['Model']['Precision'] = 'ap_fixed<16,6>'
hls_config['Model']['ReuseFactor'] = 1

for Layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
    hls_config['LayerName'][Layer]['ReuseFactor'] = 1
# Use the stable softmax implementation for best numerical performance on
# high-accuracy models; the default Latency implementation is faster but
# numerically less stable.
hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
```

with this:

```python
hls_config['Model']['Strategy'] = 'Resource'
hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
hls_config['LayerName']['dense_28']['ReuseFactor'] = 16
```

I tried the Latency strategy with `#pragma HLS ARRAY_PARTITION variable=mult complete` commented out as you suggested, but it did not help: now I keep running out of memory during synthesis (the crash message mentions `$RDI PROG" "@$`). Is this something you have experienced when working with larger models? Are 48 GB of RAM not enough?

lloo099 commented 2 years ago

> For the Resource Strategy I just replaced this:
>
> ```python
> hls_config['Model']['Precision'] = 'ap_fixed<16,6>'
> hls_config['Model']['ReuseFactor'] = 1
>
> for Layer in hls_config['LayerName'].keys():
>     hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
>     hls_config['LayerName'][Layer]['ReuseFactor'] = 1
> # Use the stable softmax implementation for best numerical performance on
> # high-accuracy models; the default Latency implementation is faster but
> # numerically less stable.
> hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
> ```
>
> with this:
>
> ```python
> hls_config['Model']['Strategy'] = 'Resource'
> hls_config['LayerName']['softmax']['Strategy'] = 'Stable'
> hls_config['LayerName']['dense_28']['ReuseFactor'] = 16
> ```
>
> I tried the Latency strategy with `#pragma HLS ARRAY_PARTITION variable=mult complete` commented out as you suggested, but it did not help: now I keep running out of memory during synthesis (the crash message mentions `$RDI PROG" "@$`). Is this something you have experienced when working with larger models? Are 48 GB of RAM not enough?

OK, thanks. I commented that pragma out because the layer size is over 4096, but I am not sure that is your problem. Yes, I tried to compile VGG-16 and ran into the same memory problem, so I think we should reduce the precision to 8 bits. That way, deployment becomes possible.
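
If it helps, the precision change can be expressed directly in the hls4ml config before conversion; the exact integer/fractional split below is only a guess and would need to be checked against the weight and activation ranges:

```python
# Sketch: drop the model-wide precision to 8 bits (the <8,3> split is an
# assumption, not a validated choice).
hls_config['Model']['Precision'] = 'ap_fixed<8,3>'
# Optionally keep selected layers wider, e.g. the softmax output:
hls_config['LayerName']['softmax']['Precision'] = 'ap_fixed<16,6>'
```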

lloo099 commented 2 years ago


Btw, you can also try vitis_hls 2020.2.

wilfredkisku commented 2 years ago

Hi, I am also trying to work with the ResNet and Inception architectures. While the unquantized models synthesize well with hls4ml, there is an accuracy drop for quantized models of the above-mentioned architectures. Any insights into the issue I am facing would be of great help. #587
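
One way to start narrowing down where a quantized model loses accuracy is hls4ml's numerical profiling; a sketch, assuming the optional profiling dependencies are installed and reusing the variable names from the conversion script earlier in this thread:

```python
# Profiling sketch (assumes the optional plotting dependencies for
# hls4ml.model.profiling are installed): compare the ranges of weights and
# activations against the chosen fixed-point precision.
from hls4ml.model.profiling import numerical

figures = numerical(model=model, hls_model=hls_model, X=X_test[:100])
```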