fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Why does my CNN model fail during synthesis? #713

Closed (AnouarITI closed this issue 1 year ago)

AnouarITI commented 1 year ago

I am trying to create an IP for my model using hls4ml. Here is my CNN code:


import tensorflow as tf
from tensorflow.keras.layers import Input, Permute, Conv2D, Dense, AveragePooling2D, Flatten, Activation
from tensorflow.keras import models
from tensorflow.keras.regularizers import l1

import os
os.environ['PATH'] = '/home/user/Xilinx/Vivado/2020.1/bin:' + os.environ['PATH']

x_in = Input(shape=(1024,1,2))

x = Conv2D(8 , (7,1), strides=(1,1), padding='same', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001), name='C1')(x_in)
x = Activation('relu', name='C1_relu')(x)
x = Conv2D(16, (7,1), strides=(1,1), padding='same', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001), name='C2')(x)
x = Activation('relu', name='C2_relu')(x)
x = Conv2D(32, (7,1), strides=(1,1), padding='same', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001), name='C3')(x)
x = Activation('relu', name='C3_relu')(x)
x = Conv2D(64, (7,1), strides=(1,1), padding='same', kernel_initializer='lecun_uniform', kernel_regularizer=l1(0.0001), name='C4')(x)
x = Activation('relu', name='C4_relu')(x)

x = AveragePooling2D((256,1), name='AVG_pool_1')(x)
x = AveragePooling2D((4,1), name='AVG_pool_2')(x)

x = Flatten()(x)

x = Dense(256, name='D1')(x)
x = Activation('relu', name='D1_relu')(x)
x = Dense(22, name='D2')(x)
x_out = Activation('softmax', name='D2_softmax')(x)

model = models.Model(x_in, x_out)

model.summary()

LOSS        = tf.keras.losses.CategoricalCrossentropy()
OPTIMIZER   = tf.keras.optimizers.Adam(learning_rate=1E-3)

model.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=["accuracy"])

import hls4ml
import plotting

hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(layers=['Activation'])
hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(rounding_mode='AP_RND')
hls4ml.model.optimizer.get_optimizer('output_rounding_saturation_mode').configure(saturation_mode='AP_SAT')

hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')

Precision = 'ap_fixed<16,6>'
Reuse_Factor = 4

hls_config['Model']['Precision'] = Precision
hls_config['Model']['ReuseFactor'] = Reuse_Factor

for Layer in hls_config['LayerName'].keys():
    hls_config['LayerName'][Layer]['Precision'] = Precision
    hls_config['LayerName'][Layer]['Strategy'] = 'Latency'
    hls_config['LayerName'][Layer]['ReuseFactor'] = Reuse_Factor

hls_config['LayerName']['D2_softmax']['Strategy'] = 'Stable'

cfg = hls4ml.converters.create_config(backend='Vivado')
cfg['IOType']     = 'io_stream' # Must set this if using CNNs!
cfg['HLSConfig']  = hls_config
cfg['KerasModel'] = model
cfg['OutputDir']  = 'model_16b_rf4/'
cfg['XilinxPart'] = 'xcvu9p-flga2104-2L-e' #vcu118

plotting.print_dict(cfg)

hls_model = hls4ml.converters.keras_to_hls(cfg)
hls_model.compile()

hls_model.build(csim=False, export=True)

However, whenever I try to build the project with hls4ml, I always get the following error:

WARNING: [SYNCHK 200-77] The top function 'myproject' (firmware/myproject.cpp:24) has no outputs. Possible cause(s) are: (1) Output parameters are passed by value; (2) intended outputs (parameters or global variables) are never written; (3) there are infinite loops.
INFO: [SYNCHK 200-10] 0 error(s), 1 warning(s).
INFO: [HLS 200-111] Finished Checking Synthesizability Time (s): cpu = 00:01:08 ; elapsed = 00:01:10 . Memory (MB): peak = 1504.445 ; gain = 1099.754 ; free physical = 75765 ; free virtual = 119914
INFO: [XFORM 203-712] Applying dataflow to function 'myproject', detected/extracted 0 process function(s): .
INFO: [HLS 200-111] Finished Pre-synthesis Time (s): cpu = 00:01:09 ; elapsed = 00:01:11 . Memory (MB): peak = 1504.445 ; gain = 1099.754 ; free physical = 75706 ; free virtual = 119863
ERROR: [HLS 200-474] Empty dataflow region in myproject (it may have been optimized away due to the absence of outputs)
ERROR: [HLS 200-70] Failed building synthesis data model.
command 'ap_source' returned error code
    while executing
"source build_prj.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel \#0 [list source $arg] "

jmduarte commented 1 year ago

@AnouarITI I am able to reproduce your error, but I am very confused about your model.

You seem to be treating a 1024-length time series as a 1024x1 image and applying Conv2D, when I think you should be applying Conv1D. Mathematically it should be equivalent, but maybe there's some issue with image widths/heights of size 1 in hls4ml.

Also, is there a reason you're doing two rounds of average pooling (to basically average over the entire time series) and then a flatten, instead of a single global average pooling?
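
To illustrate the pooling point, here is a minimal pure-Python sketch (not taken from the gist) that checks the two pooling stages against a single global average. It assumes non-overlapping windows, which is the Keras default when `strides` is left unset; `avg_pool_1d` and the `signal` values are hypothetical stand-ins for one channel of the 1024-step feature map.

```python
from statistics import fmean


def avg_pool_1d(values, pool_size):
    """Non-overlapping average pooling (stride == pool_size, no padding)."""
    n_windows = len(values) // pool_size
    return [
        fmean(values[i * pool_size:(i + 1) * pool_size])
        for i in range(n_windows)
    ]


# Hypothetical data: one channel of a 1024-step feature map.
signal = [((7 * i) % 13) / 13.0 for i in range(1024)]

# AVG_pool_1 (256) followed by AVG_pool_2 (4), as in the original model.
two_stage = avg_pool_1d(avg_pool_1d(signal, 256), 4)

# A single global average over the whole series.
global_avg = fmean(signal)

print(len(two_stage), abs(two_stage[0] - global_avg) < 1e-9)
```

Because every window has the same size, the mean of the window means equals the overall mean, so a single global average pooling layer should give the same result with one layer fewer.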

Not sure if it works yet, but here are my suggestions: https://gist.github.com/jmduarte/122d705236844d0c40165ab278dd11a2

jmduarte commented 1 year ago

@AnouarITI, when I try the model myself, I hit the "large loop unroll" issue, which is really an HLS limitation.
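
As a rough way to see which layers are likely to trigger this, the sketch below counts the multiplications each layer of the original model needs per output position. The threshold of 4096 is an assumption: it is a rule-of-thumb figure often quoted for fully unrolled layers in Vivado HLS, not something stated in this thread. The layer shapes are read off the model above (the two pooling layers reduce 1024 steps to 1, so `D1` sees 64 inputs).

```python
# Assumed rule-of-thumb unroll limit; not taken from this thread.
UNROLL_LIMIT = 4096

# (name, kernel_height, kernel_width, input_channels, output_units)
# Dense layers are written as 1x1 kernels over their input features.
layers = [
    ("C1", 7, 1, 2, 8),
    ("C2", 7, 1, 8, 16),
    ("C3", 7, 1, 16, 32),
    ("C4", 7, 1, 32, 64),
    ("D1", 1, 1, 64, 256),
    ("D2", 1, 1, 256, 22),
]


def mults_per_output_position(kh, kw, n_in, n_out):
    """Multiplications needed to produce all outputs at one position."""
    return kh * kw * n_in * n_out


too_large = []
for name, kh, kw, n_in, n_out in layers:
    n_mult = mults_per_output_position(kh, kw, n_in, n_out)
    flag = "exceeds limit" if n_mult > UNROLL_LIMIT else "ok"
    print(f"{name}: {n_mult} multiplications ({flag})")
    if n_mult > UNROLL_LIMIT:
        too_large.append(name)
```

Under that assumption, `C4`, `D1`, and `D2` are the layers most likely to be affected.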

I would try increasing the reuse factor and using the "Resource" strategy. I updated the gist accordingly: https://gist.github.com/jmduarte/122d705236844d0c40165ab278dd11a2
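
For readers without access to the gist, the change could look roughly like the sketch below. It is not the gist's content. It works on a plain dictionary shaped like the `hls_config` in the original post, so it runs without hls4ml installed. The helper name `use_resource_strategy` and the reuse factor of 64 are illustrative assumptions, and the softmax layer keeps the `'Stable'` setting from the original post.

```python
import copy


def use_resource_strategy(hls_config, reuse_factor, keep_strategy=("D2_softmax",)):
    """Return a copy of an hls4ml-style config that uses the 'Resource' strategy.

    Layers named in keep_strategy keep their existing 'Strategy' value.
    """
    new_config = copy.deepcopy(hls_config)
    new_config["Model"]["ReuseFactor"] = reuse_factor
    new_config["Model"]["Strategy"] = "Resource"
    for name, layer_cfg in new_config["LayerName"].items():
        layer_cfg["ReuseFactor"] = reuse_factor
        if name not in keep_strategy:
            layer_cfg["Strategy"] = "Resource"
    return new_config


# Stand-in for the dictionary built in the original post (only three layers shown).
precision = "ap_fixed<16,6>"
hls_config = {
    "Model": {"Precision": precision, "ReuseFactor": 4, "Strategy": "Latency"},
    "LayerName": {
        "C4": {"Precision": precision, "Strategy": "Latency", "ReuseFactor": 4},
        "D1": {"Precision": precision, "Strategy": "Latency", "ReuseFactor": 4},
        "D2_softmax": {"Precision": precision, "Strategy": "Stable", "ReuseFactor": 4},
    },
}

# Illustrative reuse factor; a suitable value depends on the layer sizes.
updated = use_resource_strategy(hls_config, reuse_factor=64)
print(updated["LayerName"]["C4"])
```

The updated dictionary would then be assigned to `cfg['HLSConfig']` in place of the original before calling `keras_to_hls`.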