fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Overflow error: unsigned conversion from ‘int’ to ‘short unsigned int’ #746

Closed vandenBergArthur closed 1 year ago

vandenBergArthur commented 1 year ago

Hi all,

Before I got stuck with the issue listed in #745, I was having two major problems. The models used for testing are only the first parts of a bigger, more complex model: instead of compiling and building the large model in one go, I am trying to work up to it step by step. For the first problem, I use the following model:

from tensorflow.keras.layers import Input, Permute, Conv2D
from tensorflow.keras.models import Model
import hls4ml

a = Input(shape=(10,9,25))

b = Permute((2,3,1))(a)

c = Conv2D(filters=10, kernel_size=1,data_format='channels_last')(b)

model = Model(inputs=a, outputs=c, name='input_permute_conv2d_model')

config = hls4ml.utils.config_from_keras_model(model, granularity='model')

config['Model']['Precision'] = 'ap_fixed<16,6>'
config['Model']['ReuseFactor'] = 5
config['Model']['Strategy'] = 'Resource'

cfg = hls4ml.converters.create_config(backend='Vivado')
cfg['IOType']     = 'io_parallel'
cfg['HLSConfig']  = config
cfg['KerasModel'] = model
cfg['XilinxPart'] = 'xc7z020clg400-1'

hls_model = hls4ml.converters.keras_to_hls(cfg)
hls_model.compile()

The output of hls_model.build(csim=False) looks like this:

vivado_hls.log

For the second problem, the model is defined as follows:

from tensorflow.keras.layers import Concatenate   # in addition to the imports above

input_shape_x = (64,9,25)
input_x = Input(shape=input_shape_x, name='input_x')    
a = Permute((2,3,1))(input_x)

self_conv1 = Conv2D(filters=128, kernel_size=1,data_format='channels_last')(a)
self_conv2 = Conv2D(filters=128, kernel_size=1,data_format='channels_last')(a)
self_conv3 = Conv2D(filters=128, kernel_size=1,data_format='channels_last')(a)

b = Concatenate(axis=-2)([self_conv1, self_conv2])
c = Concatenate(axis=-2)([b, self_conv3])

model = Model(inputs=input_x, outputs=c, name='part1_model')

config = hls4ml.utils.config_from_keras_model(model, granularity='model')

config['Model']['Precision'] = 'ap_fixed<16,6>'
config['Model']['ReuseFactor'] = 2
config['Model']['Strategy'] = 'Resource'

cfg = hls4ml.converters.create_config(backend='Vivado')
cfg['IOType']     = 'io_parallel'
cfg['HLSConfig']  = config
cfg['KerasModel'] = model
cfg['XilinxPart'] = 'xc7z020clg400-1'

hls_model = hls4ml.converters.keras_to_hls(cfg)
hls_model.compile()

The code above results in the following error:

firmware/myproject.cpp: In function ‘void myproject(input_t*, result_t*, short unsigned int&, short unsigned int&)’:
firmware/myproject.cpp:38:55: warning: unsigned conversion from ‘int’ to ‘short unsigned int’ changes value from ‘86400’ to ‘20864’ [-Woverflow]
   38 |     const_size_out_1 = OUT_CONCAT_0_10*OUT_CONCAT_1_10*OUT_CONCAT_2_10;
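The wrapped value looks consistent with a 16-bit overflow of the concatenated output size; a quick sanity check (assuming the concatenated output shape is (9, 75, 128), as the model above implies):

# Total number of elements in the concatenated output
size = 9 * 75 * 128       # 86400
print(size % (1 << 16))   # 20864, the value reported in the warning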

I am running Ubuntu 20.04 with the conda environment listed on the tutorials page, and Vivado 2019.2.

Any help would be greatly appreciated!

Thanks in advance!

vloncar commented 1 year ago

The log of problem 1 says the layer is too big to be unrolled. Use the main branch or wait for the upcoming 0.7.0 release; convolutional layers with io_parallel don't work in 0.6.0. For problem 2, even with the latest branch you won't have much success: 128 filters is too many. Generally, keep all activation tensors and weights at the lower end of O(1000) (preferably O(100)) to have any chance of success with io_parallel.
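As a rough illustration of those sizes, a scaled-down sketch along these lines (the filter count of 8 is arbitrary, chosen only to keep the layer in the O(100) range) is a more realistic io_parallel test:

from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

# Hypothetical scaled-down test: with kernel_size=1, 25 input channels and
# 8 filters give 25*8 + 8 = 208 weights, well within the suggested range.
a = Input(shape=(10, 9, 25))
c = Conv2D(filters=8, kernel_size=1, data_format='channels_last')(a)
small_model = Model(inputs=a, outputs=c, name='small_test_model')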

vandenBergArthur commented 1 year ago

Hi @vloncar, first and foremost, thanks for your input!

The number of filters will be scaled down, thanks for pointing that out. I am aware that a Conv2D layer requires io_stream, but compiling the HLS model still fails, and I believe it's because of the Permute layer. To be more specific, I think it might be related to issue #712.

To demonstrate, I created a very simple model:

a = Input(shape=(10,9,25))
b = Permute((2,3,1))(a)
c = Conv2D(filters=10, kernel_size=1,data_format='channels_last')(b)

model = Model(inputs=a, outputs=c, name='input_permute_conv2d_model')
model.summary()

config = hls4ml.utils.config_from_keras_model(model, granularity='model')

config['Model']['Precision'] = 'ap_fixed<16,6>'
config['Model']['ReuseFactor'] = 10
config['Model']['Strategy'] = 'Resource'
cfg = hls4ml.converters.create_config(backend='Vivado')
cfg['IOType']     = 'io_stream'
cfg['HLSConfig']  = config
cfg['KerasModel'] = model
cfg['XilinxPart'] = 'xc7z020clg400-1'

hls_model = hls4ml.converters.keras_to_hls(cfg)

When compiling with hls_model.compile(), the following output is generated:

firmware/myproject.cpp: In function ‘void myproject(hls::stream<nnet::array<ap_fixed<16, 6>, 25> >&, hls::stream<nnet::array<ap_fixed<16, 6>, 10> >&)’:
firmware/myproject.cpp:51:52: error: cannot convert ‘hls::stream<nnet::array<ap_fixed<16, 6>, 25> >’ to ‘nnet::array<ap_fixed<16, 6>, 25>*’
   51 |     nnet::transpose_3d<input_t, layer2_t, config2>(input_1, layer2_out); // permute
      |                                                    ^~~~~~~
      |                                                    |
      |                                                    hls::stream<nnet::array<ap_fixed<16, 6>, 25> >
In file included from firmware/parameters.h:10,
                 from firmware/myproject.cpp:22:
firmware/nnet_utils/nnet_array.h:27:26: note:   initializing argument 1 of ‘void nnet::transpose_3d(data_T*, res_T*) [with data_T = nnet::array<ap_fixed<16, 6>, 25>; res_T = nnet::array<ap_fixed<16, 6>, 10>; CONFIG_T = config2]’
   27 | void transpose_3d(data_T data[CONFIG_T::depth * CONFIG_T::height * CONFIG_T::width],
      |                   ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
firmware/nnet_utils/nnet_array.h: In instantiation of ‘void nnet::transpose_3d(data_T*, res_T*) [with data_T = nnet::array<ap_fixed<16, 6>, 25>; res_T = nnet::array<ap_fixed<16, 6>, 10>; CONFIG_T = config2]’:
firmware/myproject.cpp:51:71:   required from here
firmware/nnet_utils/nnet_array.h:43:92: error: no match for ‘operator=’ (operand types are ‘nnet::array<ap_fixed<16, 6>, 10>’ and ‘nnet::array<ap_fixed<16, 6>, 25>’)
   43 |                 data_t[idx_t[0] * dims_t[1] * dims_t[2] + idx_t[1] * dims_t[2] + idx_t[2]] =
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
   44 |                     data[idx[0] * dims[1] * dims[2] + idx[1] * dims[2] + idx[2]];
      |                     ~~~~~                                                                   
In file included from firmware/defines.h:6,
                 from firmware/myproject.h:8,
                 from firmware/myproject.cpp:21:
firmware/nnet_utils/nnet_types.h:21:12: note: candidate: ‘nnet::array<T, N>& nnet::array<T, N>::operator=(const nnet::array<T, N>&) [with T = ap_fixed<16, 6>; unsigned int N = 10]’
   21 |     array &operator=(const array &other) {
      |            ^~~~~~~~
firmware/nnet_utils/nnet_types.h:21:35: note:   no known conversion for argument 1 from ‘nnet::array<ap_fixed<16, 6>, 25>’ to ‘const nnet::array<ap_fixed<16, 6>, 10>&’
   21 |     array &operator=(const array &other) {
      |                      ~~~~~~~~~~~~~^~~~~
g++: error: myproject.o: No such file or directory

This is exactly the same error as in #712.

I look forward to your reply.

vloncar commented 1 year ago

Transposing a 3D tensor is not supported. It is an expensive operation and you wouldn't want it anyway. Since you're transposing the input, think about doing that outside of your model.
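For example, something along these lines (a sketch only; x_train stands in for whatever input data is actually used):

import numpy as np
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

# Apply the permutation on the host instead of inside the model,
# mirroring Permute((2,3,1)): (batch, 10, 9, 25) -> (batch, 9, 25, 10).
x_transposed = np.transpose(x_train, (0, 2, 3, 1))

# The model then takes the already-permuted shape and needs no Permute layer.
a = Input(shape=(9, 25, 10))
c = Conv2D(filters=10, kernel_size=1, data_format='channels_last')(a)
model = Model(inputs=a, outputs=c, name='input_conv2d_model')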