fastmachinelearning / hls4ml

Machine learning on FPGAs using HLS
https://fastmachinelearning.org/hls4ml
Apache License 2.0

Rebased version of the PR to add support for ConvTranspose layers #844

Open jmitrevs opened 11 months ago

jmitrevs commented 11 months ago

Description

This is a rebased version of #644. Please look there for details, but in summary it adds support for both io_stream and io_parallel compilation of Conv1DTranspose and Conv2DTranspose.
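
For context, the intended user-facing flow is just the standard hls4ml conversion path; a rough sketch (layer sizes, io_type, and output directory below are purely illustrative):

```python
import numpy as np
from tensorflow import keras
import hls4ml

# Toy Keras model containing a Conv2DTranspose layer (shapes are illustrative)
model = keras.Sequential([
    keras.layers.Conv2DTranspose(
        4, kernel_size=(3, 3), strides=(2, 2), padding='same', input_shape=(4, 4, 3)
    ),
])

# Standard hls4ml conversion; with this PR the ConvTranspose layers are handled
# for both io_parallel and io_stream
config = hls4ml.utils.config_from_keras_model(model, granularity='name')
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, io_type='io_parallel', output_dir='hls4ml_prj'
)
hls_model.compile()

x = np.random.rand(10, 4, 4, 3)
y_keras = model.predict(x)
y_hls4ml = hls_model.predict(x)
```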

Type of change

Tests

Still a to-do.

Checklist

vloncar commented 11 months ago

Why do we need multidimensional weights here? Passing pointers doesn't work as expected? I like the idea, we'll need it in the future for Vitis, but the keep_dims approach feels very clunky.

jmitrevs commented 11 months ago

I pushed a first version of a test and tried to make initial fixes, though there are many remaining errors.

jmitrevs commented 11 months ago

Looking at the new test, the results seem to be:

y_keras.shape=(10, 7, 7, 4)
y_hls4ml.shape=(10, 100)

and model_default_t w2[1][1][108]; seems to be the weight declaration. Will need to investigate further.

jmitrevs commented 11 months ago

The output dimension inconsistency occurs for padding="valid". For padding="same" the output is flattened but has the correct number of values; however, the values still don't match.
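
(Once the sizes agree, the check itself is just against the flattened Keras output, roughly as below; the tolerance is a placeholder, not necessarily what the test uses.)

```python
import numpy as np

# io_parallel predictions come back flattened, so flatten the Keras output
# before comparing (atol here is illustrative only)
np.testing.assert_allclose(
    y_hls4ml, y_keras.reshape(y_keras.shape[0], -1), rtol=0, atol=0.05
)
```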

jmitrevs commented 10 months ago

I am having trouble getting this to work. I wrote a Python implementation of conv2d_transpose that works, and I think I understand the im2col-style io_parallel implementation that's here, but it's not coming together. The keep_dims weights that are written out actually look strangely reordered. I may branch and try to use the code here while dropping the keep_dims concept.
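
For reference, the standalone check I wrote behaves along these lines (a simplified sketch: single image, NHWC data, Keras kernel layout (kh, kw, out_ch, in_ch); this is not the code in the PR):

```python
import numpy as np

def conv2d_transpose_ref(x, kernel, strides=(1, 1), padding='valid'):
    """Reference Conv2DTranspose for a single image.

    x:      (H, W, C_in)
    kernel: (kh, kw, C_out, C_in), the Keras Conv2DTranspose layout
    """
    H, W, _ = x.shape
    kh, kw, c_out, _ = kernel.shape
    sh, sw = strides

    # Scatter-add: every input pixel contributes a kh x kw patch to the output
    full = np.zeros(((H - 1) * sh + kh, (W - 1) * sw + kw, c_out))
    for i in range(H):
        for j in range(W):
            for co in range(c_out):
                contrib = np.sum(kernel[:, :, co, :] * x[i, j, :], axis=-1)
                full[i * sh:i * sh + kh, j * sw:j * sw + kw, co] += contrib

    if padding == 'valid':
        return full

    # padding == 'same': crop back to (H*sh, W*sw), following TF's cropping rule
    crop_top = max(kh - sh, 0) // 2
    crop_left = max(kw - sw, 0) // 2
    return full[crop_top:crop_top + H * sh, crop_left:crop_left + W * sw, :]
```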

Jonathan-Shoemaker commented 10 months ago

> I am having trouble getting this to work. I wrote a Python implementation of conv2d_transpose that works, and I think I understand the im2col-style io_parallel implementation that's here, but it's not coming together. The keep_dims weights that are written out actually look strangely reordered. I may branch and try to use the code here while dropping the keep_dims concept.

One of the purposes of keep_dims is that the weights are deliberately written out in a different order than in the original conv_transpose weight matrix. The reason is that the kernel of a conv transpose is better computed as separate kernels of smaller conv layers. These kernels sit within the transpose kernel but are not contiguous in the normal transpose weight layout.
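
If it helps, the splitting can be pictured with this hypothetical illustration (function and variable names are mine, not the ones used in the PR): for stride (sh, sw), output pixels with a given (row % sh, col % sw) phase only ever see the kernel taps at that phase, so each phase becomes a small ordinary convolution over the input.

```python
def split_transpose_kernel(kernel, strides):
    """Split a (kh, kw, C_out, C_in) conv-transpose kernel into one smaller
    kernel per output-pixel phase modulo the stride."""
    sh, sw = strides
    return {
        (ph, pw): kernel[ph::sh, pw::sw, :, :]
        for ph in range(sh)
        for pw in range(sw)
    }
```

These per-phase sub-kernels are exactly the pieces that are not contiguous in the original weight order, which is presumably why the keep_dims weights look reordered when written out.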

jmitrevs commented 10 months ago

With these few changes conv2dtranspose seems to work for io_parallel. @Jonathan-Shoemaker, do you want to take a look at why the other cases in the pytest are failing?

jmitrevs commented 10 months ago

Actually, the conv1dtranspose io_parallel failure was due to the pytest being misconfigured, but the io_stream tests are still failing.
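
(For the record, the parametrization I have in mind is roughly the following; argument names and values are illustrative, not necessarily what the test added in this PR does.)

```python
import pytest

@pytest.mark.parametrize('layer', ['Conv1DTranspose', 'Conv2DTranspose'])
@pytest.mark.parametrize('padding', ['same', 'valid'])
@pytest.mark.parametrize('io_type', ['io_parallel', 'io_stream'])
def test_conv_transpose(layer, padding, io_type):
    # build the Keras model, convert with hls4ml, and compare predictions
    ...
```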

jmitrevs commented 10 months ago

I think the issue is with padding. With padding="same" it seems to work well.

jmitrevs commented 10 months ago

Also needed is a check for valid reuse factors for conv transpose.
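
Something along these lines, as a simplified illustration only (the real validation logic in hls4ml is stricter, and the n_in/n_out definitions for conv transpose are an assumption here):

```python
def is_valid_reuse_factor(n_in, n_out, rf):
    # Simplified check: require the reuse factor to evenly divide the total
    # number of multiplications; the actual hls4ml rule (and the
    # closest-valid-RF search) is more involved.
    return (n_in * n_out) % rf == 0
```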