rwth-i6 / pytorch-to-returnn

Make PyTorch code runnable within RETURNN

Convolution over feature dim #135

Open vieting opened 1 year ago

vieting commented 1 year ago

As discussed in #134, it is currently not possible to do a convolution over the axis which RETURNN considers the feature dim, and it would be helpful to set `in_dim`. However, the basic `_get_output_shape_from_returnn` then fails because there, the new feature dim is mapped to the old feature dim, and as a result the remaining dims are also mapped incorrectly. I added a suggestion in this PR.

vieting commented 1 year ago

The failing tests are the ones from https://github.com/rwth-i6/pytorch-to-returnn/issues/125.

vieting commented 1 year ago

If we make this change, I have a follow-up issue. The test cases work, but if I write the conversion result to a file via `converter.get_returnn_config_serialized()`, the dim tags show up at the beginning of the config and are used in the network as well. Here is an example:

```
import numpy
from returnn.tf.util.data import Dim, batch_dim, single_step_dim, SpatialDim, FeatureDim

use_tensorflow = True
behavior_version = 12

feature_data_dim = FeatureDim('feature:data', 11)
time_data_dim = SpatialDim('time:data')
spatial1_data_dim = SpatialDim('spatial1:data')
Linear_feature_dense_dim = FeatureDim('Linear:feature-dense', 7)

extern_data = {
    'data': {
        'dim_tags': [batch_dim, feature_data_dim, time_data_dim, spatial1_data_dim],
        'dtype': 'float32',
        'time_dim_axis': 2,
        'feature_dim_axis': 1,
    },
}

network = {
    'Transpose': {'class': 'copy', 'from': 'data'},
    'Linear': {
        'class': 'linear', 'from': 'Transpose', 'n_out': 7,
        'with_bias': True, 'activation': None,
    },
    'Transpose_1': {'class': 'copy', 'from': 'Linear'},
    'Conv2d': {
        'class': 'conv', 'from': 'Transpose_1',
        'activation': None, 'with_bias': True,
        'n_out': 13, 'filter_size': (3, 5), 'padding': 'valid',
        'in_spatial_dims': [time_data_dim, spatial1_data_dim],
        'in_dim': Linear_feature_dense_dim,
        'strides': (2, 2),
    },
    'output': {'class': 'copy', 'from': 'Conv2d'},
}
```

However, `Linear_feature_dense_dim` in the example is not identical to the dim tag in the input. We would also have to set the output dim tags, I guess, right? Are there already solutions for this in returnn_common or elsewhere?

albertz commented 1 year ago

You need to change all `n_out` to `out_dim`. This applies to LinearLayer, but also to other layers like ConvLayer.

You also need to specify `out_spatial_dims` for ConvLayer.
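To illustrate, the Conv2d layer dict from the config above could then look roughly like this. This is only a sketch, not the converter's actual output: the `_Dim` class is a self-contained stand-in for RETURNN's dim tags (a real config would use `FeatureDim`/`SpatialDim`), and the output dim tag names (`Conv2d:channel` etc.) are hypothetical.

```python
# Stand-in for RETURNN's Dim tags, just to keep this sketch self-contained.
class _Dim:
    def __init__(self, name, dimension=None):
        self.name = name
        self.dimension = dimension  # None for dynamic (spatial) dims


# Hypothetical dim tags mirroring the config in this thread.
linear_out_dim = _Dim('Linear:feature-dense', 7)
time_data_dim = _Dim('time:data')
spatial1_data_dim = _Dim('spatial1:data')
conv_out_dim = _Dim('Conv2d:channel', 13)
conv_out_time_dim = _Dim('Conv2d:time')
conv_out_spatial1_dim = _Dim('Conv2d:spatial1')

# ConvLayer dict using out_dim / out_spatial_dims instead of n_out,
# as suggested above.
conv_layer = {
    'class': 'conv', 'from': 'Transpose_1',
    'activation': None, 'with_bias': True,
    'filter_size': (3, 5), 'padding': 'valid', 'strides': (2, 2),
    'in_spatial_dims': [time_data_dim, spatial1_data_dim],
    'in_dim': linear_out_dim,
    'out_dim': conv_out_dim,  # replaces 'n_out': 13
    'out_spatial_dims': [conv_out_time_dim, conv_out_spatial1_dim],
}
```

With explicit `out_dim`/`out_spatial_dims`, downstream layers can reference the exact same dim tag objects instead of relying on name-based matching.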

vieting commented 1 year ago

I need to create them in `create_returnn_layer_dict`, and there is no way to infer them, right?

And the printed config will possibly have LOTS of dim tags at the beginning. I guess that's similar in returnn_common, right?
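For illustration, creating fresh output dim tags at layer-dict construction time could look roughly like the helper below. The helper name and the naming scheme are hypothetical, and `_Dim` is again a self-contained stand-in for RETURNN's `FeatureDim`/`SpatialDim`; the point is only that the tags must be created, since they cannot be inferred from the existing input tags.

```python
# Stand-in for RETURNN's Dim tags (real code would use FeatureDim/SpatialDim).
class _Dim:
    def __init__(self, name, dimension=None):
        self.name = name
        self.dimension = dimension  # None for dynamic (spatial) dims


def make_conv_out_dims(layer_name, out_channels, num_spatial):
    """Hypothetical helper: create the out_dim and out_spatial_dims
    for a conv layer, named after the layer so the serialized config
    stays readable."""
    out_dim = _Dim('%s:channel' % layer_name, out_channels)
    out_spatial_dims = [
        _Dim('%s:spatial%i' % (layer_name, i)) for i in range(num_spatial)
    ]
    return out_dim, out_spatial_dims


out_dim, out_spatials = make_conv_out_dims('Conv2d', 13, 2)
```

Each such call adds a few dim tag definitions to the top of the serialized config, which is why the printed config can accumulate many of them.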