rwth-i6 / returnn_common

Common building blocks for RETURNN configs, such as models, training concepts, etc

constant, random, reduce: shape requires ordering #138

Closed albertz closed 2 years ago

albertz commented 2 years ago

nn.constant, nn.random_uniform and related functions, and nn.reduce require the shape argument to be ordered (list, tuple). (The same holds for nn.Parameter, but we keep that as a separate special case for now.)

This is inconsistent with almost everywhere else, where shape is unordered (a set) on purpose, because the order should never matter and is left up to RETURNN.

Inconsistency is not the only problem though.

Often you are given some tensor x and want to create a constant of the same shape, as in nn.zeros_like. The canonical code nn.constant(..., shape=x.shape) currently does not work, because nn.constant requires shape to be ordered, while x.shape is a set.
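To make the mismatch concrete, here is a minimal sketch in plain Python. It does not use the real returnn_common API: `Dim`, `ToyTensor` and `toy_constant` are hypothetical stand-ins that only mimic the behavior described above (an unordered `shape` on the tensor, and a constructor that insists on a list or tuple).

```python
from dataclasses import dataclass
from typing import Sequence, Set, Tuple


@dataclass(frozen=True)
class Dim:
    """Toy stand-in for a dimension tag (hypothetical, not the real RETURNN class)."""
    name: str
    size: int


class ToyTensor:
    """Toy tensor that only tracks its dims (hypothetical, not the real nn.Tensor)."""

    def __init__(self, dims: Sequence[Dim]):
        self._dims: Tuple[Dim, ...] = tuple(dims)

    @property
    def shape(self) -> Set[Dim]:
        # Unordered on purpose: user code should not depend on the axis order.
        return set(self._dims)


def toy_constant(value: float, *, shape) -> ToyTensor:
    """Toy constant constructor that requires an ordered shape (list or tuple)."""
    if not isinstance(shape, (list, tuple)):
        raise TypeError(f"shape must be a list or tuple, got {type(shape).__name__}")
    return ToyTensor(shape)


batch, time, feat = Dim("batch", 8), Dim("time", 20), Dim("feature", 64)
x = ToyTensor([batch, time, feat])

try:
    toy_constant(0.0, shape=x.shape)  # the "canonical" call from the text
    failed = False
except TypeError:
    failed = True  # a set is rejected because it carries no order
```

In this toy model the only way to make the call succeed is to invent an order, e.g. `shape=list(x.shape)`, which is exactly the arbitrary choice the user should not have to make.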

What are the options:

albertz commented 2 years ago

@Zettelkasten, any suggestions?

albertz commented 2 years ago

I think we should already add the support on the RETURNN side.

albertz commented 2 years ago

However, the tensor (constant, random, whatever) is often later used in combination with the original tensor that the shape came from. For efficiency, we want the dims to be in the right order from the start. So nn.constant(..., shape=x.shape) really should yield a constant with the same order of dims.

How to solve this? There could be a special ShapeFromLayer wrapper object which RETURNN would detect (already in transform_config_dict), and we could then pass this for x.shape. But for that, the x.shape attribute would have to be not the set but some proxy containing a reference back to x. Or we introduce a separate x.shape_ref or similar for this purpose.
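A rough sketch of the proxy idea, in plain Python and under stated assumptions: `ToyTensor`, the `ShapeFromLayer` class body, the `shape_ref` property and `resolve_shape` are all hypothetical illustrations, and `resolve_shape` only stands in for whatever detection RETURNN would do on its side. Nothing here is the actual RETURNN or returnn_common implementation.

```python
from typing import Sequence, Tuple


class ToyTensor:
    """Toy tensor (hypothetical). The real axis order is known only to the backend."""

    def __init__(self, name: str, dims: Sequence[str]):
        self.name = name
        self._dims: Tuple[str, ...] = tuple(dims)

    @property
    def shape_ref(self) -> "ShapeFromLayer":
        # Separate attribute, so that `shape` itself could stay a plain set.
        return ShapeFromLayer(self)


class ShapeFromLayer:
    """Proxy that says "use the shape of this tensor, in its own order"."""

    def __init__(self, source: ToyTensor):
        self.source = source


def resolve_shape(shape) -> Tuple[str, ...]:
    """Stand-in for the backend side: turn a shape argument into ordered dims."""
    if isinstance(shape, ShapeFromLayer):
        # The backend knows the source tensor's actual order and reuses it.
        return shape.source._dims
    if isinstance(shape, (list, tuple)):
        return tuple(shape)
    raise TypeError(f"unsupported shape argument: {type(shape).__name__}")


x = ToyTensor("x", ["batch", "time", "feature"])
resolved = resolve_shape(x.shape_ref)
```

The point of the proxy is that the user never states an order; the new tensor simply inherits the order of `x`, so combining the two later needs no transpose.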

albertz commented 2 years ago

I have now introduced a Tensor.shape_ordered property. For now, it just returns data.dim_tags, but we can later change this to some special ShapeFromTensor object or similar.
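As a minimal sketch of what such a property amounts to, again in plain Python: `ToyData`, `ToyTensor` and `toy_zeros` are hypothetical stand-ins, and only the names `shape_ordered` and `dim_tags` are taken from the comment above.

```python
from typing import Sequence, Set, Tuple


class ToyData:
    """Toy stand-in for the tensor's template; holds the dims in their actual order."""

    def __init__(self, dim_tags: Sequence[str]):
        self.dim_tags: Tuple[str, ...] = tuple(dim_tags)


class ToyTensor:
    """Toy tensor (hypothetical, not the real nn.Tensor)."""

    def __init__(self, data: ToyData):
        self.data = data

    @property
    def shape(self) -> Set[str]:
        # Unordered view, as before.
        return set(self.data.dim_tags)

    @property
    def shape_ordered(self) -> Tuple[str, ...]:
        # Ordered view: currently just the dim tags in their existing order.
        return self.data.dim_tags


def toy_zeros(shape: Sequence[str]) -> ToyTensor:
    """Toy constructor that requires an ordered shape."""
    if not isinstance(shape, (list, tuple)):
        raise TypeError("shape must be a list or tuple")
    return ToyTensor(ToyData(shape))


x = ToyTensor(ToyData(["batch", "time", "feature"]))
z = toy_zeros(x.shape_ordered)  # same dims, same order as x
```

Because callers only go through the property, its return value could later be swapped for a proxy object without changing call sites like the last line.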