Closed albertz closed 2 years ago
@Zettelkasten maybe suggestions?
I think we should already add the support for this on the RETURNN side.
However, often the tensor (constant, random, whatever) is then later going to be used in combination with some original tensor where the shape comes from. For efficiency, we would want that the order is already the right order right from the beginning. So nn.constant(..., shape=x.shape) really should yield a constant with the same order of dims.
How to solve this? There could be a special ShapeFromLayer wrapper object which RETURNN would detect (in transform_config_dict already) and then we can pass this for x.shape. But the x.shape attrib should not be the set but somehow a proxy containing a ref back to x. Or we introduce a separate x.shape_ref or so for this purpose.
I introduced a Tensor.shape_ordered property now. For now, it just returns data.dim_tags, but we can later change this to some special ShapeFromTensor object or so.
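
A minimal sketch of how such a ShapeFromTensor object could look, under stated assumptions: the Tensor class, its dims_in_storage_order attribute, and the constant function below are hypothetical toy stand-ins, not the returnn-common API, and dims are plain strings instead of dim tags. The idea shown is only that the ordered shape is a tuple which also keeps a reference back to the tensor it came from.

```python
from typing import Sequence, Tuple


class ShapeFromTensor(tuple):
    """Ordered dims of a tensor, plus a reference back to that tensor (hypothetical)."""

    def __new__(cls, source: "Tensor"):
        self = super().__new__(cls, source.dims_in_storage_order)
        self.source = source  # lets a backend look up the order from the source later
        return self


class Tensor:
    """Toy tensor that only tracks its dims (assumption: dims are plain strings)."""

    def __init__(self, dims: Sequence[str]):
        self.dims_in_storage_order: Tuple[str, ...] = tuple(dims)

    @property
    def shape(self) -> frozenset:
        # Unordered on purpose: user code should not rely on the order.
        return frozenset(self.dims_in_storage_order)

    @property
    def shape_ordered(self) -> ShapeFromTensor:
        # Ordered variant, intended to be passed on to functions like constant().
        return ShapeFromTensor(self)


def constant(value: float, shape) -> Tensor:
    """Toy constant() that requires an ordered shape (list or tuple)."""
    if not isinstance(shape, (list, tuple)):
        raise TypeError(f"shape must be ordered, got {type(shape).__name__}")
    return Tensor(shape)


x = Tensor(["batch", "time", "feature"])
zeros = constant(0.0, shape=x.shape_ordered)
```

Because ShapeFromTensor is still a tuple, code that only expects an ordered shape keeps working, while code that knows about the wrapper can follow the source reference.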
nn.constant, nn.random_uniform and related functions, and nn.reduce require the shape argument to be ordered (list, tuple). (Same for nn.Parameter, but we keep that now as a separate special case.)
This is inconsistent with almost everywhere else, where shape is unordered (a set) on purpose, because the order should never matter and is up to RETURNN.
Inconsistency is not the only problem though.
Often you are given some tensor x, and then you maybe want to create a constant of the same shape, like nn.zeros_like. The canonical code nn.constant(..., shape=x.shape) does not work currently because nn.constant requires shape to be ordered, while x.shape is a set.
What are the options:
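
The mismatch can be reproduced with a small toy model. This is only a sketch: Dim, Tensor and constant below are hypothetical stand-ins written for this illustration and do not reflect the actual returnn-common implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Dim:
    """Hypothetical stand-in for a dim tag: a name plus a size."""
    name: str
    size: int


class Tensor:
    """Toy tensor that only tracks its dims, not any values."""

    def __init__(self, dims: Sequence[Dim]):
        self._dims = tuple(dims)  # internal storage order

    @property
    def shape(self) -> frozenset:
        # Exposed as a set, so user code cannot depend on the order.
        return frozenset(self._dims)


def constant(value: float, shape) -> Tensor:
    """Toy constant() that insists on an ordered shape."""
    if not isinstance(shape, (list, tuple)):
        raise TypeError(f"shape must be a list or tuple, got {type(shape).__name__}")
    return Tensor(shape)


batch, time, feat = Dim("batch", 8), Dim("time", 20), Dim("feature", 64)
x = Tensor([batch, time, feat])

try:
    constant(0.0, shape=x.shape)  # the "canonical" call
    failed = False
except TypeError:
    failed = True  # rejected, because x.shape is a set
```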
- nn.zeros_like currently uses x.data.dim_tags, which is ordered. But I'm not sure this is a good solution. First, I'm not sure it is even a good idea that we expose x.data directly. We might want to be able to change this. Also, it might not be reliable or deterministic. This order is determined when the returnn-common model construction code runs (maybe on CPU), but later (maybe on GPU) it might have been different.
- An nn.as_ordered_dims(...) function or so, which introduces some ordering by some heuristics.
- nn.constant etc would accept an unordered set of dims but then internally use nn.as_ordered_dims(...) to make it ordered.
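
A rough sketch of the last two options combined. The heuristic is not specified here, so sorting by name is purely an assumption chosen for illustration; dims are plain strings, and both functions are hypothetical toy versions, not the returnn-common API.

```python
from typing import Iterable, List, Tuple


def as_ordered_dims(dims: Iterable[str]) -> List[str]:
    """Return dims in a well-defined order.

    Assumed heuristic: an already ordered input (list, tuple) is kept as-is,
    anything else (e.g. a set) is sorted by name.
    """
    if isinstance(dims, (list, tuple)):
        return list(dims)
    return sorted(dims)


def constant(value: float, shape: Iterable[str]) -> Tuple[float, List[str]]:
    """Toy constant(): accepts ordered or unordered dims, orders them internally."""
    return value, as_ordered_dims(shape)


_, dims_from_set = constant(0.0, shape={"time", "batch", "feature"})
_, dims_from_list = constant(0.0, shape=["time", "batch"])
```

Sorting by a stable key gives the same result on every run, whereas the iteration order of a Python set of strings can differ between processes because of hash randomization. It does not address the efficiency concern, though: the resulting order need not match the order of the tensor the shape was taken from.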