`ReduceLayer` on sparse inputs behaves incorrectly

rwth-i6 / returnn

The RWTH extensible training framework for universal recurrent neural networks

http://returnn.readthedocs.io/


`ReduceLayer` on sparse inputs behaves incorrectly #743

Closed: albertz closed this issue 3 years ago

albertz commented 3 years ago

E.g. if you have a sparse [B,T] input, it is virtually a one-hot [B,T,F] tensor. If you then apply e.g. reduce_max over the T axis, I think it currently reduces over the vocab ids themselves. Instead, it should behave as if the input were the dense [B,T,F] tensor, so that the result is the elementwise max over the one-hot vectors along T. The same probably applies in many other cases, e.g. DotLayer in #741.
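To make the difference concrete, here is a minimal NumPy sketch (not RETURNN code). The function names `reduce_max_on_ids` and `reduce_max_on_one_hot`, the example ids, and the vocab size of 4 are all assumptions made for illustration. It compares reducing directly over the class ids with reducing over the equivalent one-hot tensor.

```python
import numpy as np


def reduce_max_on_ids(ids):
    """Reduce over T directly on the class ids (the problematic behaviour)."""
    return ids.max(axis=1)  # [B], a class id per batch entry


def reduce_max_on_one_hot(ids, num_classes):
    """Reduce over T on the dense one-hot view of the same input."""
    one_hot = np.eye(num_classes, dtype=np.int64)[ids]  # [B,T,F]
    return one_hot.max(axis=1)  # [B,F], marks which classes occur at all


# Hypothetical sparse input with B=2, T=3 and F=4 classes.
ids = np.array([[0, 2, 2],
                [1, 3, 0]])

on_ids = reduce_max_on_ids(ids)
on_one_hot = reduce_max_on_one_hot(ids, num_classes=4)

print(on_ids)      # largest class id per sequence
print(on_one_hot)  # multi-hot vector of classes seen per sequence
```

The first result is just the largest id in each sequence, which has no meaning for categorical labels. The second is a multi-hot vector over the feature axis.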

Originally posted by @Zettelkasten in https://github.com/rwth-i6/returnn/issues/741#issuecomment-963053472

albertz commented 3 years ago

I propose simply raising an error when `ReduceLayer` is applied to sparse inputs.

albertz commented 3 years ago

Actually this is already the case:

assert not input_data.sparse

So this can be closed.
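For illustration, such a guard could look roughly like the sketch below. `FakeData` and `check_reduce_input` are hypothetical stand-ins and not RETURNN's actual classes or functions. Only the assertion on a `sparse` attribute mirrors the line quoted above.

```python
from dataclasses import dataclass


@dataclass
class FakeData:
    """Hypothetical stand-in for a tensor description with a `sparse` flag."""
    shape: tuple
    sparse: bool = False


def check_reduce_input(input_data):
    """Reject sparse inputs before any reduction is attempted."""
    # Same check as the quoted assertion, with an explanatory message added.
    assert not input_data.sparse, "reduce is not defined on sparse inputs"
    return input_data


dense_ok = check_reduce_input(FakeData(shape=(None, 4), sparse=False))

try:
    check_reduce_input(FakeData(shape=(None,), sparse=True))
    sparse_rejected = False
except AssertionError:
    sparse_rejected = True
```

Note that plain `assert` statements are skipped when Python runs with `-O`, so an explicit exception may be preferable if the check must always run.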