rwth-i6 / returnn_common

Common building blocks for RETURNN configs, such as models, training concepts, etc.

Dim description/naming better, easier, more intuitive #236

Open albertz opened 2 years ago

albertz commented 2 years ago

You often find code like:

    out_spatial_dim = nn.SpatialDim(f"{nn.NameCtx.current_ctx().get_abs_name()}:spatial")

Or:

    out_dim = nn.SpatialDim(f"{_name_str(name, 'random_state_init')}:out_dim")

    ...

    def _name_str(name: Optional[Union[str, nn.NameCtx]], default: str) -> str:
      if name is None or isinstance(name, str):
        return f'{nn.NameCtx.current_ctx().get_abs_name()}:{name or default}'
      if isinstance(name, nn.NameCtx):
        return name.get_abs_name()
      raise TypeError(f'name type {type(name)} not supported')

This pattern is used when a dim tag is created internally in the model, e.g. inside some layer. Including the absolute name of the current name context gives the tag a more unique description when multiple such layers are used, so that you can tell the different dim tags apart in debug output.

But this pattern is not used consistently. In other places, you often find code that is just:

    num_heads = nn.SpatialDim("num_heads", num_heads)

This leads to the problem that your model likely contains many such num_heads dim tags, all with the same description.
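To make the ambiguity concrete, here is a minimal self-contained sketch. ToyDim is a hypothetical stand-in written only for this illustration; it is not nn.SpatialDim and does not reproduce its actual behavior. It only assumes that two separately created tags are distinct objects while their debug representation is built from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False keeps identity-based equality: each instance is its own tag
class ToyDim:
    """Hypothetical stand-in for a dim tag (assumption, not the returnn_common API)."""
    description: str
    dimension: Optional[int] = None


# Two different attention layers each create their own "num_heads" tag.
heads_a = ToyDim("num_heads", 8)
heads_b = ToyDim("num_heads", 8)

# The tags are distinct objects and do not compare equal ...
print(heads_a is heads_b, heads_a == heads_b)

# ... but they look identical in any debug output based on the description.
print(repr(heads_a) == repr(heads_b))
```

In an error message that mentions only the description, there would be no way to tell which layer a given num_heads tag came from.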

I don't really have a good solution. The first variant, using nn.NameCtx.current_ctx().get_abs_name(), is too complicated, and the generated dim names (descriptions) are maybe too verbose and too long. The second variant can lead to confusion and ambiguity for the user.

It would be nice if the code could somehow be as simple as the second variant, while also avoiding the ambiguity problem.