Closed. JackTemaki closed this issue 2 years ago.
A `loss.mark_as_loss()` call inside a module causes a call to `self.name_ctx.make_all_sub_networks_and_optimize()`, which goes through `ctx._make_sub_network_layer(ctx_.layer_ref)` and then `nn.copy(sub_output, name=self.get_child("output"))`. This means that the "output" layer is already created there, so the assert fails when the same layer is later created again via `nn.scoped`.

It looks like this is a bug in `mark_as_loss`, or maybe in `make_all_sub_networks_and_optimize`.
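For illustration, a minimal sketch of the pattern that triggers this, assuming the returnn_common `nn` API around the time of this issue; the module, the dims, and the squared-difference loss are made up for the example:

```python
from returnn_common import nn


class InnerModule(nn.Module):
  """Made-up module which marks a loss internally."""

  def __init__(self, out_dim: nn.Dim):
    super().__init__()
    self.linear = nn.Linear(out_dim)

  @nn.scoped
  def __call__(self, x: nn.Tensor, targets: nn.Tensor) -> nn.Tensor:
    out = self.linear(x)
    loss = nn.squared_difference(out, targets)
    # This call inside the module triggers make_all_sub_networks_and_optimize,
    # which already creates the "output" layer for this subnetwork...
    loss.mark_as_loss()
    # ...so when nn.scoped later tries to create "output" again for the
    # module's return value, the assert fails.
    return out
```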
Can you make a simple test case? In any case, this is something which should always work.
Maybe this error already vanishes with #160?
Maybe, but in any case, add such a test case so that we always test this.
Do we have a test case now?
Note that #160 is merged now. While this might have resolved the particular error you were getting, I'm not sure that the behavior of `mark_as_loss` is correct. Specifically, if you have a loss only in a sublayer, I'm not sure that RETURNN will always find it. It might find it when it creates the subnetwork layer for other unrelated reasons, but of course you should not rely on this. We should maybe copy similar logic as in `mark_as_output`, so that there is always a reference (`nn.copy`) in the root. For code in a loop, we need some extra care, like accumulating the loss automatically. I'm not sure what to do inside a cond.
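Roughly, the root-reference idea could look like the following sketch. This is not the actual returnn_common implementation; the helper function and the name-context accessors used here are hypothetical:

```python
def mark_as_loss_with_root_ref(loss: nn.Tensor) -> nn.Tensor:
  """Hypothetical sketch: besides marking the loss, also keep a reference
  to it in the root name context via nn.copy, analogous to mark_as_output,
  so that RETURNN finds the loss even when the enclosing subnetwork layer
  is never created for other unrelated reasons."""
  root_ctx = nn.NameCtx.current_ctx().root  # hypothetical accessor for the root ctx
  # Name collision handling is omitted here for brevity.
  loss_copy = nn.copy(loss, name=root_ctx.get_child("loss"))
  loss_copy.mark_as_loss()
  return loss_copy
```

Inside a loop, a plain copy would not be enough; the loss would first need to be accumulated over the loop iterations, as mentioned above.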
Ok, I now pushed something which should fix the loss-in-sublayer issue.
If you now run into other errors, please open new issues for those (maybe referencing this one here).
And yes, I could define all losses outside the modules, but @Atticus1806 and I discussed this and we prefer to also have losses inside the modules; especially for TTS this makes more sense, because the number and type of losses depend on the model hierarchy.
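To illustrate that point with a made-up example (the submodules, dims, and losses here are invented, not from the actual setup): each submodule defines the losses that belong to it, so a different model hierarchy automatically yields a different set of losses.

```python
from returnn_common import nn


class DurationPredictor(nn.Module):
  """Invented TTS submodule which brings its duration loss with it."""

  def __init__(self, out_dim: nn.Dim):
    super().__init__()
    self.linear = nn.Linear(out_dim)

  @nn.scoped
  def __call__(self, x: nn.Tensor, durations: nn.Tensor) -> nn.Tensor:
    pred = self.linear(x)
    nn.squared_difference(pred, durations).mark_as_loss()
    return pred


class Decoder(nn.Module):
  """Invented TTS submodule which brings its spectrogram loss with it."""

  def __init__(self, out_dim: nn.Dim):
    super().__init__()
    self.proj = nn.Linear(out_dim)

  @nn.scoped
  def __call__(self, x: nn.Tensor, spectrogram: nn.Tensor) -> nn.Tensor:
    out = self.proj(x)
    nn.squared_difference(out, spectrogram).mark_as_loss()
    return out
```

A model that composes such submodules then gets exactly the losses of the parts it actually contains.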