Closed albertz closed 2 years ago
Still WIP but most of it looks ok already.
Most failing tests are because the naming logic changed slightly. But I think it's not optimal yet, and what the tests are checking is right, i.e. in those cases we want the old behavior. I will see how to get that.
I just realized, the deepcopy
logic is even more problematic, as it is implemented currently here. Also see #109 for discussion on that.
Now, I think nn.Parameter
will not correctly be copied. It will create a copy of nn.Tensor
and nn.Parameter
, however, inside the object, the name_ctx
will not be copied (due to new NameCtx.__deepcopy__
). This leads to unexpected und buggy behavior.
How to solve this? Again custom __copy__
and __deepcopy__
also for nn.Tensor
and nn.Parameter
? As mentioned before, nn.Tensor
is immutable, so it should not do anything. But nn.Parameter
must get a new copy.
Or change NameCtx.__deepcopy__
? But this is probably even more difficult, and we anyway need the special logic for the tensors.
Or Module.__deepcopy__
? But how? But also, it would be unexpected for the user when copying non-module objects (consisting of tensors or so) behave different. So this is also not really an option.
Fix #212. Fix #109.
And a bunch of semi related other fixes.
This will be rebased and merged, not squashed.