TTLayer.forward uses only the weight_t property, not weight. Also, weight_t is not a reference to weight; it's a copy of weight that is then transposed. Hence, after training, weight and weight_t hold completely different values.
I suspect that either weight shouldn't be a torch parameter at all, or the TensorTrain.transpose method should be changed so that it doesn't produce a copy.
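To illustrate the copy-vs-reference distinction at the core of this bug, here is a minimal sketch with plain tensors (not t3nsor's API): a transposed copy is an independent tensor, so an update to the original never reaches it, while a transposed view stays in sync.

```python
import torch

w = torch.randn(3, 2)
w_t_copy = w.t().clone()  # independent copy, analogous to the current t3.transpose behaviour
w_t_view = w.t()          # view sharing storage with w

with torch.no_grad():
    w += 1.0              # simulate a training update applied to w

print(torch.equal(w.t(), w_t_view))  # True: the view tracks w
print(torch.equal(w.t(), w_t_copy))  # False: the copy has diverged
```

If both the original and the transpose are registered as parameters, the optimizer updates them independently, which is exactly the divergence described above.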
Anyway, I'm attaching a patch that fixes the problem of ttlayer.to(device) not actually transferring the layer to that device. It's a somewhat dirty workaround, but it should be applied until a proper fix is made.
diff --git a/t3nsor/layers.py b/t3nsor/layers.py
index 65af3c3..c1346f1 100644
--- a/t3nsor/layers.py
+++ b/t3nsor/layers.py
@@ -110,8 +110,12 @@ class TTLinear(nn.Module):
         self.shape = shape
         self.weight = init.to_parameter()
-        self.parameters = self.weight.parameter
-        self.weight_t = t3.transpose(self.weight)
+        self.weight_parameter = self.weight.parameter
+        # actually weight probably shouldn't be assigned to self and
+        # shouldn't be a parameter because it doesn't participate in self.forward
+
+        self.weight_t = t3.transpose(self.weight).to_parameter()
+        self.weight_t_parameter = self.weight_t.parameter
         if bias:
             self.bias = torch.nn.Parameter(1e-2 * torch.ones(out_features))
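For reference, a cleaner long-term direction would be to register the TT cores directly on the module, so that Module.to(device), .cuda(), and the optimizer all see them without any workaround. This is a hedged sketch, not t3nsor's actual API: the class name TTLinearSketch and the cores constructor argument are illustrative.

```python
import torch
import torch.nn as nn

class TTLinearSketch(nn.Module):
    def __init__(self, cores, out_features, bias=True):
        super().__init__()
        # Wrapping each TT core in a Parameter inside an nn.ParameterList
        # registers it with the module, so .to(device) moves every core
        # and parameters() yields them for the optimizer.
        self.cores = nn.ParameterList(nn.Parameter(c) for c in cores)
        if bias:
            self.bias = nn.Parameter(1e-2 * torch.ones(out_features))
        else:
            self.register_parameter('bias', None)

# Two dummy 4-d TT cores; all three parameters (2 cores + bias) are registered.
layer = TTLinearSketch([torch.randn(1, 2, 2, 3), torch.randn(3, 2, 2, 1)],
                       out_features=4)
print(len(list(layer.parameters())))  # 3
```

With this layout there is no separate weight/weight_t pair to keep in sync: forward would contract the registered cores directly, so a single set of parameters receives the gradient updates.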