KhrulkovV / tt-pytorch


In TTLayer weight_t must be declared as parameter, but it isn't #4

Status: Closed (philip-bl closed this 5 years ago)

philip-bl commented 5 years ago

The TTLayer.forward method uses only the weight_t attribute, not weight. Moreover, weight_t is not a reference to weight; it is a transposed copy of it. Hence, after training, weight and weight_t hold completely different values.
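To make the divergence concrete, here is a minimal sketch in plain PyTorch (not t3nsor; the tensors and shapes are hypothetical): when the transposed tensor is a copy, only the copy receives gradients, and the original parameter is left behind.

    import torch
    import torch.nn as nn

    # Stand-in for the situation in TTLinear: weight_t is a transposed
    # *copy* of weight, so the two share no storage.
    weight = nn.Parameter(torch.randn(3, 4))
    weight_t = nn.Parameter(weight.detach().t().clone())

    # forward() reads only weight_t, so only weight_t receives gradients.
    opt = torch.optim.SGD([weight, weight_t], lr=0.1)
    loss = weight_t.sum()
    loss.backward()
    opt.step()

    # weight got no gradient and did not move; weight_t did.
    print(torch.equal(weight_t.detach(), weight.detach().t()))  # prints False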

I suspect that either weight shouldn't be a torch parameter at all, or the TensorTrain.transpose method should be changed so that it doesn't produce a copy. A sketch of the first option follows.
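An untested sketch of that first option, reusing the to_parameter / .parameter pattern that already appears in the patch below (init here is assumed to be the TensorTrain produced by the initializer):

    self.shape = shape
    # Transpose first, then register: only weight_t's cores become module
    # parameters; the untransposed weight is never stored on the module.
    self.weight_t = t3.transpose(init).to_parameter()
    self.weight_t_parameter = self.weight_t.parameter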

In any case, I attach a patch which fixes the related problem of ttlayer.to(device) not actually moving the layer to that device. Because weight_t's cores were never registered as parameters, Module.to(device) never moved them. It's a somewhat dirty workaround, but it should be applied until a proper fix is made.

diff --git a/t3nsor/layers.py b/t3nsor/layers.py
index 65af3c3..c1346f1 100644
--- a/t3nsor/layers.py
+++ b/t3nsor/layers.py
@@ -110,8 +110,12 @@ class TTLinear(nn.Module):

         self.shape = shape
         self.weight = init.to_parameter()
-        self.parameters = self.weight.parameter
-        self.weight_t = t3.transpose(self.weight)
+        self.weight_parameter = self.weight.parameter
+        # actually weight probably shouldn't be assigned to self and
+        # shouldn't be a parameter because it doesn't participate in self.forward
+
+        self.weight_t = t3.transpose(self.weight).to_parameter()
+        self.weight_t_parameter = self.weight_t.parameter

         if bias:
             self.bias = torch.nn.Parameter(1e-2 * torch.ones(out_features))
KhrulkovV commented 5 years ago

Thanks, fixed.