beckybai opened this issue 5 years ago
Did you solve the issue?
I'm working on implementing GradNorm on Apex, and I also had this issue: my training gets stuck when I access model.parameters().
Here is the code in detail:
print("stuck")
sys.stdout.flush()
# Getting gradients of the first layers of each tower and calculate their l2-norm
param = list(model.parameters())
print(param[0])
sys.stdout.flush() # param[0] is not shown.
G0R = torch.autograd.grad(l0, param[0], retain_graph=True, create_graph=True)
G0 = torch.norm(G0R[0], 2)
print("stuck0")
sys.stdout.flush()
G1R = torch.autograd.grad(l1, param[0], retain_graph=True, create_graph=True)
G1 = torch.norm(G1R[0], 2)
print("stuck1")
sys.stdout.flush()
G2R = torch.autograd.grad(l2, param[0], retain_graph=True, create_graph=True)
G2 = torch.norm(G2R[0], 2)
print("stuck2")
sys.stdout.flush()
@noirmist Hey, did you solve the problem?
All, I solved the problem; see the linked issue #457.
This code leads to an AttributeError. The main reason is that there is no explicit forward pass in this code, so the parameters are never collected.
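For reference, here is a minimal sketch of the corrected ordering, assuming a toy multi-task setup (the nn.Linear model, the inputs, and the three loss definitions below are illustrative stand-ins, not from the original report): the forward pass must run before the per-task gradients are taken.

import torch
import torch.nn as nn

# Hypothetical stand-in for the multi-task model and data.
model = nn.Linear(10, 3)
x = torch.randn(4, 10)

# 1. Run the forward pass first so the autograd graph exists.
out = model(x)
l0, l1, l2 = out[:, 0].mean(), out[:, 1].mean(), out[:, 2].mean()

# 2. Only then differentiate each task loss w.r.t. the shared layer.
param = list(model.parameters())
g_norms = []
for loss in (l0, l1, l2):
    g = torch.autograd.grad(loss, param[0], retain_graph=True, create_graph=True)
    g_norms.append(torch.norm(g[0], 2))  # the GradNorm G_i terms
print(g_norms)

With create_graph=True the resulting norms stay differentiable, which GradNorm needs in order to backpropagate through the gradient norms when updating the task weights.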