Open caydenwei opened 10 months ago
I think it should be:
self.optim.zero_grad()
loss.backward()
with torch.no_grad():
    self.optim.step()
Apologies for not being clear earlier. I'm uncertain about the correctness of a specific part of the code found at https://github.com/d2l-ai/d2l-en/blob/master/d2l/torch.py. Here is the original code:
def fit_epoch(self):
    """Defined in :numref:`sec_linear_scratch`"""
    self.model.train()
    for batch in self.train_dataloader:
        loss = self.model.training_step(self.prepare_batch(batch))
        self.optim.zero_grad()
        with torch.no_grad():
            loss.backward()
            if self.gradient_clip_val > 0:  # To be discussed later
                self.clip_gradients(self.gradient_clip_val, self.model)
            self.optim.step()
        self.train_batch_idx += 1
    if self.val_dataloader is None:
        return
    self.model.eval()
    for batch in self.val_dataloader:
        with torch.no_grad():
            self.model.validation_step(self.prepare_batch(batch))
        self.val_batch_idx += 1
Same question. I'm confused about why the code still works in the examples (LeNet, etc.).
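A minimal standalone check (not part of the d2l code, just an illustration under standard PyTorch autograd semantics) suggests why training still works: torch.no_grad() only stops new operations from being recorded into the graph, while backward() through a graph that was already built outside the block still computes and stores gradients.

import torch

x = torch.randn(4, 3)
w = torch.randn(3, 1, requires_grad=True)
loss = (x @ w).sum()  # the graph is built here, outside any no_grad block

with torch.no_grad():
    loss.backward()  # still populates w.grad: no_grad only affects new
                     # forward operations, not this backward pass

print(w.grad is not None)  # True: gradients were computed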
Hello,
I noticed a potential issue in the fit_epoch method in https://github.com/d2l-ai/d2l-en/blob/master/d2l/torch.py, where loss.backward() is called within a torch.no_grad() block (the original code is quoted in full above). This usage likely prevents the calculation of gradients, as loss.backward() should not be inside a torch.no_grad() block. The correct approach would be:
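self.optim.zero_grad()
loss.backward()
with torch.no_grad():
    self.optim.step()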