jelleopard opened this issue 2 months ago
Please manually check whether the dataset contains NaN values, and lowering the learning rate might also help.
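For reference, a minimal sketch of both suggestions, assuming an Adam optimizer (the optimizer actually used in train.py may differ); `torch.autograd.set_detect_anomaly(True)` additionally makes autograd report the first backward operation that produces NaN:

```python
import torch
import torch.nn as nn

# Debugging aid: autograd reports the first backward op that produces NaN.
# This slows training noticeably, so enable it only while hunting the NaN.
torch.autograd.set_detect_anomaly(True)

model = nn.Linear(4, 1)  # placeholder for the actual network
# Lowering the learning rate (e.g. 1e-3 -> 1e-6) as suggested above.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-6)
```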
Thanks for your reply. Following your suggestion to lower the learning rate (1e-3 --> 1e-6) had some effect, but as the number of iterations increases the loss becomes NaN again. I found that NaN appears in the input, which then causes NaN in the output. Could something be wrong with the generated depth maps? My training loop is shown below:
```python
import numpy as np
import torch

def train(model, device, dataloader, optimizer, epoch, writer):
    model.train()
    losses = []
    criterion = SMAPELoss().to(device)  # SMAPELoss comes from this repository
    for (inputs, targets) in dataloader:
        optimizer.zero_grad(set_to_none=True)
        inputs = inputs.to(device, non_blocking=True)
        targets = targets.to(device, non_blocking=True)
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        if np.isnan(loss.item()):
            # .any() is True if the batch contains at least one NaN
            print(torch.isnan(inputs).any())
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    writer.add_scalar("Loss/total_train", np.mean(losses), epoch)
    print("Loss: %f" % np.mean(losses))
```
Thanks to the author for open-sourcing the code. I ran into the following problems while trying to reproduce the results; please help, thank you~
The environment used is:
I use the following code to generate the corresponding depth maps:
I generate acc_colors using data_preprocess.py:
The data format is as follows:
The dataset.py file is modified as follows:
Running train.py results in the following:
What could be the problem? I need your help, thanks.