Train Error - Githubissues

ztx313 commented 2 years ago

Hi! I want to train a new model, but there is an error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1200]] is at version 3; expected version 2 instead.

def update_policy_immediately(adjust_loss, optimizer, te_losses=0):
    optimizer.zero_grad()
    adjust_loss.backward(retain_graph=True)  # <-- the error line
    optimizer.step()
    return adjust_loss.item()  # , te_losses

I printed adjust_loss; the output is the following:
tensor(5.4996e-06, device='cuda:0', grad_fn=)
tensor(0., device='cuda:0', grad_fn=)
tensor(4.5414, device='cuda:0', grad_fn=)
tensor(0.9992, device='cuda:0', grad_fn=)
[W ..\torch\csrc\autograd\python_anomaly_mode.cpp:104] Warning: Error detected in ThnnFusedLstmCellBackward.

How can I solve it? Thank you.

tjh1997 commented 2 years ago

Hi, I found the same error! Could you share your solution to the problem? Thanks.

ztx313 commented 2 years ago

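Hello, I assume you are a fellow Chinese speaker, so I'll reply in Chinese. As far as I remember, I downloaded the files provided by the author and then solved it by changing the corresponding path names when running the python command. Or maybe it was an environment issue? It was a while ago, so I don't remember clearly. As for the SPARQL_test.py test problem, I still haven't solved it; I'm currently looking at work from other papers.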

tjh1997 commented 2 years ago

Thanks for your reply! After I downgraded PyTorch to v1.4.0, the error disappeared. As for the timeout problem in SPARQL_test.py, you can solve it by using a VPN.
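For anyone who hits the same message on a newer PyTorch and would rather not downgrade: the thread does not confirm a root cause, but one pattern known to produce this error is carrying a recurrent state across optimizer updates while calling backward(retain_graph=True). Since PyTorch 1.5, optimizer.step() counts as an in-place change to the parameters, which would be consistent with the error disappearing on v1.4.0. The sketch below is a self-contained toy on CPU, not code from ConversationalKBQA; run_updates, the layer sizes, and the detach-based workaround are assumptions made for illustration only.

```python
import torch


def run_updates(detach_state, steps=5):
    """Run a few toy updates; return the error message if backward fails, else None."""
    torch.manual_seed(0)
    lstm = torch.nn.LSTM(input_size=4, hidden_size=8)
    head = torch.nn.Linear(8, 1)
    params = list(lstm.parameters()) + list(head.parameters())
    optimizer = torch.optim.SGD(params, lr=0.1)

    x = torch.randn(3, 1, 4)  # (seq_len, batch, features)
    state = None              # recurrent state carried across updates
    try:
        for _ in range(steps):
            out, state = lstm(x, state)
            loss = head(out).pow(2).mean()
            optimizer.zero_grad()
            # Without detaching, the old graph must be retained because the
            # carried state still points into it.
            loss.backward(retain_graph=not detach_state)
            optimizer.step()  # modifies the parameters in place
            if detach_state:
                # Cut the link to the previous graph so that later backward
                # passes never revisit weights saved before this step.
                state = tuple(s.detach() for s in state)
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print("retain_graph, state kept:", run_updates(detach_state=False))
    print("state detached:          ", run_updates(detach_state=True))
```

Detaching the state truncates backpropagation at each update, so it changes which gradients are computed; whether that is acceptable for this repository's training loop is something to check against the paper rather than assume.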

lanyunshi / ConversationalKBQA

Train Error #6