diviyank / SAM

Code for the Structural Agnostic Model (https://arxiv.org/abs/1803.04929)
Apache License 2.0
53 stars 10 forks

problem about pytorch inplace and can not training #5

Open zhougoodman opened 2 years ago

zhougoodman commented 2 years ago

There seems to be an issue in the forward function in sam.py:

    if self.linear:
        output = self.input_layer(data, noise, adj_matrix * self.skeleton)

My guess is that this changes the graph, so autograd's backward pass cannot run, but that is only a surmise.
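A note on the surmise above: `adj_matrix * self.skeleton` is an out-of-place multiplication, which returns a new tensor and normally does not touch the version counter that autograd checks. The sketch below only illustrates that difference on toy tensors. The shapes and the names `buffer`, `masked`, and the `version_*` variables are made up for illustration, and `_version` is an internal attribute used here purely for inspection. It does not show where the error in SAM actually comes from.

```python
import torch

# Toy stand-ins for the tensors in the snippet above (shapes are assumptions).
adj_matrix = torch.rand(3, 3, requires_grad=True)
skeleton = torch.ones(3, 3)

# Out-of-place multiply: a new tensor is created, adj_matrix is left untouched.
version_before = adj_matrix._version
masked = adj_matrix * skeleton
version_after_mul = adj_matrix._version

# In-place multiply on a separate buffer: its version counter is bumped.
buffer = torch.zeros(3)
buffer_before = buffer._version
buffer.mul_(2.0)
buffer_after = buffer._version

print("adj_matrix version unchanged:", version_before == version_after_mul)
print("buffer version increase:", buffer_after - buffer_before)
```

Autograd raises the "modified by an inplace operation" error when a tensor it saved for the backward pass has a different version at backward time than it had when it was saved.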

The PyTorch version may be the problem. Could you tell me which PyTorch version you used? (I could not find it in the code.) Or something may be wrong elsewhere; could you take a look? I would be grateful, thanks!

    0%| | 0/11000 [00:00<?, ?it/s, disc=0.43, gen=-.373, regul_loss=0.719, tot=-2.64]
    Traceback (most recent call last):
      File "D:\PyCharm 2021.3.2\plugins\python\helpers\pydev\pydevd.py", line 1483, in _exec
        pydev_imports.execfile(file, globals, locals)  # execute the script
      File "D:\PyCharm 2021.3.2\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "E:/study/pycharm_project/dzx_policy/SAM-master/est_sam.py", line 19, in <module>
        m.predict(data, nruns=1, )
      File "E:\study\pycharm_project\dzx_policy\SAM-master\sam\sam.py", line 352, in predict
        device='cuda:0' if gpus else 'cpu')
      File "E:\study\pycharm_project\dzx_policy\SAM-master\sam\sam.py", line 232, in run_SAM
        loss.backward(retain_graph=True)
      File "E:\study\pycharm_project\mental_bert\bert\lib\site-packages\torch\tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "E:\study\pycharm_project\mental_bert\bert\lib\site-packages\torch\autograd\__init__.py", line 147, in backward
        allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [200, 1]], which is output 0 of TBackward, is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
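One common way to get this exact message, which the thread does not confirm as the cause in SAM, is calling `optimizer.step()` between two `backward()` calls that share a retained graph. The step updates the weights in place, and newer PyTorch releases (reportedly from 1.5 onward) detect that the saved weights are stale. The sketch below reproduces the error on a toy `nn.Linear` and then shows one ordering that avoids it. The helper `make_toy_setup`, the layer sizes, and the two losses are hypothetical and are not taken from sam.py.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_toy_setup():
    """Hypothetical toy model, not the SAM network."""
    layer = nn.Linear(4, 1)
    opt = torch.optim.SGD(layer.parameters(), lr=0.1)
    # The input requires grad so that backward needs the layer's weight.
    x = torch.randn(8, 4, requires_grad=True)
    return layer, opt, x


# Failing order: backward, step, then backward again through the same graph.
layer, opt, x = make_toy_setup()
out = layer(x)
first_loss = out.mean()
second_loss = (out ** 2).mean()
first_loss.backward(retain_graph=True)
opt.step()  # updates layer.weight in place, which changes its version
try:
    second_loss.backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

# Working order: finish every backward pass on this graph, then step.
layer, opt, x = make_toy_setup()
out = layer(x)
first_loss = out.mean()
second_loss = (out ** 2).mean()
opt.zero_grad()
first_loss.backward(retain_graph=True)
second_loss.backward()
opt.step()

print(error_message)
```

If the two losses must be applied with separate optimizer steps, the other usual option is to rerun the forward pass after the first step so that the second `backward()` uses a fresh graph. Whether either change is right for SAM's training loop would need checking against sam.py.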

Christopher7622 commented 10 months ago

Did you solve this problem? I ran into it too.

zhougoodman commented 10 months ago

No, I didn't solve it. I used causalnex instead.

Christopher7622 commented 10 months ago

Do you mean the causalnex module implements the same algorithm as SAMv1/SAM?

zhougoodman commented 10 months ago

No, I think they are different algorithms. My goal at the time was simply to find a causal inference algorithm that worked.

Christopher7622 commented 10 months ago

Thanks so much, bro.

zfnaixuexi commented 8 months ago

I have hit the same problem. Can you give me some advice?

Christopher7622 commented 8 months ago

Using Docker may solve your problem.

zfnaixuexi commented 8 months ago

Thanks for your advice. However, it did not solve my problem.
