chenyuqi990215 / RNTrajRec

Road Network Enhanced Trajectory Recovery with Spatial-Temporal Transformer (ICDE'23)

MIT License

31 stars, 9 forks

Decoder output == nan #14

Open rui-love opened 8 months ago

rui-love commented 8 months ago

When I train the model on my own dataset, the decoder outputs are all nan. The only things I changed are the data and the road network. I can get results from the 100 example trajectories.

chenyuqi990215 commented 8 months ago

Can you provide more details so that I can help you? Do the parameters contain nan after gradient descent? Do the inputs contain nan? Or do the log operations cause nan?

rui-love commented 8 months ago

Thank you very much. The parameters don't contain any nan values. The inputs are all 0. I also tried torch.autograd.set_detect_anomaly(True). Maybe the log operations cause the nan? How can I deal with it?

rui-love commented 8 months ago

Sorry for the wait. I have replied with the details of my experiment on the GitHub issue.

rui-love commented 8 months ago

Hello! Do you need more information to deal with this problem?

chenyuqi990215 commented 8 months ago

Maybe you can set 'search_dist' and 'neighbor_dist' in multi_main.py to values large enough to exceed the maximum GPS error of your dataset.
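One possible reading of this suggestion, sketched below in plain Python: if 'search_dist' acts as a radius for selecting candidate road segments around a GPS point, then a point whose error exceeds that radius ends up with no candidates at all, and any mask built from the candidates is all zeros. The function `candidate_mask` and the distances are hypothetical and are not taken from the repository; how multi_main.py actually uses these parameters is an assumption here.

```python
def candidate_mask(distances_m, search_dist):
    """Return 1 for road segments within search_dist metres of the GPS point, else 0.

    Hypothetical helper for illustration only; not the repository's code.
    """
    return [1 if d <= search_dist else 0 for d in distances_m]


# Made-up distances (metres) from one noisy GPS point to three road segments.
distances = [35.0, 80.0, 120.0]

# A radius larger than the GPS error keeps at least one candidate.
wide = candidate_mask(distances, search_dist=50.0)

# A radius smaller than the GPS error leaves the mask empty, so a later
# masked softmax would have nothing to normalize over.
narrow = candidate_mask(distances, search_dist=30.0)
```

Under this assumption, checking how many points in a dataset produce an all-zero mask is a quick way to see whether the radius is too small.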

rui-love commented 8 months ago

I still see the same situation. Are there any other parameters I should take care of?

maxwang967 commented 8 months ago

It happens for me too when I use an alternative dataset, while the same dataset works fine with other baseline models.

maxwang967 commented 8 months ago

> I still see the same situation. Are there any other parameters I should take care of?

The issue can be solved by adding a small constant (e.g., 1e-6) to 'x_exp_sum' in both the mask_log_softmax function and the mask_graph_log_softmax function.

@chenyuqi990215 Please consider modifying the source code to make it more robust. Thank you!
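A minimal framework-free sketch of the fix described above, for readers who want to see why the constant helps. It is not the repository's mask_log_softmax; the function name `masked_log_softmax`, the `eps` and `floor` parameters, and the sample scores are assumptions made for illustration. The `floor` clamp in particular is an extra safeguard added in this sketch, not something stated in the thread. In PyTorch, dividing 0 by 0 yields nan; plain Python raises ZeroDivisionError instead, which the sketch uses to show the failing case.

```python
import math


def masked_log_softmax(scores, mask, eps=1e-6, floor=1e-6):
    """Log-softmax restricted to entries where mask is 1 (illustrative sketch).

    eps is added to the normalizer (called x_exp_sum in the thread) so that
    an all-zero mask no longer divides 0 by 0.
    floor is an assumption of this sketch: probabilities are clamped before
    the log so that masked entries give log(floor) rather than log(0).
    """
    top = max(scores)  # subtract the max for numerical stability
    x_exp = [math.exp(s - top) * m for s, m in zip(scores, mask)]
    x_exp_sum = sum(x_exp) + eps
    return [math.log(max(e / x_exp_sum, floor)) for e in x_exp]


scores = [1.0, 2.0, 3.0]

# Normal case: two valid candidates, one masked out.
partial = masked_log_softmax(scores, [1, 1, 0])

# Degenerate case: every candidate is masked out. With eps > 0 the result
# is finite everywhere.
empty = masked_log_softmax(scores, [0, 0, 0])

# Without eps the normalizer is exactly zero. PyTorch would return nan here;
# plain Python raises ZeroDivisionError.
try:
    masked_log_softmax(scores, [0, 0, 0], eps=0.0)
    failed_without_eps = False
except ZeroDivisionError:
    failed_without_eps = True
```

A fully masked row usually signals a data problem (for example, no candidate road segment for a point), so the constant hides the symptom; it may still be worth counting how often such rows occur.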

chenyuqi990215 commented 8 months ago

Thank you for your advice.