isds-neu / PhyCRNet

Physics-informed convolutional-recurrent neural networks for solving spatiotemporal PDEs
MIT License

There are a couple of questions about accuracy #14

Open ZincJT opened 10 months ago

ZincJT commented 10 months ago
Hello author, I have read your paper carefully and reproduced the code, following the steps you described in the other issue.

First, I set time_batch_size=100, time_steps=101, and n_iters_adam=5000 and trained the network to obtain checkpoint100.pt. Testing at step 100, I got errors of u=0.5% and v=0.8%. But when I went on to train checkpoint200.pt, the errors were u=7% and v=3.5%. Is there something wrong on my side that leads to this low accuracy? I suspect that with time_batch_size=1000 the accuracy will be even worse and will not reach the level reported in the paper.

I feel there may be a problem with how I run the code. For the time_batch_size=100 run I deleted the line

model, optimizer, scheduler = load_checkpoint(model, optimizer, scheduler, pre_model_save_path)

and for the time_batch_size=200 run I added it back, loading checkpoint100.pt to continue training and produce checkpoint200.pt.

Can you help me check whether these training steps are correct? I also set the learning rate to 1e-3; does the learning rate affect the result as well? Thank you very much for helping me solve these problems!
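For reference, the staged restart described above looks roughly like the sketch below in PyTorch. The load_checkpoint body is an assumption about how such a helper typically looks (the repository's implementation may differ); the hyperparameter values are the ones quoted in this thread.

```python
import torch

def load_checkpoint(model, optimizer, scheduler, save_path):
    # Restore the states saved at the end of the previous training stage.
    checkpoint = torch.load(save_path)
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
    return model, optimizer, scheduler

# Stage 1: train from scratch on the first 100 steps -> checkpoint100.pt
#   time_batch_size, time_steps, n_iters_adam = 100, 101, 5000
#   (no load_checkpoint call in this stage)
#
# Stage 2: extend the horizon, warm-starting from checkpoint100.pt -> checkpoint200.pt
#   time_batch_size, time_steps = 200, 201
#   model, optimizer, scheduler = load_checkpoint(
#       model, optimizer, scheduler, pre_model_save_path)  # pre_model_save_path = './checkpoint100.pt'
```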
paulpuren commented 10 months ago

Hi, I think the learning rate may be the issue for your test. Could you decrease it to 9e-4 or something like that?
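Concretely, that would just mean constructing the optimizer with the smaller learning rate. A minimal sketch; the placeholder model and the scheduler settings below are illustrative, not the repository's exact values:

```python
import torch

model = torch.nn.Conv2d(2, 2, kernel_size=3, padding=1)  # placeholder for the PhyCRNet model

# Lower the Adam learning rate from 1e-3 to 9e-4, as suggested above.
optimizer = torch.optim.Adam(model.parameters(), lr=9e-4)

# Placeholder decay schedule; adjust step_size/gamma to the training script's values.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.97)
```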

ZincJT commented 10 months ago

Hello author, I would like to ask why the initialization parameters for the Burgers' equation case are set the way they are in the initialize_weights function. Can you tell me how you arrived at them? I found that replacing them with other initialization settings gives poor results. Thank you very much for your help!

YHzhang2021 commented 9 months ago

Hello, author. I read your article carefully and then tried to reproduce your code. When I train on the 100 time-step case, the training error stays around 1e-1, while the results in your article easily reach 1e-5~1e-3. Can you give me some advice? Thank you very much for helping me solve these problems! (Attached: training-loss plot, x=32, y=32.)

ZincJT commented 7 months ago
> Hello author, I have read your paper carefully and reproduced the code. [...] I also set the learning rate to 1e-3; does the learning rate affect the result as well?

Hi, I tried to run the model but it failed. It reports that the 'c' argument has 16641 elements, which is inconsistent with 'x' and 'y' of size 16384 (at line 506). How did you change this?

Hello, the post_process function is only used to draw the figures; if I delete the plotting code, the model runs. But I still cannot reproduce the results in the paper. I look forward to you being able to run it and sharing the steps.
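For reference, 16641 = 129 × 129 and 16384 = 128 × 128, which suggests the usual matplotlib scatter size mismatch between a field that keeps the periodic endpoint and a coordinate grid that does not. A minimal, self-contained sketch of the mismatch and one possible fix, assuming that is what happens inside post_process; the variable names are placeholders:

```python
import numpy as np
import matplotlib.pyplot as plt

# Coordinate grid without the periodic endpoint: 128 x 128 = 16384 points.
x = np.linspace(0, 1, 128)
y = np.linspace(0, 1, 128)
xx, yy = np.meshgrid(x, y)

# Field that still contains the duplicated periodic endpoint: 129 x 129 = 16641 values.
u = np.random.rand(129, 129)

# plt.scatter(xx.ravel(), yy.ravel(), c=u.ravel()) would raise a size-mismatch error:
# c has 16641 elements while x and y have 16384.

# One fix: drop the duplicated last row/column of the field so the sizes agree.
plt.scatter(xx.ravel(), yy.ravel(), c=u[:-1, :-1].ravel(), s=2)
plt.colorbar()
plt.savefig('u_field.png')
```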

paulpuren commented 7 months ago

> Hello author, I would like to ask why the initialization parameters for the Burgers' equation case are set the way they are in the initialize_weights function. Can you tell me how you arrived at them? I found that replacing them with other initialization settings gives poor results.

Hi, we think it is helpful to initialize the network parameters close to zero, since training uses only the physics loss. If there were also a data loss, other initialization methods might work as well. This is an empirical finding.
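For illustration, here is a minimal sketch of near-zero initialization for convolutional layers in PyTorch. The scale of 0.02 and the layer selection are assumptions made for this example, not necessarily the exact settings used in the repository's initialize_weights:

```python
import torch.nn as nn

def initialize_weights_near_zero(module, scale=0.02):
    # Draw conv weights from a small-variance Gaussian and zero the biases,
    # so the initial network output (and hence the physics residual) starts
    # close to zero. The scale value here is illustrative.
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(module.weight, mean=0.0, std=scale)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Usage: model.apply(initialize_weights_near_zero)
```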

paulpuren commented 7 months ago

> Hello, author. I read your article carefully and then tried to reproduce your code. When I train on the 100 time-step case, the training error stays around 1e-1, while the results in your article easily reach 1e-5~1e-3. Can you give me some advice?

The training loss is still decreasing rapidly. Can you train your model for 10k epochs?
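In the training script this amounts to raising the Adam iteration count; a one-line sketch using the hyperparameter name mentioned earlier in this thread:

```python
# Sketch of the suggested change: train longer with Adam before evaluating.
# Everything else in the training script stays the same.
n_iters_adam = 10000  # previously 5000
```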

paulpuren commented 7 months ago
> Hello author, I have read your paper carefully and reproduced the code. [...]
>
> Hi, I tried to run the model but it failed. It reports that the 'c' argument has 16641 elements, which is inconsistent with 'x' and 'y' of size 16384 (at line 506). How did you change this?
>
> Hello, the post_process function is only used to draw the figures; if I delete the plotting code, the model runs. But I still cannot reproduce the results in the paper. I look forward to you being able to run it and sharing the steps.

Sorry for the late reply. Can you make the code work on your side now? If not, can you send more details to my email: ren.pu@northeastern.edu? Thanks.

ZincJT commented 7 months ago
> Hello author, I have read your paper carefully and reproduced the code. [...] I also set the learning rate to 1e-3; does the learning rate affect the result as well?

> Hi ZincJT, for the 2D Burgers' equations, is the ground-truth reference solution computed by 'Burgers_2dsolver[HighOrder].py' in the Datasets folder consistent with the one in the paper?

It should be consistent, and it should look like the figure in the paper.