cagrikymk / JAX-ReaxFF

JAX-ReaxFF: A Gradient Based Framework for Extremely Fast Optimization of Reactive Force Fields

How to judge whether the training has converged #19

Open bingsimon opened 3 months ago

bingsimon commented 3 months ago

Hello, I have recently been learning to use jax-reaxff. I am not sure how to tell whether the force field training has converged. I have the following three questions and hope you can answer them:

  1. The SI of your article has convergence plots for three examples from Datasets. Does the ordinate in those figures refer to the 'True total loss' value reported for each iteration in the output file?
  2. I used the ffield_lit, geo, params, and trainset.in files from Datasets/disulfide to run the program and got the convergence behavior shown in the attached plot. It does not seem to converge, but I do not know what the problem is. Here are the parameters I used:

     ```
     jaxreaxff --init_FF ffield_lit \
               --params params \
               --geo geo \
               --train_file trainset.in \
               --num_e_minim_steps 200 \
               --e_minim_LR 1e-3 \
               --out_folder ffields \
               --save_opt all \
               --num_trials 10 \
               --num_steps 20 \
               --init_FF_type fixed
     ```
  3. Is there a clear criterion for judging whether the training results have converged?

I am looking forward to your answer and thank you for your time!

cagrikymk commented 3 months ago

Hello, there is nothing wrong with the way you are running the code, and yes, the figures in the SI report the true total loss value.

I recently extensively rewrote the code, so the results may differ from those in the paper. However, you should still be able to achieve similar fitness scores overall.

Assuming you have the most up-to-date code from the master branch, I believe the issue you are encountering is related to adding noise to the parameters when stuck. I have incorporated logic for such scenarios wherein the optimizer becomes trapped in a local minimum and fails to make progress. In such cases, I introduce slight noise to the parameters and continue the optimization process. This might result in fluctuations in the current true loss value, but since I keep track of the best loss encountered during optimization, it should not negatively affect the overall performance of the optimizer (the final true loss value).
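To make the idea concrete, here is a minimal JAX sketch of that logic. This is a simplified illustration, not the actual JAX-ReaxFF driver code; the quadratic loss and the `noise_scale`/`patience` values are placeholders:

```python
# Minimal sketch (not the JAX-ReaxFF implementation): perturb the parameters
# with small Gaussian noise when the optimizer stalls, while tracking the
# best loss/parameters encountered so far.
import jax
import jax.numpy as jnp

def loss_fn(params, data):
    # stand-in for the true ReaxFF training loss
    return jnp.sum((params - data) ** 2)

grad_fn = jax.jit(jax.grad(loss_fn))

def optimize(params, data, key, lr=1e-3, noise_scale=0.04,
             patience=20, num_steps=1000):
    best_params, best_loss = params, jnp.inf
    stall_count = 0
    for step in range(num_steps):
        params = params - lr * grad_fn(params, data)
        loss = loss_fn(params, data)
        if loss < best_loss:
            best_params, best_loss = params, loss
            stall_count = 0
        else:
            stall_count += 1
        # stuck in a local minimum: add slight noise and keep optimizing
        if stall_count >= patience:
            key, subkey = jax.random.split(key)
            params = params + noise_scale * jax.random.normal(subkey, params.shape)
            stall_count = 0
    # the reported result is the best loss encountered, so noise-induced
    # fluctuations in the current loss do not hurt the final outcome
    return best_params, best_loss
```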

If you want to reduce the amount of noise added, you can modify the following line in your local copy: driver.py, line 153.

I hand-tuned and adjusted some parameters to simplify overall usage. In the original paper, I set that value to "0.01," whereas now it's "0.04," indicating a more aggressive noise approach. I might make this parameter modifiable through an argument to the driver.
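As a rough illustration of what that could look like (purely hypothetical; `--perturb_scale` is not an existing jaxreaxff flag), the noise magnitude could be read from a command-line argument instead of being hard-coded:

```python
# Hypothetical sketch only: exposing the noise magnitude as a driver argument.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--perturb_scale", type=float, default=0.04,
                    help="std. dev. of the Gaussian noise added to the parameters "
                         "when the optimizer stalls (0.01 was used in the paper)")
args = parser.parse_args()
noise_scale = args.perturb_scale
```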

I hope this answers your question; let me know if anything is unclear.

bingsimon commented 3 months ago

Thank you very much for your answer.