lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1
2.71k stars 752 forks

L-BFGS optimizer stops at exactly 30 iterations, no matter the values of its parameters. #1819

Open enm72 opened 2 months ago

enm72 commented 2 months ago

Hello Dr. Lu, I greatly appreciate your work on the development of the DeepXDE library. I have been using it for research for some time, and I recently found that the L-BFGS optimizer stops at exactly 30 iterations. It had been working without issue until a few weeks ago, when I first observed this "early stopping" behaviour in code for which L-BFGS had previously run for as many iterations as I wanted.

My environment is Google Colab. To make sure I did not break something in my own code, I tried some of the demo code from the DeepXDE documentation, as well as code from your work on sampling strategies. The result is always the same: Adam runs for as many iterations as I want, but once training advances to L-BFGS, it stops at exactly 30 iterations. I tried tweaking the gtol and ftol parameters, but no luck. The arithmetic precision is set to float64, as always.

In short, without changing anything in the conditions of the code I am experimenting with, the L-BFGS optimizer now stops at 30 iterations. Any advice on this would be very much appreciated. Thank you.
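For reference, ftol and gtol are the standard L-BFGS stopping tolerances: when the relative change in the loss (or the gradient norm) falls below the tolerance, the optimizer terminates regardless of the iteration budget. A minimal pure-Python sketch (not DeepXDE code; the toy optimizer and names are my own) of how an ftol-style criterion freezes the iteration count once the loss plateaus:

```python
def descend(loss, grad, x, lr=0.1, ftol=1e-9, maxiter=1000):
    """Toy 1-D gradient descent with an ftol-style convergence test."""
    prev = loss(x)
    for it in range(1, maxiter + 1):
        x = x - lr * grad(x)
        cur = loss(x)
        # ftol criterion: stop when the relative loss change is tiny,
        # no matter how many iterations remain in the budget.
        if abs(prev - cur) <= ftol * max(abs(prev), abs(cur), 1.0):
            return x, it
        prev = cur
    return x, maxiter

loss = lambda x: (x - 3.0) ** 2
grad = lambda x: 2.0 * (x - 3.0)

x_loose, it_loose = descend(loss, grad, 0.0, ftol=1e-3)
x_tight, it_tight = descend(loss, grad, 0.0, ftol=1e-12)
# With the looser ftol the run ends after far fewer iterations,
# even though maxiter is the same in both cases.
```

So a fixed stop at exactly 30 iterations would be consistent with the loss no longer changing at all after that point, which is why I experimented with these tolerances first.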

rispolivc commented 2 months ago

I am facing the same issue. All my old codes converged better when a round of Adam iterations was followed by L-BFGS. Now it no longer works, and I can't find what broke. I am using the TensorFlow backend.

rispolivc commented 2 months ago

Well, I found out that L-BFGS isn't working properly with the TensorFlow backend in Google Colab (both tensorflow.compat.v1 and tensorflow v2), but it is working with pytorch and paddle, at least for me. If you are using pytorch, include this in your code:

```python
import os
os.environ['DDE_BACKEND'] = 'pytorch'  # must be set before importing deepxde

import deepxde as dde
dde.config.set_default_float('float64')

import numpy as np
import torch
```

enm72 commented 2 months ago

Hello rispolivc, I truly appreciate your response. When you replied, I was encouraged that a solution to this issue had finally been found, so I re-engaged L-BFGS training in my code. It was a comfort that someone else facing the same issue responded, though I had actually expected this issue to be encountered on a much larger scale. Indeed, I am also using tensorflow.compat.v1 and tensorflow v2.

After switching to pytorch, demo code from the DeepXDE documentation that engages the L-BFGS algorithm did work. However, it did not work for all of my production code. Even after reverting to pytorch, I still see the same behaviour when training advances to the L-BFGS iterations: after 27-30 iterations the loss terms remain unchanged, training stops, and execution moves on to producing the plots of the solutions of the problem I am trying to solve. It is really odd. It looks like L-BFGS is not improving the loss at all, which is totally puzzling. I tried changing several of the relevant L-BFGS parameters, but still no luck.

bakhtiyar-k commented 2 months ago

@lululxvi I can also report that L-BFGS stopped working with the TensorFlow backend.

lululxvi commented 1 month ago

@enm72 @rispolivc @bakhtiyar-k Thank you for reporting the issue. I believe pytorch and paddle should work. I suspect it is due to the TensorFlow version. Could you let me know which DeepXDE and TensorFlow versions you are using?

bakhtiyar-k commented 1 month ago

DeepXDE: 1.12.1
TensorFlow: 2.15.0

enm72 commented 1 month ago

Dr. Lu, the same goes for me too: DeepXDE v1.12.1 and TensorFlow v2.15.0.

davidgomes343 commented 6 days ago

I had the same problem. Switching to pytorch solved it.