lululxvi / deepxde

A library for scientific machine learning and physics-informed learning
https://deepxde.readthedocs.io
GNU Lesser General Public License v2.1

For the L-BFGS optimizer, parameter "maxiter" should not be used to define the number of loops on data #1691

Open rapham opened 6 months ago

rapham commented 6 months ago

Discussed in https://github.com/lululxvi/deepxde/discussions/1665

Originally posted by **rapham** March 4, 2024

Hello everyone,

I am using the L-BFGS optimizer to solve a forward PDE problem. I am not very familiar with this algorithm, but I wondered what the `maxiter` parameter really represents. Indeed, I found some differences between its meaning in the deepxde source code and in the backend documentations:

- [deepxde source code](https://github.com/lululxvi/deepxde/blob/c28e2c4edc798e97f5e3a477345a05fa7b642f17/deepxde/model.py#L757): it controls the number of times the method `optim.step()` will be called:

```
def _train_pytorch_lbfgs(self):
    prev_n_iter = 0
    while prev_n_iter < optimizers.LBFGS_options["maxiter"]:
        self.callbacks.on_epoch_begin()
        self.callbacks.on_batch_begin()

        self.train_state.set_data_train(
            *self.data.train_next_batch(self.batch_size)
        )
        self._train_step(
            self.train_state.X_train,
            self.train_state.y_train,
            self.train_state.train_aux_vars,
        )
        n_iter = self.opt.state_dict()["state"][0]["n_iter"]
        if prev_n_iter == n_iter:
            # Converged
            break

        self.train_state.epoch += n_iter - prev_n_iter
        self.train_state.step += n_iter - prev_n_iter
        prev_n_iter = n_iter
        self._test()

        self.callbacks.on_batch_end()
        self.callbacks.on_epoch_end()

        if self.stop_training:
            break
```

- [PyTorch](https://pytorch.org/docs/stable/generated/torch.optim.LBFGS.html) documentation: maximal number of iterations per optimization step.
- [TensorFlow](https://www.tensorflow.org/probability/api_docs/python/tfp/optimizer/lbfgs_minimize) documentation: the maximum number of iterations for L-BFGS updates.
- [Paddle](https://www.paddlepaddle.org.cn/documentation/docs/en/api/paddle/incubate/optimizer/functional/minimize_lbfgs_en.html) documentation: the maximum number of minimization iterations.

It seems to me that in deepxde `maxiter` controls the number of iterations performed on the data (per batch or per epoch), whereas for the backends it refers to the number of **iterations of the L-BFGS algorithm** performed within one call to the optimization step. What do you think? Thanks in advance for your help!
lululxvi commented 5 months ago
> deepxde source code: it controls the number of times the method optim.step() will be called:

This is not true. Please pay attention to these lines in the code:

```
n_iter = self.opt.state_dict()["state"][0]["n_iter"]
```

and

```
prev_n_iter = n_iter
```

The loop condition reads the optimizer's own cumulative iteration counter, so `maxiter` bounds the total number of internal L-BFGS iterations, not the number of `optim.step()` calls or passes over the data.