non-zero return code in optimizing - R warning

facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

https://facebook.github.io/prophet

MIT License


non-zero return code in optimizing - R warning #654

Closed joaomicunha closed 5 years ago

joaomicunha commented 6 years ago

What could be causing the warning "1: In .local(object, ...) : non-zero return code in optimizing" in R? It happens for me when fitting monthly data.

There are no negative values, zeros, or gaps in the data. The warning seems to result in a fitted model with only a nearly linear trend.

bletham commented 6 years ago

The optimizer dumps a bunch of information to the screen about its iterations. Could you paste that?

Anyway, it seems that the optimizer is having a hard time converging. The model can be quite underspecified when there aren't many data points, and that can make the posterior surface difficult for the default L-BFGS optimizer. I have found that the Newton optimizer is more robust in these settings (albeit a bit slower, hence it's not the default) so I'd recommend trying that:

m <- prophet(df, algorithm='Newton')

joaomicunha commented 6 years ago

This is the information returned:

Initial log joint probability = -2.21427
Optimization terminated with error:
  Line search failed to achieve a sufficient decrease, no more progress can be made

It does seem to be related to the amount of data, even though I get this error when training on 97 months of monthly data, which should be enough. When I expand the training data, the model converges (in this particular example).

Applying a transformation to y seems to have an effect as well: with y the same model doesn't converge (even with more data), whilst with log(y) it does (the example above used log). The scale of my y variable is quite large (summary stats of y below):

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 477730 1005972 1202881 1368569 1511977 3705565
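For anyone wanting to try the log(y) approach described above, here is a minimal sketch of fitting on the log scale and mapping the forecast back. The helpers log_scale_y and unlog_forecast are hypothetical names (not part of prophet), the series is synthetic, and the approach assumes y is strictly positive.

```r
# Hypothetical helpers, not part of prophet. Assumes df has columns ds and y,
# with y strictly positive so that log(y) is defined.
log_scale_y <- function(df) {
  stopifnot(all(df$y > 0))
  df$y <- log(df$y)
  df
}

# Map forecast columns from the log scale back to the original scale.
unlog_forecast <- function(fcst, cols = c("yhat", "yhat_lower", "yhat_upper")) {
  cols <- intersect(cols, names(fcst))
  fcst[cols] <- lapply(fcst[cols], exp)
  fcst
}

# Synthetic monthly series (97 months) on roughly the scale reported above.
set.seed(1)
df <- data.frame(
  ds = seq(as.Date("2010-01-01"), by = "month", length.out = 97),
  y  = 1e6 * exp(cumsum(rnorm(97, mean = 0.005, sd = 0.05)))
)
df_log <- log_scale_y(df)

# The fit itself only runs if prophet is installed.
if (requireNamespace("prophet", quietly = TRUE)) {
  m      <- prophet::prophet(df_log)
  future <- prophet::make_future_dataframe(m, periods = 12, freq = "month")
  fcst   <- unlog_forecast(predict(m, future))
  print(tail(fcst[c("ds", "yhat", "yhat_lower", "yhat_upper")]))
}
```

Whether this helps convergence on a given dataset is something to check rather than assume; in this thread it did for one example.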

joaomicunha commented 6 years ago

Also, different optimization algorithms shouldn't give significantly different results, right? Which algorithms can be passed to prophet(df, algorithm = )?

bletham commented 6 years ago

The algorithm input is passed along to Stan. It, and the other arguments that you could pass along, are described in the rstan documentation: http://mc-stan.org/rstan/reference/stanmodel-method-optimizing.html The options are "LBFGS", "BFGS", "Newton".

Normally the choice of optimizer won't matter, because they are all optimizing the same thing and will typically end up at the same optimum. However, if the posterior surface is badly behaved in some way, the different strategies that the optimizers use to maximize it can lead to different places (or one can get stuck while another doesn't), and this will give different model parameters. This isn't typical, although the Python version does have logic to switch to Newton if L-BFGS does not terminate successfully; we should port that same logic to the R version.
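To see whether the optimizer choice matters for a particular dataset, one option is to fit once per algorithm and compare the results. The sketch below uses a hypothetical helper, fit_each_optimizer (not part of prophet); its fit_fn parameter exists only so that a stand-in can replace prophet::prophet.

```r
# Hypothetical helper, not part of prophet: fit the same data once per
# optimizer and return the fits in a list named by algorithm.
fit_each_optimizer <- function(df,
                               algorithms = c("LBFGS", "BFGS", "Newton"),
                               fit_fn = prophet::prophet, ...) {
  fits <- lapply(algorithms, function(alg) fit_fn(df, algorithm = alg, ...))
  names(fits) <- algorithms
  fits
}

# Usage (assumes prophet is installed and df has columns ds and y):
# fits <- fit_each_optimizer(df)
# Compare a fitted parameter across optimizers, assuming the fitted
# object exposes the base growth rate as m$params$k:
# sapply(fits, function(m) m$params$k)
```

If the values differ noticeably between algorithms, that suggests at least one optimizer did not reach the same optimum.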
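Until such logic is available in R, a user-side fallback can be sketched as below. This is not the code in either the Python or the R package; fit_with_newton_fallback is a hypothetical helper that assumes the failure surfaces as the R warning quoted at the top of this issue, and it detects that warning by matching its message text.

```r
# Hypothetical helper, not part of prophet. Fits with the default optimizer
# and refits with Newton if the "non-zero return code" warning is raised.
# Do not pass algorithm via ...; it would clash with the Newton refit.
fit_with_newton_fallback <- function(df, fit_fn = prophet::prophet, ...) {
  optimizer_failed <- FALSE
  m <- withCallingHandlers(
    fit_fn(df, ...),
    warning = function(w) {
      if (grepl("non-zero return code in optimizing",
                conditionMessage(w), fixed = TRUE)) {
        optimizer_failed <<- TRUE
        invokeRestart("muffleWarning")  # other warnings pass through untouched
      }
    }
  )
  if (optimizer_failed) {
    message("Default optimizer did not terminate successfully; refitting with Newton.")
    m <- fit_fn(df, algorithm = "Newton", ...)
  }
  m
}

# Usage (assumes prophet is installed and df has columns ds and y):
# m <- fit_with_newton_fallback(df)
```

Matching on the warning text is brittle if the wording changes between rstan versions, so treat this as a stopgap.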

bletham commented 5 years ago

Pushed to CRAN