facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License
18.26k stars 4.51k forks

Fit error "B[1] is -nan, but must not be nan!" #1032

Open bletham opened 5 years ago

bletham commented 5 years ago

From @humphreyapplebee:

I was able to find another example of a series that exhibits this behaviour:

from fbprophet import Prophet
import pandas as pd

y = [800, 5700, 8200, 5413, 19035, 14841, 12935, 18518, 28402, 73898, 125483, 11353, 135, 650, 50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 0, 0, 502, 377, 125, 984, 851, 127, 2914, 1073, 1339, 3468, 4813, 203, 0, 0, 0]
periods = len(y)
dates = pd.date_range(start='2018-02-05', freq='W-MON', periods=periods)
floor = [0] * periods
cap = [484.6442413330078] * periods
df = pd.DataFrame({
    "ds": dates,
    "y": y,
    "floor": floor,
    "cap": cap
})
model = Prophet(
    daily_seasonality=False,
    weekly_seasonality=True,
    yearly_seasonality=True,
    seasonality_mode="multiplicative",
    changepoint_prior_scale=0.05,
    growth="logistic"
)

With LBFGS (i.e. model.fit(df, algorithm='LBFGS')), the training stalls as described above. Interestingly, training no longer stalls when I modify the number of changepoints (either adding or removing some) or perturb the training data even very slightly (e.g. by dropping a significant figure from one of the dates).

With Newton (i.e. model.fit(df, algorithm='Newton')), I receive the following exception:

WARNING:fbprophet:Optimization terminated abnormally. Falling back to Newton.
...
vendor/local/lib/python2.7/site-packages/pystan/model.pyc in optimizing(self, data, seed, init, sample_file, algorithm, verbose, as_vector, **kwargs)
    548         stan_args = pystan.misc._get_valid_stan_args(stan_args)
    549 
--> 550         ret, sample = fit._call_sampler(stan_args)
    551         pars = pystan.misc._par_vector2dict(sample['par'], m_pars, p_dims)
    552         if not as_vector:

stanfit4anon_model_861b75c6337e237650a61ae58c4385ef_8369643458939391769.pyx in stanfit4anon_model_861b75c6337e237650a61ae58c4385ef_8369643458939391769.StanFit4Model._call_sampler()

stanfit4anon_model_861b75c6337e237650a61ae58c4385ef_8369643458939391769.pyx in stanfit4anon_model_861b75c6337e237650a61ae58c4385ef_8369643458939391769._call_sampler()

RuntimeError: Exception: Exception: multiply: B[1] is -nan, but must not be nan! 

Originally posted by @humphreyapplebee in https://github.com/facebook/prophet/issues/842#issuecomment-504472349

daikonradish commented 5 years ago

My money is on that long sequence of zeros. These tend to make Stan's estimates numerically unstable and sensitive to small changes in values.

If someone is passing in a lot of zeros it’s likely because they’re missing data.
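Building on that point, one hedged sketch (not from the thread, just an illustration): Prophet treats NaN values in `y` as missing data while still producing predictions for those dates, so runs of placeholder zeros could be blanked out before fitting rather than fed to Stan as true observations.

```python
import numpy as np
import pandas as pd

# Illustrative only: mark exact zeros as missing so the optimizer never
# sees the long flat stretch. Column names follow Prophet's convention.
df = pd.DataFrame({
    "ds": pd.date_range("2018-02-05", freq="W-MON", periods=8),
    "y": [800.0, 5700.0, 0.0, 0.0, 0.0, 0.0, 502.0, 377.0],
})
df.loc[df["y"] == 0, "y"] = np.nan
# model.fit(df) would then treat those weeks as gaps, not observed zeros.
```

This is only appropriate if the zeros really do mean "no data"; if zero is a genuine observation, blanking it changes the model.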

filpia commented 5 years ago

I just ran into the same error and found a fix/workaround. My target y-values are generally between 1 and 20 in any given time period, and I log-transformed them for data compression. In log scale, the handful of values that were < 1 became negative. After pounding my head a bit, I added 1 to my initial y's before applying the log, so that all the transformed values the model trains on are non-negative. Poof! The model fit, and I inverted the transform at the end with np.exp(yhat) - 1.

I know this is not a fix for the numerical stability issue at hand in pystan but I thought it would be useful if you really need a workaround.
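A minimal sketch of the transform described above (array values are illustrative): shift by 1 before the log so values below 1 don't map to negatives, fit on the transformed series, then invert the forecast.

```python
import numpy as np

# np.log1p(y) is equivalent to np.log(y + 1); np.expm1 is its inverse.
y = np.array([0.5, 1.0, 5.0, 20.0])
y_trans = np.log1p(y)      # all transformed values are >= 0
# ... fit Prophet on y_trans and obtain yhat on the log scale ...
yhat = np.expm1(y_trans)   # inverse transform: np.exp(x) - 1
```

Using `log1p`/`expm1` instead of a hand-rolled `log(y + 1)` also avoids precision loss for values near zero.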

caseyjconger commented 4 years ago

Also just ran into this issue. Sorry, I don't have time to write out a full description, but I've been getting this as well.

JJ commented 4 years ago

Here too. I changed "floor" from 3 to 1 and it worked.

lazaronixon commented 4 years ago

Same here with growth="logistic". After removing yearly_seasonality=False, weekly_seasonality=False, daily_seasonality=False there are no errors, but yhat is NaN. It happens because my carrying capacity is so high... any problem with that?

viktoria-ivan commented 2 years ago

Thanks for the suggestion @JJ, changing the floor from 0 to -1 fixed this for me!

kandeldeepak46 commented 2 years ago

Do df['y'] = np.log(df['y'] + 1),

then invert the log at the end and subtract 1 from the result (i.e. np.exp(yhat) - 1).

AlexSiormpas commented 1 year ago

Hi all - I swapped the floor from 0 to 0.01 and managed to bypass the error.
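Pulling the floor-related reports together, a hedged sketch (values illustrative, not a confirmed fix): with growth="logistic" and data containing exact zeros, keeping the floor strictly away from 0 is what several commenters found to avoid the NaN.

```python
import pandas as pd

# Illustrative data frame in Prophet's expected shape; commenters report
# that nudging the floor (to 0.01 here, or to -1 as above) avoids the error.
df = pd.DataFrame({
    "ds": pd.date_range("2018-02-05", freq="W-MON", periods=4),
    "y": [0.0, 0.0, 25.0, 502.0],
    "cap": 484.64,
})
df["floor"] = 0.01  # instead of 0
# Prophet(growth="logistic").fit(df) then scales y within (floor, cap).
```

Note that with floor=0.01 the zero observations sit just below the floor, while floor=-1 keeps all observations inside the band; both variants were reported to work in this thread.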

tasdemirbahadir commented 1 year ago

I was using

"yearly_seasonality": False,
"weekly_seasonality": False,
"daily_seasonality": False,

for a yearly forecast and hit this error. Changing the values to

"yearly_seasonality": False,
"weekly_seasonality": True,
"daily_seasonality": True,

fixed my problem.