facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License

1.1.5 Breaks working code - "Error during optimization!" #2513

Open John-Miller12 opened 1 year ago

John-Miller12 commented 1 year ago

Hello,

Thank you for your support of this project.

Environment:

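The Python code:

from prophet import Prophet

# aaaed is the source DataFrame (not shown), sliced through 2020-01-31.
temp = aaaed[:'2020-01-31'].reset_index()
m = Prophet(seasonality_mode='multiplicative')
m.fit(temp)  # this call raises the error in 1.1.5
future = m.make_future_dataframe(periods=731)  # 731 future periods (daily by default)
forecast = m.predict(future)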

now fails with the error text below.

Simple toy data also fails with default settings. So far, m.fit() has not worked for me at all in 1.1.5.

m.preprocess does work though (lol!)

Rolling back to 1.1.4 restores the code's function.

03:29:57 - cmdstanpy - INFO - Chain [1] start processing
03:29:57 - cmdstanpy - INFO - Chain [1] done processing
03:29:57 - cmdstanpy - ERROR - Chain [1] error: terminated by signal 11 Unknown error -11
Optimization terminated abnormally. Falling back to Newton.
03:29:57 - cmdstanpy - INFO - Chain [1] start processing
03:29:57 - cmdstanpy - INFO - Chain [1] done processing
03:29:57 - cmdstanpy - ERROR - Chain [1] error: terminated by signal 11 Unknown error -11
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/prophet/models.py:121, in CmdStanPyBackend.fit(self, stan_init, stan_data, **kwargs)
    120 try:
--> 121     self.stan_fit = self.model.optimize(**args)
    122 except RuntimeError as e:
    123     # Fall back on Newton

File /opt/conda/lib/python3.11/site-packages/cmdstanpy/model.py:738, in CmdStanModel.optimize(self, data, seed, inits, output_dir, sig_figs, save_profile, algorithm, init_alpha, tol_obj, tol_rel_obj, tol_grad, tol_rel_grad, tol_param, history_size, iter, save_iterations, require_converged, show_console, refresh, time_fmt, timeout)
    737     else:
--> 738         raise RuntimeError(msg)
    739 mle = CmdStanMLE(runset)

RuntimeError: Error during optimization! Command '/opt/conda/lib/python3.11/site-packages/prophet/stan_model/prophet_model.bin random seed=13195 data file=/tmp/tmpaajvr7zp/8h9qf2n_.json init=/tmp/tmpaajvr7zp/_qv49cs9.json output file=/tmp/tmpaajvr7zp/prophet_modelsrj0hzs5/prophet_model-20231021032957.csv method=optimize algorithm=lbfgs iter=10000' failed: 

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
Cell In[10], line 4
      1 temp = aaaed[:'2020-01-31'].reset_index().copy()
      3 m = Prophet(seasonality_mode='multiplicative')
----> 4 m.fit(temp)
      5 future = m.make_future_dataframe(periods=731)
      6 forecast = m.predict(future)

File /opt/conda/lib/python3.11/site-packages/prophet/forecaster.py:1232, in Prophet.fit(self, df, **kwargs)
   1230     self.params = self.stan_backend.sampling(stan_init, dat, self.mcmc_samples, **kwargs)
   1231 else:
-> 1232     self.params = self.stan_backend.fit(stan_init, dat, **kwargs)
   1234 self.stan_fit = self.stan_backend.stan_fit
   1235 # If no changepoints were requested, replace delta with 0s

File /opt/conda/lib/python3.11/site-packages/prophet/models.py:128, in CmdStanPyBackend.fit(self, stan_init, stan_data, **kwargs)
    126     logger.warning('Optimization terminated abnormally. Falling back to Newton.')
    127     args['algorithm'] = 'Newton'
--> 128     self.stan_fit = self.model.optimize(**args)
    129 params = self.stan_to_dict_numpy(
    130     self.stan_fit.column_names, self.stan_fit.optimized_params_np)
    131 for par in params:

File /opt/conda/lib/python3.11/site-packages/cmdstanpy/model.py:738, in CmdStanModel.optimize(self, data, seed, inits, output_dir, sig_figs, save_profile, algorithm, init_alpha, tol_obj, tol_rel_obj, tol_grad, tol_rel_grad, tol_param, history_size, iter, save_iterations, require_converged, show_console, refresh, time_fmt, timeout)
    736         get_logger().warning(msg)
    737     else:
--> 738         raise RuntimeError(msg)
    739 mle = CmdStanMLE(runset)
    740 return mle

RuntimeError: Error during optimization! Command '/opt/conda/lib/python3.11/site-packages/prophet/stan_model/prophet_model.bin random seed=73218 data file=/tmp/tmpaajvr7zp/y6ybf0ok.json init=/tmp/tmpaajvr7zp/1dlowjn8.json output file=/tmp/tmpaajvr7zp/prophet_modelfkn1kqc4/prophet_model-20231021032957.csv method=optimize algorithm=newton iter=10000' failed: 
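
The "terminated by signal 11 Unknown error -11" lines in the log suggest the compiled model binary was killed by a signal, rather than the optimizer failing on the data. A minimal sketch of how such a code can be decoded, assuming the -11 follows Python's subprocess convention (a negative return code -N means "killed by signal N"); `describe_returncode` is a hypothetical helper, not part of prophet or cmdstanpy:

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Turn a subprocess-style return code into a readable description."""
    if returncode >= 0:
        return f"exited with status {returncode}"
    # Negative values mean the process was killed by signal number -returncode.
    signum = -returncode
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # The number does not map to a signal known on this platform.
        name = "unknown signal"
    return f"terminated by signal {signum} ({name})"


print(describe_returncode(-11))
```

On Linux and macOS, signal 11 is SIGSEGV, which matches the "Segmentation fault" message reported later in this thread.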

imad24 commented 1 year ago

Looks like the same issue as #2354 and #2456. However, it has been around since versions 1.1.2/1.1.3/1.1.4 and is still not fixed in 1.1.5.

It is still not clear under what conditions this problem occurs.

WardBrian commented 1 year ago

Did you install prophet via pip or conda?

Can you run (from a terminal with the conda environment active) /opt/conda/lib/python3.11/site-packages/prophet/stan_model/prophet_model.bin info?
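
For anyone whose install path differs from the one above, the same check can be sketched in Python. This assumes the binary sits at `stan_model/prophet_model.bin` inside the installed package, as in the traceback; `model_binary_path` and `run_binary` are hypothetical helper names. The package is located without importing it, so a broken install does not get in the way:

```python
import importlib.util
import subprocess
import sys
from pathlib import Path
from typing import Optional


def model_binary_path(package: str = "prophet") -> Optional[Path]:
    """Return the expected path of the compiled Stan model, or None if the package is absent."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    # Layout taken from the traceback: <site-packages>/prophet/stan_model/prophet_model.bin
    return Path(spec.origin).parent / "stan_model" / "prophet_model.bin"


def run_binary(binary: Path, argument: str = "info") -> subprocess.CompletedProcess:
    """Run the binary with a single argument and capture its output."""
    return subprocess.run([str(binary), argument], capture_output=True, text=True)


if __name__ == "__main__":
    path = model_binary_path()
    if path is None or not path.exists():
        print("prophet_model.bin not found in this environment", file=sys.stderr)
    else:
        result = run_binary(path)
        print("path:", path)
        # A negative return code means the process was killed by a signal.
        print("return code:", result.returncode)
        print(result.stdout or result.stderr)
```

Run it with the same interpreter (conda environment or virtualenv) that fails in `m.fit()`, so that the same copy of the binary is exercised.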

imad24 commented 1 year ago

Did you install prophet via pip or conda?

Can you run (from a terminal with the conda environment active) /opt/conda/lib/python3.11/site-packages/prophet/stan_model/prophet_model.bin info?

I was having the same problem on Docker WSL/Linux (Debian 10), but it seems to be fixed in version 1.1.5. It still occurs in versions 1.1.2/1.1.3/1.1.4.

When I run the info command you provided on the failing versions, it prints: Segmentation fault

The complete error message is: Error during optimization! Command '/usr/local/lib/python3.8/site-packages/prophet/stan_model/prophet_model.bin random seed=38595 data file=/tmp/tmpf4xp_cv9/kg2vqwoc.json init=/tmp/tmpf4xp_cv9/vgrs2oes.json output file=/tmp/tmpf4xp_cv9/prophet_modelh3mmnubf/prophet_model-20231025093253.csv method=optimize algorithm=newton iter=10000' failed:

dsgkirkby commented 1 year ago

I've also run into this; 1.1.4 works but 1.1.5 frequently fails with this error. This is also on an M2 Mac running macOS Sonoma, though our Python version is 3.9. The output of the above command is:

/Users/dylan/Library/Caches/pypoetry/virtualenvs/{my-virtualenv}--py3.9/lib/python3.9/site-packages/prophet/stan_model/prophet_model.bin info
[1]    832 killed      info

John-Miller12 commented 1 year ago

@WardBrian Pardon the delay; I needed to focus on a production ask with a rolled-back 1.1.4. I've recreated the issue with 1.1.5 in a new Docker container, same setup as above.

Did you install prophet via pip or conda?

Via pip install prophet, from the Jupyter terminal. We migrated production forecasts to Snowflake, so this is the only environment with prophet installed at the moment.

Can you run (from a terminal with the conda environment active) /opt/conda/lib/python3.11/site-packages/prophet/stan_model/prophet_model.bin info?

This returned Segmentation fault

WardBrian commented 1 year ago

My guess is that this won't give us much that is useful, but could you try running the prophet_model.bin info command under a debugger (gdb/lldb)?

I unfortunately do not have access to an M2 myself to try things, but if you can share your Docker setup, I might be able to borrow one to try reproducing.

dsgkirkby commented 8 months ago

I didn't run it under a debugger, but I did check the macOS crash logs, and it appears that the OS is killing the program due to an invalid code signature.

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------

Incident Identifier: B6624FA4-6828-45BE-A99A-8DDE4E0F309B
CrashReporter Key:   4C5F32E5-FA87-8E94-1157-798E89D25E7B
Hardware Model:      Mac14,10
Process:             prophet_model.bin [4654]
Path:                /Users/USER/Library/Caches/*/prophet_model.bin
Identifier:          prophet_model.bin
Version:             ???
Code Type:           ARM-64 (Native)
Role:                Unspecified
Parent Process:      python [4563]
Coalition:           com.jetbrains.intellij [1046]
Responsible Process: idea [97309]

Date/Time:           2024-02-29 14:20:38.3278 -0800
Launch Time:         2024-02-29 14:20:38.3034 -0800
OS Version:          macOS 14.3.1 (23D60)
Release Type:        User
Report Version:      104

Exception Type:  EXC_BAD_ACCESS (SIGKILL (Code Signature Invalid))
Exception Subtype: UNKNOWN_0x32 at 0x0000000102264000
Exception Codes: 0x0000000000000032, 0x0000000102264000
VM Region Info: 0x102264000 is in 0x102264000-0x1023f0000;  bytes after start: 0  bytes before end: 1622015
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  __TEXT                      102264000-1023f0000    [ 1584K] r-x/r-x SM=COW  
      __DATA_CONST                1023f0000-1023fc000    [   48K] rw-/rw- SM=COW  
Termination Reason: CODESIGNING 2 Invalid Page

Triggered by Thread:  0

Thread 0 Crashed:
0                                          0x1028ba204 dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 52
1                                          0x1028bc2ac dyld3::MachOFile::forEachSupportedPlatform(void (dyld3::Platform, unsigned int, unsigned int) block_pointer) const + 160
2                                          0x1029122e4 dyld3::MachOFile::isBuiltForSimulator() const + 124
3                                          0x1028bdb88 start + 992

Thread 0 crashed with ARM Thread State (64-bit):
    x0: 0x0000000102264000   x1: 0x000000016db96168   x2: 0x000000016db96110   x3: 0x00000001028b9e43
    x4: 0x0000000000000070   x5: 0x0000000000000073   x6: 0x0000000000000000   x7: 0x0000000000000ca0
    x8: 0x000000016db96148   x9: 0x00000001029535f8  x10: 0x000000010293b000  x11: 0x00000001029487af
   x12: 0x0000000000000065  x13: 0x0000000000000073  x14: 0x0000000000058a70  x15: 0x0000000000000000
   x16: 0x00000001028bc34c  x17: 0x6ae100016db96110  x18: 0x0000000000000000  x19: 0x000000016db96168
   x20: 0x0000000102264000  x21: 0x000000016db96110  x22: 0x00000001028b8000  x23: 0x000000016db962c8
   x24: 0x000000016db962a0  x25: 0x0000000000000000  x26: 0x0000000000000000  x27: 0x0000000000000000
   x28: 0x0000000000000000   fp: 0x000000016db96100   lr: 0x90460001028bc2ac
    sp: 0x000000016db96070   pc: 0x00000001028ba204 cpsr: 0x80001000
   far: 0x0000000102264000  esr: 0x92000007 (Data Abort) byte read Translation fault

Binary Images:
       0x1028b8000 -        0x10294ffff  (*) <50746901-db0e-39a0-b391-baaa6b82ad0f> ???
       0x102264000 -        0x1023effff  (*) <8a46dc46-3002-30b7-b441-1e905220a3ed> ???
               0x0 - 0xffffffffffffffff ??? (*) <00000000-0000-0000-0000-000000000000> ???

Error Formulating Crash Report:
dyld_process_snapshot_get_shared_cache failed

EOF
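
The two lines of the report that point to code signing are "Exception Type" and "Termination Reason". A small sketch of pulling them out of a saved report, for anyone comparing their own crash log against this one; the helper names are hypothetical and the parsing assumes the plain-text layout shown above:

```python
import re
from typing import Optional


def termination_reason(report: str) -> Optional[str]:
    """Return the text after 'Termination Reason:' or None if the line is missing."""
    match = re.search(r"^Termination Reason:\s*(.+)$", report, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def killed_for_code_signature(report: str) -> bool:
    """True if the report blames code signing for the kill."""
    reason = termination_reason(report) or ""
    return "CODESIGNING" in reason or "Code Signature Invalid" in report


# Two lines copied from the report above.
sample = (
    "Exception Type:  EXC_BAD_ACCESS (SIGKILL (Code Signature Invalid))\n"
    "Termination Reason: CODESIGNING 2 Invalid Page\n"
)
print(termination_reason(sample))
print(killed_for_code_signature(sample))
```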

WardBrian commented 8 months ago

I've seen reports here and there of specific binary signature problems, but they seem to be very specific to particular versions of Apple's linker. Unfortunately, we have not been able to recreate them when we have tried, but I've seen other projects get similar complaints about the ARM Macs as well.

dsgkirkby commented 8 months ago

I was able to resolve this by running:

codesign --force -s - path/to/prophet_model.bin

(credit to https://github.com/orgs/nodejs/discussions/46442#discussioncomment-4829036)
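
If the fix needs to run from a setup script rather than by hand, the command above can be wrapped in Python. This is only a sketch: `resign_command` and `resign` are hypothetical helpers, the command is exactly the one quoted above (ad hoc signing with `-s -`), and the wrapper does nothing on non-macOS systems, where `codesign` is not available:

```python
import platform
import subprocess
from pathlib import Path
from typing import List


def resign_command(binary: Path) -> List[str]:
    """Build the ad hoc re-signing command quoted in the comment above."""
    return ["codesign", "--force", "-s", "-", str(binary)]


def resign(binary: Path) -> bool:
    """Re-sign the binary ad hoc on macOS; return True only if codesign succeeded."""
    if platform.system() != "Darwin":
        # codesign is a macOS tool; skip everywhere else.
        return False
    if not binary.exists():
        return False
    completed = subprocess.run(resign_command(binary), capture_output=True, text=True)
    return completed.returncode == 0
```

Usage would look like `resign(Path("path/to/prophet_model.bin"))`, with the placeholder replaced by the binary's location in the affected environment.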