Bug report: fit gets stuck on 100-line data

pochtar commented 1 year ago

Python 3.10.8, Prophet 1.1.2.

Code:

import pandas as pd
from prophet import Prophet

line_p = pd.read_csv("line_p.csv", index_col=0)
m = Prophet()
m.fit(line_p)

line_p.csv:

,ds,y
0,2003-01-11,0.0
1,2003-01-18,0.0
2,2003-01-25,0.0
3,2003-02-01,0.0
4,2003-02-08,0.25
5,2003-02-15,0.0
6,2003-02-22,0.0
7,2003-03-01,0.0
8,2003-03-08,0.0
9,2003-03-15,0.0
10,2003-03-22,0.0
11,2003-03-29,0.0
12,2003-04-05,0.0
13,2003-04-12,0.0
14,2003-04-19,0.0
15,2003-04-26,0.25
16,2003-05-03,0.25
17,2003-05-10,0.25
18,2003-05-17,0.0
19,2003-05-24,-0.25
20,2003-05-31,0.0
21,2003-06-07,0.0
22,2003-06-14,0.0
23,2003-06-21,0.0
24,2003-06-28,0.0
25,2003-07-05,0.0
26,2003-07-12,0.0
27,2003-07-19,0.0
28,2003-07-26,-0.25
29,2003-08-02,0.0
30,2003-08-09,-0.25
31,2003-08-16,-0.25
32,2003-08-23,-0.25
33,2003-08-30,-0.25
34,2003-09-06,-0.25
35,2003-09-13,0.0
36,2003-09-20,0.0
37,2003-09-27,0.25
38,2003-10-04,0.25
39,2003-10-11,0.0
40,2003-10-18,0.0
41,2003-10-25,0.0
42,2003-11-01,0.0
43,2003-11-08,0.0
44,2003-11-15,0.0
45,2003-11-22,-0.25
46,2003-11-29,-0.25
47,2003-12-06,-0.25
48,2003-12-13,-0.25
49,2003-12-20,0.0
50,2003-12-27,0.0
51,2004-01-03,0.0
52,2004-01-10,0.0
53,2004-01-17,0.0
54,2004-01-24,0.0
55,2004-01-31,0.0
56,2004-02-07,0.0
57,2004-02-14,0.25
58,2004-02-21,0.0
59,2004-02-28,0.0
60,2004-03-06,-0.25
61,2004-03-13,-0.25
62,2004-03-20,0.0
63,2004-03-27,0.0
64,2004-04-03,0.0
65,2004-04-10,0.25
66,2004-04-17,0.0
67,2004-04-24,0.0
68,2004-05-01,0.0
69,2004-05-08,0.0
70,2004-05-15,0.0
71,2004-05-22,0.0
72,2004-05-29,0.0
73,2004-06-05,0.0
74,2004-06-12,0.0
75,2004-06-19,0.0
76,2004-06-26,0.25
77,2004-07-03,0.0
78,2004-07-10,0.25
79,2004-07-17,0.0
80,2004-07-24,0.0
81,2004-07-31,0.0
82,2004-08-07,0.0
83,2004-08-14,0.0
84,2004-08-21,0.0
85,2004-08-28,0.0
86,2004-09-04,0.0
87,2004-09-11,0.0
88,2004-09-18,0.0
89,2004-09-25,0.0
90,2004-10-02,0.0
91,2004-10-09,0.0
92,2004-10-16,-0.25
93,2004-10-23,-0.25
94,2004-10-30,-0.25
95,2004-11-06,-0.25
96,2004-11-13,-0.25
97,2004-11-20,-0.25
98,2004-11-27,0.0
99,2004-12-04,0.0
100,2004-12-11,0.0
101,2004-12-18,0.0
102,2004-12-25,0.0
103,2005-01-01,0.0
104,2005-01-08,0.0
105,2005-01-15,0.0
106,2005-01-22,0.0
107,2005-01-29,0.0
108,2005-02-05,0.0
109,2005-02-12,0.0
110,2005-02-19,-0.25
111,2005-02-26,-0.25
112,2005-03-05,-0.25
113,2005-03-12,0.0

Result: it ran for over 5 hours without finishing.

An interesting observation: if you supply all but the last line (or fewer), i.e.:

line_p.iloc[:113]

it finishes almost instantly.

But if you fit the full data, it hangs and computes forever. Process params: [screenshot]

pochtar commented 1 year ago

Ignore the multiple processes; they seem to be leftovers from previously stuck runs (it seems Jupyter doesn't have control over them).
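One way to probe which inputs hang without piling up stuck runs in the notebook is to launch each attempt in a separate interpreter with a time limit. The following is only a sketch under stated assumptions: `run_status` and `FIT_TEMPLATE` are hypothetical names introduced here, and the template assumes line_p.csv sits in the working directory with Prophet installed.

```python
import subprocess
import sys

# Hypothetical template (an assumption, not part of the report): fits Prophet
# on line_p.iloc[start:stop] in a fresh interpreter.
FIT_TEMPLATE = """
import pandas as pd
from prophet import Prophet
line_p = pd.read_csv("line_p.csv", index_col=0)
Prophet().fit(line_p.iloc[{start}:{stop}])
"""


def run_status(code: str, timeout_s: float) -> str:
    """Run `code` in a child Python process.

    Returns "ok" if it exits with status 0, "error" if it exits with a
    non-zero status, and "timeout" if it is still running after timeout_s
    seconds (subprocess.run kills the child in that case).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


# Example (not run here): compare the slice that works with the one that hangs.
# for start, stop in [(0, 113), (0, 114)]:
#     print(start, stop, run_status(FIT_TEMPLATE.format(start=start, stop=stop), 120))
```

Only the direct child is killed on timeout; if the fitting backend starts its own worker process, that worker may still need to be cleaned up by hand.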

pochtar commented 1 year ago

Another observation: m.fit(line_p.iloc[0:113]) works and m.fit(line_p.iloc[0:114]) doesn't, but m.fit(line_p.iloc[1:114]) surprisingly works too, as does m.fit(line_p.iloc[1:]) and any other start index except 0.

Is there a max limit of 113 lines I missed somewhere in the docs? My data is weekly, so 113 weeks is about 26 months.
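For reference, the time span of the file can be checked from its first and last dates. This is a small sketch using only the two `ds` values from the CSV above; the average month length of 30.4375 days (365.25 / 12) is an assumption made for the conversion.

```python
from datetime import date

first = date(2003, 1, 11)  # ds of row 0
last = date(2005, 3, 12)   # ds of row 113

span_days = (last - first).days
weekly_steps = span_days // 7          # number of one-week gaps between rows
months = span_days / 30.4375           # assumed average month length in days

print(span_days, weekly_steps, round(months, 1))  # 791 113 26.0
```

So the 114 rows cover 113 weekly steps, which works out to roughly 26 months.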

pochtar commented 1 year ago

A workaround for those who hit that too: it seems that this 2019 bug https://github.com/facebook/prophet/issues/842 has not been fixed yet.

A temporary solution: use Newton, like this:

m.fit(line_p, algorithm='Newton')

For the maintainers: LBFGS seems unstable, and the fallback to Newton doesn't work either. If LBFGS is challenging to fix, maybe it makes sense to make Newton the default instead of a non-working fallback, and let whoever needs LBFGS choose it at their own risk.

facebook / prophet

Bug report: fit gets stuck on 100-line data #2395