Closed by shlima 4 years ago
I tried the dataset you provided: I replaced the contents of the file 'cases_dataset_2020-04-09.csv' with it and set the 'grab_data_from_server' property to 'false' for the Cases model, and I did not receive any errors.
Can you please describe the steps you took that resulted in this error?
Hm, same for me. But I have started getting incorrect results for some countries.
CSV file for Germany:
0,0
1,0
2,0
3,0
4,0
5,1
6,4
7,4
8,4
9,5
10,8
11,10
12,12
13,12
14,12
15,12
16,13
17,13
18,14
19,14
20,16
21,16
22,16
23,16
24,16
25,16
26,16
27,16
28,16
29,16
30,16
31,16
32,16
33,16
34,17
35,27
36,46
37,48
38,79
39,130
40,159
41,196
42,262
43,482
44,670
45,799
46,1040
47,1176
48,1457
49,1908
50,2078
51,3675
52,4585
53,5795
54,7272
55,9257
56,12327
57,15320
58,19848
59,22213
60,24873
61,29056
62,32986
63,37323
64,43938
65,50871
66,57695
67,62095
68,66885
69,71808
70,77872
71,84794
72,91159
73,96092
74,100123
75,103374
76,107663
77,113296
78,118181
Forecast:
The forecast for Cases in the following 30 days is:
1: 116149
2: 115391
3: 112786
4: 108023
5: 100763
6: 90631
7: 77222
8: 60091
9: 38757
10: 12698
11: -18650
12: -55897
13: -99700
14: -150765
15: -209853
16: -277776
17: -355408
18: -443681
19: -543591
20: -656200
21: -782641
22: -924118
23: -1081912
24: -1257382
25: -1451972
26: -1667208
27: -1904711
28: -2166191
29: -2453457
30: -2768420
Config file:
{
    "models": [
        {
            "model_name": "Cases",
            "polynomial_degree": 7,
            "datagrabber_class": "CasesDataGrabber",
            "grab_data_from_server": false,
            "offline_dataset_date": "0000-00-00",
            "days_to_predict": 30
        },
        {
            "model_name": "Deaths",
            "polynomial_degree": 7,
            "datagrabber_class": "DeathsDataGrabber",
            "grab_data_from_server": false,
            "offline_dataset_date": "0000-00-00",
            "days_to_predict": 30
        }
    ]
}
Chart
I see, you are getting these incorrect results because the polynomial degree of your model is too high for your data.
To get better results, you need to tweak the "polynomial_degree" hyper-parameter in the config file (this is a trial-and-error process). To start, try a polynomial degree of 2, 3 or 4 instead of 7. Judging by the data visualization you provided, a polynomial degree of 2 or 3 should fit quite well.
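To see how the degree affects the extrapolation, here is a minimal sketch. It assumes a plain least-squares polynomial fit with NumPy, not the project's own model code, and the `forecast` helper is a hypothetical name introduced only for this example. It fits the Germany series posted above with several degrees and prints the value each fit predicts 30 days past the end of the data.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Cumulative cases for Germany, days 0..78, copied from the CSV in this thread.
GERMANY = (
    [0] * 5
    + [1, 4, 4, 4, 5, 8, 10, 12, 12, 12, 12, 13, 13, 14, 14]
    + [16] * 14
    + [17, 27, 46, 48, 79, 130, 159, 196, 262, 482, 670, 799, 1040, 1176,
       1457, 1908, 2078, 3675, 4585, 5795, 7272, 9257, 12327, 15320, 19848,
       22213, 24873, 29056, 32986, 37323, 43938, 50871, 57695, 62095, 66885,
       71808, 77872, 84794, 91159, 96092, 100123, 103374, 107663, 113296,
       118181]
)


def forecast(cases, degree, horizon):
    """Fit a least-squares polynomial of the given degree to the series
    (hypothetical helper, not part of the project) and evaluate it on the
    `horizon` days that follow the last observation."""
    days = np.arange(len(cases))
    # Polynomial.fit rescales the x-axis internally, which keeps
    # higher-degree fits numerically stable.
    poly = Polynomial.fit(days, cases, degree)
    future_days = np.arange(len(cases), len(cases) + horizon)
    return poly(future_days)


# Compare how far apart the extrapolations end up for different degrees.
for degree in (2, 3, 5, 7):
    predicted = forecast(GERMANY, degree, horizon=30)
    print(f"degree {degree}: day +30 -> {predicted[-1]:.0f}")
```

All of these polynomials can follow the observed points reasonably closely, yet they may diverge sharply once evaluated outside the range of the data, which is the part a forecast depends on.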
@eladcn thank you, your suggestion works.
Now I have 2 cases:
Estonia with polynomial_degree of 3:
Estonia with polynomial_degree of 5:
It seems that the second chart for Estonia is more believable.
Can you suggest a pattern by which I can set the polynomial_degree to the correct value for each country?
Dataset for Estonia:
0,0
1,0
2,0
3,0
4,0
5,0
6,0
7,0
8,0
9,0
10,0
11,0
12,0
13,0
14,0
15,0
16,0
17,0
18,0
19,0
20,0
21,0
22,0
23,0
24,0
25,0
26,0
27,0
28,0
29,0
30,0
31,0
32,0
33,0
34,0
35,0
36,1
37,1
38,1
39,1
40,1
41,2
42,2
43,3
44,10
45,10
46,10
47,10
48,12
49,16
50,16
51,79
52,115
53,171
54,205
55,225
56,258
57,267
58,283
59,306
60,326
61,352
62,369
63,404
64,538
65,575
66,645
67,679
68,715
69,745
70,779
71,858
72,961
73,1039
74,1097
75,1108
76,1149
77,1185
78,1207
Unfortunately there isn't really a pattern for this, but I can give you a few tips:
I will consider adding neural network support to this project in the coming days. Using neural networks might be better for some scenarios.
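The trial-and-error search for a degree could be partly automated with a hold-out check. The sketch below is not something the project provides: `pick_degree` is a hypothetical helper, and the 7-day hold-out window and the candidate degrees are arbitrary assumptions. It fits each candidate degree on all but the last week of the Estonia series posted above, then scores it by how well it predicts that held-out week.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Cumulative cases for Estonia, days 0..78, copied from the dataset in this thread.
ESTONIA = (
    [0] * 36
    + [1] * 5
    + [2, 2, 3, 10, 10, 10, 10, 12, 16, 16,
       79, 115, 171, 205, 225, 258, 267, 283, 306, 326,
       352, 369, 404, 538, 575, 645, 679, 715, 745, 779,
       858, 961, 1039, 1097, 1108, 1149, 1185, 1207]
)


def pick_degree(cases, candidates=(1, 2, 3, 4, 5, 6, 7), holdout=7):
    """Return (best_degree, errors) where errors maps each candidate degree
    to its RMSE on the last `holdout` days, which are excluded from fitting.
    Hypothetical helper; window size and candidates are assumptions."""
    cases = np.asarray(cases, dtype=float)
    train, held_out = cases[:-holdout], cases[-holdout:]
    train_days = np.arange(len(train))
    held_out_days = np.arange(len(train), len(cases))

    errors = {}
    for degree in candidates:
        poly = Polynomial.fit(train_days, train, degree)
        residuals = poly(held_out_days) - held_out
        errors[degree] = float(np.sqrt(np.mean(residuals ** 2)))

    best = min(errors, key=errors.get)
    return best, errors


best, errors = pick_degree(ESTONIA)
for degree, rmse in errors.items():
    print(f"degree {degree}: hold-out RMSE {rmse:.1f}")
print("lowest hold-out error: degree", best)
```

This is only a heuristic: a degree that predicts the last week well is not guaranteed to hold up over a 30-day horizon, so the resulting chart is still worth checking by eye.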
Thank you very much for your support