Closed by geneorama 7 years ago
This turned out to be trivial. I already have a variable for last week, and the y value is this week, so this measure was simply last week == this week.
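As a minimal sketch of that measure: the comment writes it as `last week == this week`, but strict equality would also match two negative weeks, so the code below assumes the intended meaning is "both weeks positive". The function name and the sample data are hypothetical, not taken from the project code.

```python
def two_weeks_in_a_row(last_week_positive, this_week_positive):
    """Flag observations where last week AND this week were both positive.

    Assumption: "two weeks in a row" means both weeks positive, not merely
    that the two weekly results are equal.
    """
    return [bool(prev) and bool(cur)
            for prev, cur in zip(last_week_positive, this_week_positive)]


# Hypothetical example data: one entry per trap-week.
last_week = [False, True, True, False]
this_week = [False, True, False, True]

flags = two_weeks_in_a_row(last_week, this_week)
print(flags)  # [False, True, False, False]
```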
Based on the two-weeks-in-a-row measure, here's the spread of scores:
Here are the ROC curves for the 2016 holdout year:
Here is the confusion matrix at each score cutoff (`r`) for the new two-weeks-in-a-row measure, using "two weeks in a row" as the target for the model.
| # | r | true_pos | true_neg | false_neg | false_pos | sensitivity | specificity | recall | precision | fmeasure |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.000 | 116 | 0 | 0 | 866 | 1.00000000 | 0.0000000 | 1.00000000 | 0.1181263 | 0.21129326 |
| 2 | 0.025 | 116 | 747 | 0 | 119 | 1.00000000 | 0.8625866 | 1.00000000 | 0.4936170 | 0.66096866 |
| 3 | 0.050 | 116 | 747 | 0 | 119 | 1.00000000 | 0.8625866 | 1.00000000 | 0.4936170 | 0.66096866 |
| 4 | 0.075 | 116 | 747 | 0 | 119 | 1.00000000 | 0.8625866 | 1.00000000 | 0.4936170 | 0.66096866 |
| 5 | 0.100 | 116 | 749 | 0 | 117 | 1.00000000 | 0.8648961 | 1.00000000 | 0.4978541 | 0.66475645 |
| 6 | 0.125 | 116 | 750 | 0 | 116 | 1.00000000 | 0.8660508 | 1.00000000 | 0.5000000 | 0.66666667 |
| 7 | 0.150 | 116 | 752 | 0 | 114 | 1.00000000 | 0.8683603 | 1.00000000 | 0.5043478 | 0.67052023 |
| 8 | 0.175 | 116 | 755 | 0 | 111 | 1.00000000 | 0.8718245 | 1.00000000 | 0.5110132 | 0.67638484 |
| 9 | 0.200 | 116 | 762 | 0 | 104 | 1.00000000 | 0.8799076 | 1.00000000 | 0.5272727 | 0.69047619 |
| 10 | 0.225 | 115 | 767 | 1 | 99 | 0.99137931 | 0.8856813 | 0.99137931 | 0.5373832 | 0.69696970 |
| 11 | 0.250 | 113 | 775 | 3 | 91 | 0.97413793 | 0.8949192 | 0.97413793 | 0.5539216 | 0.70625000 |
| 12 | 0.275 | 112 | 785 | 4 | 81 | 0.96551724 | 0.9064665 | 0.96551724 | 0.5803109 | 0.72491909 |
| 13 | 0.300 | 111 | 792 | 5 | 74 | 0.95689655 | 0.9145497 | 0.95689655 | 0.6000000 | 0.73754153 |
| 14 | 0.325 | 105 | 797 | 11 | 69 | 0.90517241 | 0.9203233 | 0.90517241 | 0.6034483 | 0.72413793 |
| 15 | 0.350 | 98 | 804 | 18 | 62 | 0.84482759 | 0.9284065 | 0.84482759 | 0.6125000 | 0.71014493 |
| 16 | 0.375 | 89 | 811 | 27 | 55 | 0.76724138 | 0.9364896 | 0.76724138 | 0.6180556 | 0.68461538 |
| 17 | 0.400 | 79 | 823 | 37 | 43 | 0.68103448 | 0.9503464 | 0.68103448 | 0.6475410 | 0.66386555 |
| 18 | 0.425 | 74 | 825 | 42 | 41 | 0.63793103 | 0.9526559 | 0.63793103 | 0.6434783 | 0.64069264 |
| 19 | 0.450 | 69 | 833 | 47 | 33 | 0.59482759 | 0.9618938 | 0.59482759 | 0.6764706 | 0.63302752 |
| 20 | 0.475 | 61 | 839 | 55 | 27 | 0.52586207 | 0.9688222 | 0.52586207 | 0.6931818 | 0.59803922 |
| 21 | 0.500 | 42 | 845 | 74 | 21 | 0.36206897 | 0.9757506 | 0.36206897 | 0.6666667 | 0.46927374 |
| 22 | 0.525 | 35 | 855 | 81 | 11 | 0.30172414 | 0.9872979 | 0.30172414 | 0.7608696 | 0.43209877 |
| 23 | 0.550 | 27 | 857 | 89 | 9 | 0.23275862 | 0.9896074 | 0.23275862 | 0.7500000 | 0.35526316 |
| 24 | 0.575 | 21 | 858 | 95 | 8 | 0.18103448 | 0.9907621 | 0.18103448 | 0.7241379 | 0.28965517 |
| 25 | 0.600 | 18 | 860 | 98 | 6 | 0.15517241 | 0.9930716 | 0.15517241 | 0.7500000 | 0.25714286 |
| 26 | 0.625 | 13 | 862 | 103 | 4 | 0.11206897 | 0.9953811 | 0.11206897 | 0.7647059 | 0.19548872 |
| 27 | 0.650 | 8 | 865 | 108 | 1 | 0.06896552 | 0.9988453 | 0.06896552 | 0.8888889 | 0.12800000 |
| 28 | 0.675 | 5 | 866 | 111 | 0 | 0.04310345 | 1.0000000 | 0.04310345 | 1.0000000 | 0.08264463 |
| 29 | 0.700 | 1 | 866 | 115 | 0 | 0.00862069 | 1.0000000 | 0.00862069 | 1.0000000 | 0.01709402 |
| 30 | 0.725 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 31 | 0.750 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 32 | 0.775 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 33 | 0.800 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 34 | 0.825 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 35 | 0.850 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 36 | 0.875 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 37 | 0.900 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 38 | 0.925 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 39 | 0.950 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 40 | 0.975 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
| 41 | 1.000 | 0 | 866 | 116 | 0 | 0.00000000 | 1.0000000 | 0.00000000 | NaN | NaN |
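For reference, the derived columns follow from the four counts by the standard definitions. Here is a small Python sketch (not the project's own code; the function name is made up for illustration) that reproduces the row at `r = 0.200`. Note that `sensitivity` and `recall` are the same quantity, which is why those two columns are identical.

```python
def confusion_metrics(true_pos, true_neg, false_neg, false_pos):
    """Derive the table's rate columns from raw confusion-matrix counts."""
    def ratio(num, den):
        # Mirror the NaN entries in the table when a denominator is zero.
        return num / den if den else float("nan")

    sensitivity = ratio(true_pos, true_pos + false_neg)   # same as recall
    specificity = ratio(true_neg, true_neg + false_pos)
    precision = ratio(true_pos, true_pos + false_pos)
    fmeasure = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "recall": sensitivity,
        "precision": precision,
        "fmeasure": fmeasure,
    }


# Counts taken from the r = 0.200 row of the table.
metrics = confusion_metrics(true_pos=116, true_neg=762, false_neg=0, false_pos=104)
for name, value in metrics.items():
    print(f"{name}: {value:.7f}")
```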
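The 0.2 cutoff isn’t the best one (by F-measure, 0.3 scores higher in this table), but we wouldn’t know that until the end of the season. The 0.2 mark is a sort of “let’s not miss anything” level: it is the highest cutoff in the table with zero false negatives.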
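To make the two candidate cutoffs concrete, here is a short sketch that picks them out of a handful of rows copied from the table. Treating "best" as "highest F-measure" is an assumption on my part; the variable names are illustrative.

```python
# (r, false_neg, fmeasure) for a subset of rows copied from the table.
rows = [
    (0.175, 0, 0.67638484),
    (0.200, 0, 0.69047619),
    (0.225, 1, 0.69696970),
    (0.250, 3, 0.70625000),
    (0.275, 4, 0.72491909),
    (0.300, 5, 0.73754153),
    (0.325, 11, 0.72413793),
]

# "Let's not miss anything": the highest cutoff that still has no false negatives.
no_miss_cutoff = max(r for r, false_neg, _ in rows if false_neg == 0)

# Assumed definition of "best": the cutoff with the highest F-measure.
best_f_cutoff = max(rows, key=lambda row: row[2])[0]

print(no_miss_cutoff)  # 0.2
print(best_f_cutoff)   # 0.3
```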
Currently sprays occur if there are two weeks in a row with WNV.
We should add to the model a measure that checks how many times we were right on the second week given that the first week was positive.
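One way such a measure could look, as a rough sketch only: restrict to observations whose first week was positive, then count how often the prediction for the second week matched what actually happened. The function name, argument names, and sample data are all hypothetical.

```python
def second_week_hits(first_week_positive, predicted_second_week, actual_second_week):
    """Count correct second-week predictions among cases with a positive first week.

    Returns (hits, total), where total is the number of positive first weeks.
    """
    hits = 0
    total = 0
    for first, predicted, actual in zip(
        first_week_positive, predicted_second_week, actual_second_week
    ):
        if not first:
            continue  # only condition on weeks that follow a positive week
        total += 1
        if bool(predicted) == bool(actual):
            hits += 1
    return hits, total


# Hypothetical example data.
first_week = [True, True, False, True]
predicted = [True, False, True, True]
actual = [True, True, True, False]

hits, total = second_week_hits(first_week, predicted, actual)
rate = hits / total if total else float("nan")
print(hits, total)  # 1 3
```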