Neuroglycerin / hail-seizure

Submission for Kaggle's American Epilepsy Society Seizure Prediction Challenge
Apache License 2.0

Set classifier prior #18

Closed scottclowe closed 9 years ago

scottclowe commented 10 years ago

The classifier with the best overall AUROC will not use the class distributions of the training data; it will use a prior equal to the distribution in the private test data. However, we don't know the distribution in the private test data, so we have to guess it. Whether each subject's proportion of preictal segments is above or below the overall proportion appears to be independent between train and publictest: about half the time the two agree and half the time they disagree (see below).
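For illustration, here is one standard way to move a predicted probability from the training prior to a guessed test prior (the usual prior-shift correction via Bayes' rule). This is only a sketch: `adjust_for_prior` is a hypothetical helper, not something in this repo, and the priors in the usage lines are made-up numbers, not estimates for any subject.

```python
def adjust_for_prior(p, train_prior, target_prior):
    """Re-weight a predicted preictal probability `p` from the prior
    the classifier was trained under to an assumed target prior."""
    # Scale the evidence for each class by the ratio of new to old prior.
    pos = p * target_prior / train_prior
    neg = (1.0 - p) * (1.0 - target_prior) / (1.0 - train_prior)
    # Renormalise so the two classes sum to one.
    return pos / (pos + neg)


# Illustrative values only (not taken from the competition data).
train_prior = 0.10   # assumed preictal fraction in training data
target_prior = 0.30  # guessed preictal fraction in test data
for p in (0.05, 0.10, 0.50, 0.90):
    print(p, adjust_for_prior(p, train_prior, target_prior))
```

If the two priors are equal the prediction is unchanged, and a prediction exactly at the training prior maps to the target prior. Applying a different target prior per subject shifts that subject's predictions relative to the others, which is what affects the overall AUROC.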

I think the best prior is either:

Subject preictal proportion compared to overall proportion:

| Subject | Train | PublicTest |
| --- | --- | --- |
| Dog_1 | < | < |
| Dog_2 | < | > |
| Dog_3 | < | < |
| Dog_4 | > | < |
| Dog_5 | < | = |
| Patient_1 | > | < |
| Patient_2 | > | > |

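This comes from the AUROC of all-zeros submissions in which one subject's segments are set to ones.

| Subject | Train | PublicTest |
| --- | --- | --- |
| Dog_1 | 0.47 | 0.49 |
| Dog_2 | 0.48 | 0.56 |
| Dog_3 | 0.44 | 0.47 |
| Dog_4 | 0.52 | 0.48 |
| Dog_5 | 0.49 | 0.50 |
| Patient_1 | 0.55 | 0.49 |
| Patient_2 | 0.55 | 0.52 |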
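To spell out why these AUROC values tell us the direction of the comparison: if a submission is 1 for one subject and 0 everywhere else, and that subject holds a share `a` of all preictal segments and a share `b` of all interictal segments, then counting ties as half gives AUROC = 0.5 + (a - b) / 2. So AUROC > 0.5 exactly when the subject's preictal proportion is above the overall proportion. The sketch below checks this against a brute-force pairwise AUROC; the function names are hypothetical and the toy labels are synthetic, not competition data.

```python
def pairwise_auroc(labels, scores):
    """Brute-force AUROC: P(pos score > neg score) + 0.5 * P(tie)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def indicator_auroc(labels, subjects, target):
    """Closed form for a submission that is 1 on `target`, 0 elsewhere."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_in = sum(1 for y, s in zip(labels, subjects) if s == target and y == 1)
    neg_in = sum(1 for y, s in zip(labels, subjects) if s == target and y == 0)
    # a = share of all preictal segments in `target`,
    # b = share of all interictal segments in `target`.
    return 0.5 + (pos_in / n_pos - neg_in / n_neg) / 2


# Synthetic toy data: subject A is 2/10 preictal, subject B is 1/10.
subjects = ["A"] * 10 + ["B"] * 10
labels = [1, 1] + [0] * 8 + [1] + [0] * 9

for target in ("A", "B"):
    scores = [1 if s == target else 0 for s in subjects]
    print(target,
          pairwise_auroc(labels, scores),
          indicator_auroc(labels, subjects, target))
```

In the toy data the overall preictal proportion is 3/20, so subject A (2/10) comes out above 0.5 and subject B (1/10) below it, matching the way the `<` and `>` signs are read off the table.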