Compare model performance on extreme-value patients

finncatling / lap-risk

Uncertainty-aware mortality risk modelling in emergency laparotomy, using data from the NELA.

MIT License


Compare model performance on extreme-value patients #74

Closed: finncatling closed this issue 3 years ago

finncatling commented 3 years ago

The current model applies quite conservative winsorization thresholds, whereas the novel model uses liberal ones. Hence, the novel model learns spline functions over wider domains. Data are sparse at the extremes of these domains, but the fitted mortality-model splines mostly look reasonable there. One selling point of the novel model is therefore that it might make better predictions for patients with extreme input values. We could test this by comparing the two models' performance on an 'extreme patient' subset of the test folds.
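A minimal sketch of that comparison, assuming we already have each model's predicted mortality risk for the test-fold patients and a boolean mask flagging the 'extreme' ones. The function name `compare_on_subset`, the choice of the Brier score as the metric, and the toy arrays are all illustrative assumptions, not part of the existing codebase.

```python
import numpy as np


def brier(y_true, p_pred):
    """Brier score: mean squared error between outcome and predicted risk."""
    return float(np.mean((p_pred - y_true) ** 2))


def compare_on_subset(y_true, p_current, p_novel, extreme_mask):
    """Score both models separately on extreme and non-extreme patients.

    All arguments are 1-D arrays of equal length; extreme_mask is boolean.
    (Hypothetical helper, metric chosen for illustration only.)
    """
    results = {}
    for name, mask in (("extreme", extreme_mask), ("non_extreme", ~extreme_mask)):
        results[name] = {
            "n": int(mask.sum()),
            "current": brier(y_true[mask], p_current[mask]),
            "novel": brier(y_true[mask], p_novel[mask]),
        }
    return results


# Toy data standing in for one test fold (made up for the example).
y_true = np.array([0, 1, 0, 1, 1, 0])
p_current = np.array([0.1, 0.6, 0.2, 0.5, 0.4, 0.3])
p_novel = np.array([0.1, 0.7, 0.2, 0.8, 0.9, 0.1])
extreme_mask = np.array([False, False, False, True, True, True])

results = compare_on_subset(y_true, p_current, p_novel, extreme_mask)
```

Reporting the non-extreme subset alongside the extreme one helps separate a genuine advantage at the extremes from an overall difference between the models. Any other metric (e.g. log loss, calibration error) could be swapped in for `brier`.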

finncatling commented 3 years ago

Could try https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html#sklearn.ensemble.IsolationForest to obtain a multivariate outlier score for each patient.

finncatling commented 3 years ago

IsolationForest should be fit on each current-model train fold, then used to obtain anomaly scores for the cases in the corresponding test fold. The features input to IsolationForest should be the intersection of the current and novel model features.
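A rough sketch of the per-fold procedure, under some assumptions: `X` is a complete numeric array holding only the features shared by both models, and `KFold` stands in for the project's actual train/test splits. The function name `fold_anomaly_scores` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import KFold


def fold_anomaly_scores(X, n_splits=5, random_state=0):
    """Return one out-of-fold anomaly score per row (higher = more anomalous).

    KFold is a placeholder for the real train/test fold definitions.
    """
    scores = np.empty(X.shape[0])
    splitter = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    for train_idx, test_idx in splitter.split(X):
        # Fit only on the train fold, so test-fold cases never inform the forest.
        forest = IsolationForest(n_estimators=100, random_state=random_state)
        forest.fit(X[train_idx])
        # score_samples is lower for more abnormal points, so negate it.
        scores[test_idx] = -forest.score_samples(X[test_idx])
    return scores


# Synthetic stand-in for the shared features, with five planted outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[:5] += 8.0

scores = fold_anomaly_scores(X)
```

The 'extreme patient' subset could then be taken as, say, the cases above some quantile of `scores`; the cut-off is a free choice that the thread does not specify.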

finncatling commented 3 years ago

Don't winsorize the input features before passing them to IsolationForest.

finncatling commented 3 years ago

@JMathiszig-Lee The main theme of manuscript 1 is uncertainty quantification, which this issue isn't directly related to. Perhaps we should therefore deprioritise it until the first manuscript is complete?

finncatling commented 3 years ago

Closing for now, as we have no plans to pursue this in the near future.