finncatling / lap-risk

Uncertainty-aware mortality risk modelling in emergency laparotomy, using data from the NELA.
MIT License

LactateAlbuminImputer._get_features_where_lacalb_missing & ._get_features_where_lacalb_observed #90

Closed finncatling closed 3 years ago

finncatling commented 3 years ago

Both of these functions are supposed to drop the mortality labels (the self.target_variable_name column) from the returned features, but they misuse pandas.DataFrame.drop() as if it were an in-place method, as follows:

features.drop(self.target_variable_name, axis=1)

Instead of correctly using it in a reassignment operation:

features = features.drop(self.target_variable_name, axis=1)

This leaves the mortality labels in the data passed to the lactate and albumin imputers when imputing new values (this bug doesn't impact testing). Is this the reason that the lactate / albumin imputation model evaluation scores are so poor?
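For anyone reproducing this, here is a minimal sketch of the behaviour, assuming a toy DataFrame. The column names and values are hypothetical stand-ins, not the NELA data or the actual imputer code. It only illustrates that DataFrame.drop() returns a new frame by default and leaves the original untouched unless the result is reassigned.

```python
import pandas as pd

# Hypothetical stand-in for the imputer's feature table (not real NELA data).
features = pd.DataFrame(
    {"age": [71, 64], "lactate": [2.1, 4.3], "mortality": [0, 1]}
)
target_variable_name = "mortality"  # plays the role of self.target_variable_name

# Return value is discarded, so `features` still contains the label column.
features.drop(target_variable_name, axis=1)
still_has_labels = target_variable_name in features.columns

# Reassigning keeps the returned frame, which lacks the label column.
features = features.drop(target_variable_name, axis=1)
labels_removed = target_variable_name not in features.columns
```

Passing inplace=True to drop() would also modify the frame in place, but reassignment keeps the code consistent with the fix proposed above.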

finncatling commented 3 years ago

The same mistake appears in LactateAlbuminImputer._fit_combine_gams(), which seems to cancel out the error above! So the model evaluation may not be affected. Still best to fix this.