As most of our best-performing models seem to be suffering from overfitting, finding ways to meaningfully reduce the feature set fed to the model is likely to improve the stability and performance of our best-case models.
To do this, testing common statistics applied across all vertebrae could be valuable, namely:
Maximum: would help detect which patients have a naturally larger or smaller spinal cord diameter relative to other patients
Minimum: should help detect the severity of the worst compression along the spinal cord
Mean: identifies the relative extent of overall compression in each patient (though this is, in my opinion, tenuous)
Standard deviation (STD): reflects the frequency and severity of compressions across the entire spinal cord
These are all quite easy to calculate as well; I just need to set aside some time for it
Let me know if there are any other statistics worth trying as well; with the current code-base, so long as the statistic is trivial to calculate, so is its evaluation
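As a rough illustration of the four statistics listed above, here is a minimal sketch using only the Python standard library. The data layout (one dict of per-vertebra measurements per patient), the vertebra labels, the values, and the function name `summarize_patient` are all hypothetical and not taken from the current code-base; missing measurements are assumed to be stored as `None` and skipped.

```python
from statistics import mean, pstdev


def summarize_patient(measurements):
    """Collapse one patient's per-vertebra measurements into four summary features.

    `measurements` maps a vertebra label to a numeric measurement, or to None
    when that level was not measured (an assumed convention).
    """
    values = [v for v in measurements.values() if v is not None]
    if not values:
        raise ValueError("no usable measurements for this patient")
    return {
        "max": max(values),      # proxy for naturally larger/smaller cord diameter
        "min": min(values),      # severity of the worst compression
        "mean": mean(values),    # overall extent of compression
        "std": pstdev(values),   # population standard deviation across vertebrae
    }


# Made-up example data, purely for illustration.
patients = {
    "patient_a": {"C2": 8.0, "C3": 7.0, "C4": 6.0, "C5": 7.0},
    "patient_b": {"C2": 9.0, "C3": 9.0, "C4": 4.0, "C5": None},
}

# One row of four features per patient, replacing the per-vertebra feature set.
features = {pid: summarize_patient(m) for pid, m in patients.items()}
```

This uses the population standard deviation (`pstdev`); swapping in `statistics.stdev` would give the sample version instead, which needs at least two measurements per patient. Any other cheap statistic could be added as one more key in the returned dict.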