Visualization Style with appearance of Outliers

Many datasets contain subjects with outlier values. These tend to make the predicted age far too high or far too low. Because the age modelling graphs set their axis limits from the min and max over all ages, we sometimes end up with graphs like those attached:

[Attached figures: chronological_vs_pred_age_all_all-1, features_vs_age_controls_all]

This is both good and bad:

Good because it lets us see that there are outliers in the data. Bad because we can't see the non-outliers, which are what interest us.

Solutions: Ideally we would want to discard outliers. How can we do this? At a minimum we should report that some values are very far from the mean (maybe more than 3 SD?) and give the subject IDs so that users can remove them. Alternatively, we could set the axis ranges based on the original ages. However, this does not work for visualizing the relationships between features and age.
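A rough sketch of both ideas, to make the proposal concrete. Nothing here is existing ageml code: `flag_outliers`, `axis_limits`, the `n_sd` and `margin` parameters, and the subject IDs are all hypothetical names made up for illustration, and the 3 SD cutoff is just the value floated above.

```python
from statistics import mean, stdev


def flag_outliers(ids, values, n_sd=3.0):
    """Return (id, value, z) for values more than n_sd sample SDs from the mean.

    Hypothetical helper, not part of ageml.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return []  # all values identical, nothing to flag
    flagged = []
    for subject_id, value in zip(ids, values):
        z = (value - mu) / sd
        if abs(z) > n_sd:
            flagged.append((subject_id, value, z))
    return flagged


def axis_limits(original_ages, margin=0.05):
    """Axis limits taken from the original ages only, padded by a relative margin.

    Hypothetical helper, not part of ageml.
    """
    lo, hi = min(original_ages), max(original_ages)
    pad = (hi - lo) * margin
    return lo - pad, hi + pad


# Made-up data: 19 plausible predicted ages and one extreme prediction.
subject_ids = [f"sub-{i:02d}" for i in range(1, 21)]
original_ages = list(range(40, 60))
predicted_ages = list(range(40, 59)) + [250]

flagged = flag_outliers(subject_ids, predicted_ages)
for subject_id, value, z in flagged:
    print(f"{subject_id}: predicted age {value} is {z:.1f} SD from the mean")

limits = axis_limits(original_ages)
```

The limits returned by `axis_limits` could then be passed to the plot's x and y limits so that the extreme prediction no longer stretches the axes. One caveat: the mean and SD are themselves pulled by the outliers, so with few subjects a value may never exceed 3 SD. A median-based rule might be more robust, but that is a separate design choice.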

compneurobilbao/ageml, issue #60