TSFelg / fairly

Fairly is a tool to help tech workers residing in Portugal know if they're being paid fairly.

Improve model transparency #5

Open TSFelg opened 3 years ago

TSFelg commented 3 years ago

Fairly is only useful if users feel they can trust the results. This is especially important in corner cases where there are few data points in the training distribution and the model may not have learned the specific context as well. Currently, what should happen in these cases is that the prediction bands widen to reflect the uncertainty in the training data, either due to high variance or due to a low number of data points.

Although the above already helps users understand how confident the model is in its prediction (wider bands -> more uncertainty), it would still be useful to make the model more transparent. Some possible approaches are listed below:

Histogram Show the histogram for the specific input context below the distribution. This is the most straightforward approach. It's interesting, but I have my doubts it would work due to the small amount of data for several cases. In a way, it actually opposes the fundamental idea of the modelling, which is to trust the generalization capabilities of the model. With a histogram, information from two input contexts that differ in only one feature can't be leveraged from one to the other. Having said that, it should be interesting to test it.
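To make the idea concrete, a minimal sketch of the context histogram: filter the survey data to the rows that exactly match the user's input context and look at their salaries. The column names and data here are hypothetical, not Fairly's actual schema.

```python
import pandas as pd

# Hypothetical survey data; columns and values are illustrative only.
df = pd.DataFrame({
    "role": ["Data Scientist", "Data Scientist", "Backend Dev", "Data Scientist"],
    "experience": ["3-5", "3-5", "3-5", "6-9"],
    "salary": [35000, 42000, 38000, 55000],
})

def context_salaries(df, role, experience):
    """Salaries observed for one exact input context (the histogram's data)."""
    match = df[(df["role"] == role) & (df["experience"] == experience)]
    return match["salary"]

salaries = context_salaries(df, "Data Scientist", "3-5")
print(len(salaries))  # number of exact matches backing the prediction -> 2
```

This also makes the limitation obvious: an exact-match filter discards every row that differs in a single feature, which is precisely the information the model is able to generalize from.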

Embedding The idea here would be to learn a 2D embedding of the input space and allow the user to explore it. When hovering over a point, it would show the corresponding input context, with the color encoding the salary. Below is a quick proof of concept (image attached).

Basically, this would be an unsupervised alternative to the current approach. Since it lets the user explore the raw input space, it would be more transparent and would let them quickly find the most similar users to themselves and the corresponding salaries. I believe it can be an interesting auxiliary approach, especially for corner cases.
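A rough sketch of the embedding step, assuming the input contexts are already one-hot encoded (the encoding and t-SNE are my assumptions; any 2D projection such as UMAP would work the same way):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Hypothetical one-hot encoded input contexts (role, experience, region, ...).
X = rng.integers(0, 2, size=(30, 8)).astype(float)
salary = rng.uniform(20000, 80000, size=30)  # colour channel in the real plot

# Learn a 2D embedding of the input space; each point is one respondent.
emb = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(X)
print(emb.shape)  # (30, 2): x/y coordinates to plot, coloured by salary
```

The `emb` coordinates plus the raw context and salary per point are all the frontend needs for the hover-and-color interaction.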

Calibration Unlike the previous two, this one simply means continuing the current probabilistic modelling approach while looking more carefully into the model's calibration and how it differs across input contexts.
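One simple calibration check along these lines (a sketch, not Fairly's code): measure the empirical coverage of the prediction bands, i.e. the fraction of held-out salaries that actually fall inside their predicted interval. For a well-calibrated 90% band this should be close to 0.90, and it can be computed per input context to see where calibration degrades.

```python
import numpy as np

def interval_coverage(y_true, lower, upper):
    """Fraction of observations falling inside their predicted band."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

# Toy example: 4 of 5 salaries fall inside their predicted bands.
y  = [30000, 45000, 52000, 61000, 80000]
lo = [25000, 40000, 50000, 55000, 80500]
hi = [35000, 50000, 60000, 70000, 90000]
print(interval_coverage(y, lo, hi))  # 0.8
```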

chrismolli commented 3 years ago

I like the embedding idea. You could further introduce a distance metric to the closest n data points within the embedded projection. One could also check whether the "new" data point falls inside or outside the area bounded by the border points, to see whether the model is interpolating or extrapolating at inference time. Check out "chart.js" for some nice interactive plotting ;)
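Both diagnostics can be sketched with scipy (a rough illustration under my own assumptions, using a KD-tree for the nearest-neighbour distances and a convex hull test for the interpolation/extrapolation check):

```python
import numpy as np
from scipy.spatial import Delaunay, cKDTree

rng = np.random.default_rng(1)
points = rng.uniform(0, 1, size=(50, 2))  # hypothetical embedded training points

tree = cKDTree(points)   # fast nearest-neighbour queries
hull = Delaunay(points)  # find_simplex returns -1 for points outside the hull

def diagnose(new_point, k=5):
    """Mean distance to the k nearest embedded neighbours, plus whether the
    point lies inside the convex hull (interpolating) or outside (extrapolating)."""
    dists, _ = tree.query(new_point, k=k)
    inside = bool(hull.find_simplex(new_point) >= 0)
    return float(np.mean(dists)), inside

print(diagnose([0.5, 0.5]))  # small distances, likely inside the hull
print(diagnose([5.0, 5.0]))  # large distances, outside -> extrapolation
```

A large neighbour distance or an outside-the-hull result would be a natural trigger for widening the bands or warning the user.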