Open ricgu8086 opened 5 years ago
What prompts you to represent a continuous outcome/prediction ($ amount) in terms of a confusion matrix (meant for binary or categorical modeling tasks)? It seems to me the output of a confusion matrix with even tens of different categories represented would be difficult to understand, let alone potentially thousands of categories.
I assume you're trying to understand your model's performance across the entire dollar range, to see where there may be gaps. Have you tried a residual plot (i.e. plotting the predicted $ amount on the x-axis and the error on the y-axis)?
I suppose you could try binning your $ amounts to reduce the cardinality in the predictions/actual outcomes but that seems arbitrary and roundabout.
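If you did want to go the binning route, a rough sketch could look like the following. The bin edges, labels, and sample amounts here are made up for illustration; nothing about them comes from your data, and choosing the edges is exactly the arbitrary part.

```python
from bisect import bisect_right

# Hypothetical bin edges in dollars (assumption: pick whatever suits the data).
BIN_EDGES = [10, 100, 1000]
BIN_LABELS = ["<10", "[10, 100)", "[100, 1000)", ">=1000"]


def amount_to_bin(amount):
    """Map a dollar amount to the label of the bin it falls into."""
    # bisect_right returns how many edges are <= amount, which is the bin index.
    return BIN_LABELS[bisect_right(BIN_EDGES, amount)]


# Made-up actual and predicted amounts, only to show the mapping.
actual = [5.0, 42.0, 250.0, 1800.0]
predicted = [12.0, 40.0, 90.0, 2500.0]

actual_bins = [amount_to_bin(a) for a in actual]
predicted_bins = [amount_to_bin(p) for p in predicted]
```

The two lists of bin labels can then be treated as ordinary categorical ground truth and predictions.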
Hi,
I would like to ask whether there is a way to provide a precomputed confusion matrix and still use scikit-plot functions for visualization. I have a task where I want to plot two kinds of confusion matrix: one for the number of transactions and one for the amount of each transaction ($). The first case is pretty straightforward: I have the ground truth and the predictions, so it takes just a quick call to `plot_confusion_matrix` and voilà. The second case is not that easy, as some transactions can be on the order of $1000. If the dataset covers millions of dollars, I would need to create a huge array where each element is a single dollar together with its prediction and its ground truth. It is less cumbersome to compute the confusion matrix myself and plot it with a `seaborn.heatmap`, but then the appearance will not be consistent with the other plots.

Is this something that can be done? Or maybe it is an enhancement suggestion?
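To make the second case concrete, here is a minimal sketch of what computing the amount-weighted matrix by hand could look like. Each transaction adds its dollar amount to its (true, predicted) cell, so no per-dollar array is needed. The function name, labels, and sample data are hypothetical and only for illustration.

```python
def amount_weighted_confusion_matrix(y_true, y_pred, amounts, labels):
    """Sum transaction amounts per (true label, predicted label) cell.

    Rows are true labels and columns are predicted labels, in the order
    given by `labels`.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0.0 for _ in labels] for _ in labels]
    for true_label, pred_label, amount in zip(y_true, y_pred, amounts):
        matrix[index[true_label]][index[pred_label]] += amount
    return matrix


# Made-up example: five transactions with their dollar amounts.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
amounts = [1000.0, 250.0, 40.0, 60.0, 500.0]

cm_amounts = amount_weighted_confusion_matrix(y_true, y_pred, amounts, labels=[0, 1])
```

As far as I know, scikit-learn's `confusion_matrix` accepts a `sample_weight` argument that should give the same totals. Either way, the result is a precomputed matrix, which is what I would like to hand to the plotting function.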
Thanks