dayyass opened this issue 3 years ago
We already have the PrecisionRecallDisplay. One missing feature there would be the F1 isoline, as we plot it in the final figure in the following example:
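For reference, a minimal sketch of what such iso-F1 lines look like (illustrative only; the linked example is the canonical version): since f1 = 2 * p * r / (p + r), a fixed F1 value traces the curve p = f1 * r / (2 * r - f1).

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
for f1 in (0.2, 0.4, 0.6, 0.8):
    # An iso-F1 line satisfies f1 = 2 * p * r / (p + r); solving for p
    # gives p = f1 * r / (2 * r - f1), which stays within p <= 1
    # for r >= f1 / (2 - f1).
    recall = np.linspace(f1 / (2 - f1), 1, 200)
    precision = f1 * recall / (2 * recall - f1)
    ax.plot(recall, precision, linestyle="--", color="gray", alpha=0.6)
    ax.annotate(f"f1={f1:0.1f}", xy=(1.0, f1 / (2 - f1)))
ax.set(xlabel="Recall", ylabel="Precision", xlim=(0, 1.05), ylim=(0, 1.05))
plt.show()
```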
That is not the same, because my implementation puts probability thresholds on the x-axis.
Yes, but I am not sure this would be the best way to represent the precision-recall metrics: you don't get a sense of the average precision with this way of plotting.
You are right, but this plot is meant for choosing a probability threshold that makes the model better: it shows which threshold gives a certain trade-off between precision and recall.
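To make the proposal concrete, a minimal sketch of the kind of plot I mean, built on top of precision_recall_curve (the dataset and model below are placeholders):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.7], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
proba = LogisticRegression().fit(X_train, y_train).predict_proba(X_test)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_test, proba)
# precision_recall_curve returns one more precision/recall value than
# thresholds, so drop the last point (recall=0, precision=1).
precision, recall = precision[:-1], recall[:-1]
f1 = 2 * precision * recall / (precision + recall + 1e-12)

plt.plot(thresholds, precision, label="precision")
plt.plot(thresholds, recall, label="recall")
plt.plot(thresholds, f1, label="f1")
plt.xlabel("probability threshold")
plt.legend()
plt.show()
```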
It is true that we don't have any information regarding the threshold, and it would be nice to get it on the current types of Display (ROC curve and precision-recall curve). I could imagine displaying the thresholds as annotated markers on the curves. Since it could become crowded, we could accept an API that (i) does not display the markers, (ii) displays all thresholds, or (iii) displays only a given number of thresholds.
This would also motivate some example/analysis when introducing the meta-estimator developed in: https://github.com/scikit-learn/scikit-learn/pull/16525
So for the PR curve we keep recall on the x-axis and precision on the y-axis, but additionally plot the thresholds on the curve?
Somehow, with the following spirit:
But we should not plot all markers, and we need a nice way to place the annotations. We could even colour the annotations differently to indicate whether the threshold is above the one used for the argmax (e.g. 0.5 for predict_proba and 0 for decision_function).
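A rough sketch in that spirit, with only a few annotated markers (purely illustrative; the actual marker-selection API would need design):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
proba = LogisticRegression().fit(X_train, y_train).predict_proba(X_test)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_test, proba)
plt.plot(recall, precision)
# Annotate only a handful of evenly spaced thresholds to avoid clutter.
for i in np.linspace(0, len(thresholds) - 1, num=5, dtype=int):
    # Colour by whether the threshold is above the 0.5 default used by
    # predict when the classifier exposes predict_proba.
    color = "tab:red" if thresholds[i] > 0.5 else "tab:blue"
    plt.plot(recall[i], precision[i], "o", color=color)
    plt.annotate(f"{thresholds[i]:.2f}", xy=(recall[i], precision[i]))
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()
```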
This is a nice idea and I am in favor of implementing it, but I think this plot does not give a complete picture of how the metrics behave across probability thresholds, compared to the approach I suggested.
When there is a threshold to choose, lots of my friends and colleagues plot the threshold on the x-axis and the metrics (precision, recall, f1-score) on the y-axis.
@glemaitre, 1) could you send your implementation of the plot, so that I don't have to implement it from scratch? 2) could you review my PR with the implementation of my plot?
> could you send your implementation of the plot, so that I don't have to implement it from scratch?
It was a dirty hack to make the plot. I did not keep the changes.
> could you review my PR with the implementation of my plot?
I am not convinced that I would like this type of plotting in scikit-learn. I would prefer to improve the current plotting that we have. I would think that adding an option to compute the precision and recall gain as proposed in (http://people.cs.bris.ac.uk/~flach/PRGcurves/PRcurves_ISL.pdf) would be better.
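For reference, the gains in that paper rescale precision and recall by the positive-class prevalence pi; a minimal sketch of my reading of it (not an existing scikit-learn API):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def precision_recall_gain(y_true, y_score):
    """Precision/recall gain as in Flach & Kull (2015):
    gain(x) = (x - pi) / ((1 - pi) * x), with pi the positive prevalence."""
    pi = np.mean(y_true)
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    # Points with precision or recall below pi get a negative gain; the
    # divide can hit zero at the curve endpoints, hence the errstate guard.
    with np.errstate(divide="ignore", invalid="ignore"):
        prec_gain = (precision - pi) / ((1 - pi) * precision)
        rec_gain = (recall - pi) / ((1 - pi) * recall)
    return prec_gain, rec_gain, thresholds
```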
However, this is only my opinion so I would like to hear other core devs regarding the matter, notably @amueller that probably looked at these kinds of plots more than me. Maybe @adrinjalali @thomasjpfan @NicolasHug have some opinions as well.
I'm quite in favor of one or more plots which investigate the threshold. I'd agree with @dayyass that the two types of plots discussed here are very different in nature and in what they convey. I personally understand the proposed plot in the OP much better than a precision recall curve, and that's probably because I have never ended up dealing with precision recall curves much in my career, whereas I've worked on finding thresholds for my models in a few instances.
In terms of API, I'd be more happy with something like MetricThresholdCurve, which can accept different metrics (precision, recall, f1, ...) and plot them against the thresholds. We can then use it three times to achieve what the OP suggests.
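To make the suggestion concrete, one possible shape for such a helper (the name metric_threshold_curve and its signature are hypothetical):

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

def metric_threshold_curve(y_true, y_score, score_func, thresholds=None):
    """Hypothetical helper: evaluate a binary metric at each threshold."""
    if thresholds is None:
        thresholds = np.unique(y_score)
    values = np.array(
        [score_func(y_true, (y_score >= t).astype(int)) for t in thresholds]
    )
    return thresholds, values

# Used three times, this reproduces the plot from the OP:
# for func in (precision_score, recall_score, f1_score):
#     ts, vals = metric_threshold_curve(y_test, proba, func)
#     plt.plot(ts, vals, label=func.__name__)
```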
I agree with the proposed API - I will redo the current implementation.
Hi @dayyass, are you still working on this feature? I would love to do it if you don't mind!
Hi @vitaliset! I don't have enough time to do it right now, so sure you can do it by yourself, it would be great!
Hello @dayyass and @adrinjalali! I first implemented the "base" function to build the Display later. I would appreciate it if you could take a quick look at the PR to see if I got the idea right. :)
I didn't know about the existence of sklearn.model_selection.validation_curve. It may already solve this issue when combined with PR #16525, if the threshold can be passed as a parameter (which could be interpreted as a hyperparameter search), roughly as sketched below.
It would be less straightforward than using the curve implemented in #25639, but it has the tradeoff of less code to maintain. An example or user guide entry would be handy in this case. I don't know, maybe the separate curve is more intuitive to use, and we want to go forward with it.
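Assuming the meta-estimator exposes the decision threshold as a constructor parameter, the combination could look roughly like this (sketched with FixedThresholdClassifier, the fixed-threshold meta-estimator that later shipped in scikit-learn 1.5; the API in #16525 may differ):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import FixedThresholdClassifier, validation_curve

X, y = make_classification(n_samples=2000, random_state=0)
# Treat the decision threshold as a hyperparameter of the meta-estimator
# and let validation_curve sweep over it with cross-validation.
train_scores, test_scores = validation_curve(
    FixedThresholdClassifier(LogisticRegression()),
    X, y,
    param_name="threshold",
    param_range=np.linspace(0.1, 0.9, 9),
    scoring="f1",
)
```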
What do you think, @glemaitre? As you are building #16525, I think you have a broader view than I do on this.
@dayyass, thanks for your issue; I was looking for the same feature.
Would the following implementation solve your request?
@FernandoCarazoMelo, that looks quite similar to what I wanted to do.
Describe the workflow you want to enable
Working with binary classifiers, I often need, in addition to the PR curve and ROC curve, Precision / Recall / F1 (y-axis) as functions of the probability threshold (x-axis).
Describe your proposed solution
Describe alternatives you've considered, if relevant
No response
Additional context
No response