ageron / handson-ml

⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
Apache License 2.0

Chap3: How do I select a threshold from the precision and recall vs threshold curve? #473

Open FritzPeleke opened 5 years ago

FritzPeleke commented 5 years ago

Hi Ageron, I have another question, although the last one is still pending. When I look at the precision versus recall curve, I can see that if my focus is on precision, then reaching 90% precision means recall drops to 60%. How do I use this information on the precision and recall versus threshold curve to select a threshold for fine-tuning?

ageron commented 5 years ago

Hi @FritzPeleke, thanks for your question. The optimal precision/recall trade-off depends on your task. For example, if you're building an intruder detection system, you will want to catch as many intrusions as possible (high recall). In this case, you can lower the threshold a lot, which will increase recall (fewer false negatives, where an intruder gets in undetected) and generally decrease precision (more false positives, where the alarm goes off even though there's no intrusion). However, if you decrease the threshold too much, you will start to get many false positives, so you will have to decide how many false positives per day you can tolerate. If a false positive just means that a security guard needs to look at a screen for a few seconds, then perhaps you can tolerate several false positives a day. But if it means waking someone up and getting them to travel 20 km, then not so much. This is just an example: you can easily imagine other tasks where precision is more important than recall (e.g., selecting videos that are safe for children to watch). Hope this helps.
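
To make the selection step concrete, here is a minimal sketch in plain Python. It is not taken from the book's notebook, and the labels, scores, and helper names (`precision_recall_at`, `lowest_threshold_for_precision`, `highest_threshold_for_recall`) are all made up for illustration. The idea: once the task tells you the precision (or recall) you need, scan the candidate thresholds and keep the one that reaches that target while giving up as little of the other metric as possible.

```python
# Toy data, made up for illustration: 1 = positive class, 0 = negative class.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.65, 0.7, 0.8, 0.9]


def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    # Assumed convention: precision is 1.0 when nothing is flagged as positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def lowest_threshold_for_precision(y_true, scores, target):
    """Precision-first: lowest candidate threshold reaching the target precision.

    Recall can only fall as the threshold rises, so the lowest qualifying
    threshold keeps as much recall as possible.
    """
    for t in sorted(set(scores)):
        p, r = precision_recall_at(y_true, scores, t)
        if p >= target:
            return t, p, r
    return None  # no candidate threshold reaches the target


def highest_threshold_for_recall(y_true, scores, target):
    """Recall-first: highest candidate threshold that still reaches the target recall."""
    for t in sorted(set(scores), reverse=True):
        p, r = precision_recall_at(y_true, scores, t)
        if r >= target:
            return t, p, r
    return None  # no candidate threshold reaches the target


# Precision-first task (e.g. child-safe videos): require at least 90% precision.
print(lowest_threshold_for_precision(y_true, scores, 0.90))
# Recall-first task (e.g. intruder detection): require that no positive is missed.
print(highest_threshold_for_recall(y_true, scores, 1.0))
```

On real data you would feed in held-out or cross-validated decision scores rather than a hand-written list. scikit-learn's `precision_recall_curve` returns the precision, recall, and threshold arrays in one call, so the same scan can be done over its output.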