webcompat / webcompat.com

Source code for webcompat.com
https://webcompat.com

[ml] rate of issues that machine learning is handling #3236

Open karlcow opened 4 years ago

karlcow commented 4 years ago

Looking at the issues which are in needstriage, without action-needsmoderation, and anonymous, I only see a couple which have been handled by the machine learning bot.

We need to better understand on which occasions the ML process kicks in and whether it is supposed to label all issues or not, and whether what we see is normal or a series of misses.

(Screenshot: 2020-03-09 06:59:47)
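
As a rough way to quantify that rate, one could query the GitHub search API for needstriage issues with and without whatever label the bot applies. A minimal sketch, assuming the reports live in webcompat/web-bugs and using a placeholder label name (the exact label the bot sets isn't confirmed here):

```python
# Rough sketch only: count open needstriage issues and how many of them carry
# a label applied by the ML bot. The repo name and ML_BOT_LABEL are assumptions,
# not confirmed values from this thread.
import requests

REPO = "webcompat/web-bugs"     # assumption: the repository holding the reports
ML_BOT_LABEL = "ml-autoclosed"  # hypothetical label name, replace with the real one
API = "https://api.github.com/search/issues"

def count(query: str) -> int:
    """Return total_count for a GitHub issue search query."""
    resp = requests.get(API, params={"q": query, "per_page": 1})
    resp.raise_for_status()
    return resp.json()["total_count"]

triage_total = count(f"repo:{REPO} is:issue is:open label:needstriage")
ml_handled = count(f"repo:{REPO} is:issue is:open label:needstriage label:{ML_BOT_LABEL}")
print(f"{ml_handled}/{triage_total} needstriage issues carry the ML bot label")
```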
cipriansv commented 4 years ago

This morning we moderated 144 issues and the ML bot closed about 30 of them.


karlcow commented 4 years ago

@cipriansv how does it compare with before?

cipriansv commented 4 years ago

It took me about 45 minutes to moderate the issues, and by the time I was finished with the process, the ML bot had done its job.

From our point of view, it seems to be working fine, just as before the incident happened.

karlcow commented 4 years ago

From our point of view, it seems to be working fine, just as before the incident happened.

Two questions here:

• Why are all anonymous issues not being processed by the ML bot? (what's the criteria? @miketaylr)
• What was the rationale to not process issues which are non-anonymous?

I have the feeling that all issues should be processed by the ML engine, at least as an advisory mechanism, if we do not want to close non-anonymous issues automatically. But I'm asking because I don't know yet what the original expectations/constraints were.

miketaylr commented 4 years ago
  • Why are all anonymous issues not being processed by the ML bot? (what's the criteria? @miketaylr)

Can you clarify what you mean by not being processed, @karlcow? My understanding is that the bot classifies all issues, and only tags/closes issues that have a 95% confidence level of being invalid. Something with a confidence level lower than that will be left to humans.

(If we find it useful, we could lower that threshold to 90% and close more issues automatically.)
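
In other words, the behaviour described above is a single confidence threshold. A minimal sketch of that decision rule (not the actual webcompat-ml code):

```python
# Illustrative sketch of the thresholding behaviour described above; this is
# not the actual webcompat-ml implementation, just the decision rule it implies.
INVALID_CONFIDENCE_THRESHOLD = 0.95  # could be lowered to 0.90 as suggested

def handle_prediction(issue_id: int, p_invalid: float) -> str:
    """Decide what to do with an issue given the model's P(invalid)."""
    if p_invalid >= INVALID_CONFIDENCE_THRESHOLD:
        # High confidence the report is invalid: label and close automatically.
        return "close-as-invalid"
    # Anything below the threshold stays open and is left to human triage.
    return "leave-for-humans"
```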

What was the rationale to not process issues which are non-anonymous?

I'm not sure we have a strong rationale, it was just a decision that was made. I think the assumption was that anonymous reports tend to be less valid than authed reports, so let's start there. It would be interesting to run it for all reports, IMO.

karlcow commented 4 years ago

Can you clarify what you mean by not being processed, @karlcow? My understanding is that the bot classifies all issues, and only tags/closes issues that have a 95% confidence level of being invalid. Something with a confidence level lower than that will be left to humans.

@miketaylr in the current design I was not sure if an issue had been processed or not, but you confirmed that it is the case. I wonder if a label action-ml-done would be appropriate. Maybe not.
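
If we went that route, the bot could apply the label through the GitHub REST API once it has classified an issue. A rough sketch, with action-ml-done as the hypothetical label name and webcompat/web-bugs assumed as the reports repo:

```python
# Sketch only: add a hypothetical "action-ml-done" label to an issue once the
# classifier has looked at it. The label does not exist yet; the repo name and
# token handling are simplified assumptions.
import os
import requests

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
REPO = "webcompat/web-bugs"  # assumption: the repository holding the reports

def mark_ml_done(issue_number: int) -> None:
    """Add the (hypothetical) action-ml-done label to an issue."""
    url = f"https://api.github.com/repos/{REPO}/issues/{issue_number}/labels"
    resp = requests.post(
        url,
        headers={"Authorization": f"token {GITHUB_TOKEN}"},
        json={"labels": ["action-ml-done"]},
    )
    resp.raise_for_status()
```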

cipriansv commented 4 years ago

After monitoring the ML bot's activity, we noticed that lately it does not close some issues that are obviously invalid: they are scam or phishing sites and contain the words "scam" or "phishing" in the issue description.

Examples:

miketaylr commented 4 years ago

it does not close some issues that are obviously invalid,

This will probably depend on how the model was trained. It follows a pattern of what was invalid in the past (which may not have had those words or patterns inside of it). I think as we continue to re-train the model, it should get better in the future.

(This should be possible, we just don't really know how to do it right now. :))
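
For illustration only, the retraining idea amounts to periodically re-fitting the classifier on more recent triage decisions so that newer patterns (like scam or phishing reports) are represented in the training data. A generic sketch using scikit-learn, not the actual webcompat-ml pipeline:

```python
# Illustration of the retraining idea, not the webcompat-ml pipeline: fit a
# fresh text classifier on recent issue descriptions labelled by humans, so
# that new invalid patterns (e.g. "scam", "phishing") end up in the model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def retrain(descriptions: list[str], labels: list[int]):
    """Fit a model on issue descriptions labelled 1 = invalid, 0 = valid."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(descriptions, labels)
    return model

# Toy usage with made-up data:
model = retrain(
    [
        "this is a scam site",
        "phishing page stealing passwords",
        "layout is broken in firefox",
        "video does not play",
    ],
    [1, 1, 0, 0],
)
# Probability the model assigns to the "invalid" class for a new report.
print(model.predict_proba(["obvious phishing attempt"])[0][1])
```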