Charcoal-SE / SmokeDetector

Headless chatbot that detects spam and posts links to it in chatrooms for quick deletion.
https://metasmoke.erwaysoftware.com
Apache License 2.0
464 stars 175 forks

We've lost autoflagging accuracy lately #1779

Closed AWegnerGitHub closed 4 years ago

AWegnerGitHub commented 6 years ago

From the recent meta post:

Again, this is very rare: over the past year, we've flagged 66 posts that shouldn't have been, compared to 29 592 spam posts

This was posted 15 days ago.

A search, now, shows that there are 81 false positive autoflagged posts. Each of these has multiple flags.

Additionally, one of our users was recently flag banned on Stack Overflow for having a number of inaccurate autoflags.

I don't want to over-react to a single unlucky user, but going from 66 to 81 false positive autoflags in 15 days seems like a huge uptick. We should focus on that and how we can readjust to stop the uptick.
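To put the uptick in perspective, here is a rough back-of-the-envelope comparison of the two rates. It assumes the 66 figure covers a full 365-day year and that the search counts the same kind of posts as the meta post did; neither assumption is confirmed in this thread.

```python
# Rough rate comparison using the figures quoted above.
# Assumption: "the past year" means 365 days and both counts share the same scope.
baseline_false_positives = 66
baseline_days = 365

new_false_positives = 81 - baseline_false_positives  # added since the meta post
recent_days = 15

baseline_rate = baseline_false_positives / baseline_days  # false positives per day
recent_rate = new_false_positives / recent_days           # false positives per day
ratio = recent_rate / baseline_rate

print(f"baseline: {baseline_rate:.2f}/day, recent: {recent_rate:.2f}/day, ratio: {ratio:.1f}x")
```

Under those assumptions the recent rate is several times the yearly average, which is why this looks like more than noise.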

quartata commented 6 years ago

The fundamental problem with autoflagging is that it overvalues heavily correlated reasons, since it assumes every reason is independent. A great example of this is "Bad keyword with email in answer" and "Email in answer." If you already have "Bad keyword with email in answer", "Email in answer" gives almost no new information -- and yet, it nearly doubles the reason weight.

Dvorak proposed using a neural network to get an autoflagging score, which I have been trying to experiment with. While that might sound like overkill, I actually think it's a very good solution -- a small net (2 layers, tops) should be able to assign consistent weights to each reason and (more importantly) infer about combinations of reasons better. Unfortunately, coming up with an appropriate loss function for training such a net is somewhat difficult since we value precision (accuracy on what it catches) over recall (how many TPs it catches). I haven't worked on it in several weeks but I think there's definitely potential.
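One common way to bias training toward precision is to penalize false positives more heavily than false negatives in the loss. The sketch below is only an illustration of that idea in plain Python, not the loss used in the experiment described above; the function name and the `fp_weight` value are hypothetical.

```python
import math

def precision_weighted_bce(y_true, y_pred, fp_weight=10.0, eps=1e-7):
    """Binary cross-entropy where errors on non-spam (label 0) cost fp_weight times more.

    y_true: iterable of 0/1 labels (1 = spam).
    y_pred: iterable of predicted spam probabilities in [0, 1].
    fp_weight: hypothetical multiplier for the false-positive term.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + fp_weight * (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# A confident false positive costs far more than an equally confident false negative.
false_positive_loss = precision_weighted_bce([0], [0.9])
false_negative_loss = precision_weighted_bce([1], [0.1])
print(false_positive_loss, false_negative_loss)
```

Choosing the multiplier is itself a judgment call: too high and the net stops flagging anything, too low and it behaves like ordinary cross-entropy.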

quartata commented 6 years ago

I was building the net in Python using Keras, then exporting it to a TensorFlow graph for use with tensorflow.rb.

Undo1 commented 6 years ago

@quartata Is the code available somewhere? I'd like to take a look sometime.

normalhuman commented 6 years ago

Three false autoflags in 24 hours.

Two simple measures could be taken without waiting for a network to be trained (about which I'm mildly skeptical, given how sporadic false positives are):

  1. Group all reasons with "keyword": they should contribute the maximum of their weight, not the sum.
  2. Group all reasons with "website": they should contribute the maximum of their weight, not the sum.

By grouping I mean the autoflagging algorithm only; the separate reasons should still exist. The idea is not to count "bad keyword in title" + "bad keyword in body" twice, or "blacklisted website in body" + "pattern-matching website in body", etc.
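A minimal sketch of that grouping, for illustration only. The weights are made up, the substring matching on "keyword" and "website" is an assumption about how reasons would be grouped, and none of this reflects the actual autoflagging code.

```python
def group_of(reason):
    """Map a reason to its group; reasons outside both groups stand alone."""
    lowered = reason.lower()
    if "keyword" in lowered:
        return "keyword"
    if "website" in lowered:
        return "website"
    return reason

def summed_weight(reasons, weights):
    """Current behaviour as described in this thread: every reason adds its full weight."""
    return sum(weights[r] for r in reasons)

def grouped_weight(reasons, weights):
    """Proposed behaviour: each group contributes only its largest weight."""
    best = {}
    for r in reasons:
        g = group_of(r)
        best[g] = max(best.get(g, 0), weights[r])
    return sum(best.values())

# Hypothetical weights, chosen only to show the difference.
WEIGHTS = {
    "bad keyword in title": 90,
    "bad keyword in body": 80,
    "blacklisted website in body": 95,
    "pattern-matching website in body": 70,
}

reasons = list(WEIGHTS)
summed = summed_weight(reasons, WEIGHTS)
grouped = grouped_weight(reasons, WEIGHTS)
print(summed, grouped)
```

With these made-up numbers the four reasons count as two pieces of evidence instead of four, so the total drops from 335 to 185.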

RobCoding commented 5 years ago

I found an autoflag that I felt was inaccurate; while looking for duplicates, I found this open issue.

The MetaSmoke entry says (redacted):

f*ing read the question before jumping up on it with wrong answers. idiot

This was marked as "spam" on the site's flags, automatically cast on my behalf. I was concerned that it wasn't flagged correctly and thought R/A would be more accurate. I used the site's flag dialog to alter the flag, first by retracting the spam flag; that's when an odd bug reared its head.

Retracting the spam flag alone retracted an invisible R/A flag (there were two flags simultaneously, one not shown), which prevented me from changing the flag to R/A. I ended up using feedback in the chat room.

[Screenshot: screenshot_20190121-152950_firefox beta]

It's unusual (unintuitive) that there are multiple flags on the site, that the visible one doesn't seem the best choice, and that the other isn't visible. To reproduce the above manually (without MS's help), so that two flags appear to have been made, I would have had to flag R/A, withdraw it, (somehow erase that record), and then flag spam, subsequently withdrawing that, to end up with the screenshot above.

I can see (?) that part of the bug appears to be in the dialog, but my main question is about flagging rude posts as spam; indeed a Moderator came to the chatroom to question that last week, and we explained that they were also sent a notification (with a portion of the clarification missing).

My Bug report seems to be a duplicate of AWegnerGitHub's.

Undo1 commented 5 years ago

Gonna unravel some stuff there, @RobCoding:

I was concerned that it wasn't flagged correctly and thought R/A would be more accurate.

We try not to worry about the difference between spam and R/A. They are the same flag, with a single minuscule edge case that doesn't really matter much. Don't worry about it.

It's... hard... to argue in favor of writing an algorithm and making a bot spend time calculating which of the two flag types to use when they are, in actuality, the exact same flag. It's not even Coke vs. Pepsi, it's Pepsi vs. a second bottle of Pepsi that's using a different but still valid Pepsi logo.

Retracting the spam flag alone retracted an invisible R/A flag (there were two flags simultaneously, one not shown)

Yeah, this is a weird quirk of the SE system that we're all used to. There were never two flags from you active on the post; only the spam flag - but when you retracted it, the system just prevents you from casting another red flag of any kind. It's odd behavior with a misleading message, but that's all it is.

indeed a Moderator came to the chatroom to question that last week, and we explained that they were also sent a notification (with a portion of the clarification missing).

Can you dig up a link to that? I like to stay on top of how people are trying to nitpick spam/RA, and missed this one.

RobCoding commented 5 years ago

@Undo1 - Part 1: OK, I won't worry about it. Part 2: "It's odd behavior with a misleading message", yes I can appreciate that it's partly SE; if people behaved oddly and said misleading things others wouldn't enjoy that, similarly there's little joy in the case where a computer (or machine, like a car/airplane) does that.

Dev-on-prod, Understood.

Part 3: The conversation started Jan 9 2019 with @DavidPostill, Moderator on Super User:

DavidPostill - @SmokeDetector Regarding the flag you cast on this - it is not spam but has been deleted as Rude/Abusive. I'm not sure how else I'm supposed to respond to:

This post had 2 spam flag(s) cast on it by Charcoal members and has since been deleted, but was ultimately judged not to have been spam. Please review whether spam flags - and the penalty that comes with them - are appropriate for this post - you can let us know in chat.stackexchange.com/rooms/11540 if the flags were inappropriate. If you're wondering WTF this flag is, see charcoal-se.org/smokey/Auto-Mod-Flags for details

The conversation ends 3 hours later with the Moderator mostly satisfied:

Rob - @DavidPostill @Makyen & @JohnDvorak - I agree that the first line of that text would be better if it is: "This post had 2 R/A or spam flag(s) cast on it by Charcoal members ..." - since it's difficult to agree that the post in question is spam. Saying it's spam in more unclear cases could lead to a lot of headscratching, in that case it's clear enough that it's not spam but is undesirable for the reasons we have discussed here. Our automatic messages ought to be clear and helpful. --- Glad all David's concerns (except possibly editing the automatic response) were addressed in my absence and we have another satisfied customer. :) afk

ArtOfCode- commented 5 years ago

@RobCoding Changing the wording of that message is dead easy - I'll go make the change.

RobCoding commented 5 years ago

Not to nitpick but we don't want to leave the message unclear.

The next line says:

'not to have been spam. Please review whether spam flags - and the penalty that comes with them - are appropriate for this '\

The next line also needs to be reworded to explain that only spam flags are used but the post could have been R/A or spam, and regardless only a single flag is used (so don't reject the flags). That is what @AWegnerGitHub and @DavidPostill are concerned about, and is one part of this.

One possible rewording would be:

'not to have been spam. Only spam flags are used, regardless of whether the post is automatically flagged R/A or spam. Please review whether spam flags - and the penalty that comes with them - are appropriate for this '\

Part of the reason, as shown by the example "f*ing read the question before jumping up on it with wrong answers. idiot" (and other examples I can edit in when I'm not due somewhere), is that an edit might be made to improve the answer. Example here: Repairable Offensive Posts. So we don't want rejections and user bans. GTG

DavidPostill commented 5 years ago

@RobCoding "f*ing read the question before jumping up on it with wrong answers. idiot" is not repairable, is against the ToC and should be deleted. If a first time user on SuperUser posted that I would immediately remove his account, in addition to checking it's not one of our known trolls.

The penalty attached to a spam flag doesn't make any difference, other than that it allows a script I run to make it easier to nuke the luser.

RobCoding commented 5 years ago

In reply to a comment that came in while writing this, a repair to that comment is to simply remove the rudeness leaving only:

"read the question before jumping up on it with wrong answers"

Now, instead of being "spam" or "R/A", it is "naa"; the flag could then be rejected because it was cast with too great a priority and the person reviewing missed that the post was edited. Quite frequently posts are edited after flagging, which invalidates the flags. Along with ensuring that David is satisfied, we also need to ensure that our explanation is suitable for the foreign language sites we work with, assuming a fair fluency in English.


@K-Davis1 complains about getting too many rejections on SO (where, sometimes, they are picky). Though the cause in that case isn't this PR, those rejections add to any garnered here.

I haven't enough time to search out the conversation, and I am unable to find the example in the time available, but a couple of months ago @makyen left a flag and the post was then edited by a high reputation user, invalidating his flag. I managed to spot this and sent him a ping, which he appreciated; he was able to withdraw his flag and reflag, avoiding a rejection.

stale[bot] commented 4 years ago

This issue has been closed because it has had no recent activity. If this is still important, please add another comment and find someone with write permissions to reopen the issue. Thank you for your contributions.