Option to configure filtering strictness

bmaupin commented 11 months ago

First of all, thanks for this! This is a really amazing extension. I came here from https://github.com/nsfw-filter/nsfw-filter which apparently no longer works for Firefox.

As a Firefox user, I haven't used nsfw-filter, but I can see it has exactly what I'm looking for: some kind of configurable filter strictness:

[Screenshot: nsfw-filter filter strictness setting]

When I first enabled this plugin it didn't take me long before I wanted stricter filtering on some sites. I figured it would be as simple as going to the settings. Unfortunately I see the default setting is

Automatic - Automatically change zones based on how many unsafe images have been detected recently

That's problematic because it means

I have to wait for unsafe images to be detected until the zone is changed
"recently" implies that this may change

So the next thing I looked for is some way to blacklist certain sites. Unfortunately that doesn't appear to have been implemented yet (#138).

So it appears my only option at the moment is to set the global zone to Untrusted. This works, but it's way too aggressive. Given that every blocked image has a number, it seems like it should be possible to tweak the filter by that number.

It would also be nice to have a way to submit false positives and false negatives. I guess that's covered by #40?

Lastly, I'm a developer and could potentially contribute to this issue or #138. Would PRs be accepted? If so, any information to point me in the right direction as to how to get started on this would be appreciated (although not necessary).

Thanks!

wingman-jr-addon commented 11 months ago

Hello @bmaupin ! You touch on a lot of things in the issue, with the core being around the ability to set the filter strictness. Let me address a couple of the smaller things and then return to that as it is a bit longer discussion.

Regarding blacklisting: yes, that is not implemented. I see you had visited the right issue, #184, so I believe that's the one you meant to link in your comment rather than #138, in case anybody else is looking. There is actually a static whitelist now over here: https://github.com/wingman-jr-addon/wingman_jr/blob/d8035027a0a4db5b4b73bb75cd513bc20ebbc8d2/whitelist.js#L1

With respect to PRs: yes, but something like this may get a bit interesting as it propagates out to the processor(s). There is also currently no UI or build framework, and I'm quite stingy about dependencies as this works better for the submission process. If you have an idea in mind, let's maybe continue the discussion over on #184 before trying to work on something.

Regarding submitting feedback, yes you are correct that's #40 .

Now to return to the main question about slider sensitivity. Is it possible? Yes, it certainly is. However, it's not the out-of-the-box solution for a couple of reasons.

The first pertains to the model design. To understand the situation a bit better, I'd recommend taking a peek at the model design discussion over here if you have not already: https://github.com/wingman-jr-addon/model/tree/master/sqrxr_112 In summary, the model is trained in two phases. First, a classifier is trained with a couple of extra losses to help get the right behavior. Second, the top layer is stripped off and the image features are fed into a binary classifier for "safe" vs. "not safe", optimizing against the ROC AUC. Finally, the three zones are selected based on the ROC curve, which brings me to the important point: the three spots chosen represent sweet spots on the curve. You could probably create a couple more gradations between "trusted" and "untrusted". Check out roc.js.
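
To make the idea of zones as operating points concrete, here is a minimal sketch. It assumes each zone is simply a cutoff on the model's "unsafe" score; the threshold values, `ZONE_THRESHOLDS`, `shouldBlock`, and `interpolateThresholds` are all hypothetical and are not taken from roc.js. Linear interpolation is also only a stand-in: real gradations would more likely be picked from points on the ROC curve itself.

```javascript
// Hypothetical operating points: a zone blocks an image when the model's
// "unsafe" score meets or exceeds its threshold. These numbers are made up
// for illustration and are NOT the values in roc.js.
const ZONE_THRESHOLDS = {
  trusted: 0.9,
  neutral: 0.7,
  untrusted: 0.3,
};

// Decide whether to block an image given its score and the active zone.
function shouldBlock(score, zone) {
  const threshold = ZONE_THRESHOLDS[zone];
  if (threshold === undefined) {
    throw new Error(`Unknown zone: ${zone}`);
  }
  return score >= threshold;
}

// Build extra gradations between two thresholds by linear interpolation.
// `steps` is the number of intermediate levels to create.
function interpolateThresholds(from, to, steps) {
  const result = [];
  for (let i = 1; i <= steps; i++) {
    result.push(from + ((to - from) * i) / (steps + 1));
  }
  return result;
}
```

For example, `interpolateThresholds(ZONE_THRESHOLDS.neutral, ZONE_THRESHOLDS.untrusted, 1)` yields a single cutoff roughly halfway between the two zones.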

The second is related to the psychology of the tool a bit. I think you could up the number of zones to, say, five or maybe even seven, but beyond that I think the meaning starts to get a bit lost and you'd have to think continuous, like a slider. But what happens then? Well, some folks such as yourself may enjoy the freedom. However, the presentation of too much choice may encourage tinkering. Now in general, I like tinkering, but in this case the tinkering may encourage folks to "play around" with the settings on this point and visit increasingly risky sites to check that it's working. Some of this may be inherent to the addon, but I try to minimize the amount that folks are interacting with the type of content they are trying to avoid.

So what do you think might help? As you can see I'm a bit wary of having too much configurability.

Also, you may be interested in bkSetAllLogging(true) if you haven't played with that yet.

Thanks for taking the time to leave feedback!

bmaupin commented 11 months ago

I try to minimize the amount that folks are interacting with the type of content they are trying to avoid.

That makes sense.

It feels like there are so many ways to tackle this that it's hard to know what the best approach should be.

Maybe a better way to approach this would be to describe my experience and ask you what the best approach should be to get better usage of this plugin.

I feel like the default "Automatic" behaviour is what I would want. But it doesn't seem to work very well for me, because I've never seen it actually do anything. Maybe I'm just not waiting long enough? I tried switching back to Automatic and one site says 25/2069 images blocked. I would prefer it to be more strict on that site, but besides just waiting for something to happen I don't know what else to do.

So one approach might be to somehow improve the behaviour of Automatic, or at least make it more clear how it works. Does it have to block a certain number of images to take effect? Or is it a ratio? The description says it's based on images that have been detected "recently"; how often is that? And what's the default zone used? I can hide images manually but that doesn't seem to increase the count or show me the blocked image icon afterward; is that supposed to be integrated with the functionality of Automatic?

But then switching from Automatic to Untrusted gets me a ton of false positives. Is there any measurement as to how accurate the model is? nsfw-filter, by comparison, has several different models to choose from, and the nsfwjs library it uses shows the accuracy of those models. Would another approach be to build this plugin on nsfwjs instead? That seems pretty drastic 😅

If changing the underlying model isn't an option, how could the existing model be improved? There needs to be a way to contribute false positives/negatives (#40), but of course that presents its own challenges.

But maybe the easiest approach would just be to implement a whitelist/blacklist (#184). I think that would probably work for me; I would probably leave the default Automatic setting as-is in that case (although it would still be nice to know how it worked 😄) and then just blacklist sites on a case-by-case basis.
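
As a rough illustration of what case-by-case blacklisting could look like, here is a minimal sketch of a per-site zone override. Everything in it is an assumption: `SITE_ZONES`, `zoneForUrl`, and the example hostnames are hypothetical and do not reflect how whitelist.js or the addon actually store sites.

```javascript
// Hypothetical per-site overrides: hostname -> zone.
// This is not the addon's data model, only an illustration.
const SITE_ZONES = new Map([
  ["example.org", "trusted"],
  ["images.example.net", "untrusted"],
]);

// Return the override for a URL, matching the hostname or any parent
// domain, and fall back to the given default zone otherwise.
function zoneForUrl(url, defaultZone) {
  let host = new URL(url).hostname;
  while (host) {
    if (SITE_ZONES.has(host)) {
      return SITE_ZONES.get(host);
    }
    const dot = host.indexOf(".");
    if (dot === -1) {
      break;
    }
    host = host.slice(dot + 1); // drop the leftmost label and retry
  }
  return defaultZone;
}
```

With this shape, the Automatic setting could stay as the default while a handful of sites are pinned to a stricter or looser zone.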

What do you think?

Thanks!

wingman-jr-addon commented 11 months ago

Once again, lots of things here:

Regarding Automatic - it's doing its job; what it helps do is clamp down relatively quickly if you stumble on an extra bad site
Regarding how to know what Automatic is doing - good question. I don't think simply explaining it will be satisfying to most users. While I'm hesitant to provide sliders for cutoffs, I do think a "temperature" gauge of some sort to show what Automatic is doing could be helpful. Thoughts?
Regarding accuracy: I encourage you to read through the model SQRXR 112 details. I believe for this particular use case, simply looking at accuracy is actually gaming the system fairly badly as the population of images is so diverse. While accuracy is what we usually gravitate towards, we don't treat false positives and false negatives with the same weight, meaning accuracy is in a certain sense always the wrong measure.
Regarding model improvement: absolutely, but as noted feedback is a hard problem.
Regarding blacklist/whitelist - maybe comment on how you think this could best be implemented over on #184
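
To illustrate the accuracy point above with numbers, here is a small sketch. The image counts and the 20:1 weighting are invented for illustration and are not measurements of the model; `accuracy` and `weightedCost` are hypothetical helper names.

```javascript
// Fraction of images classified correctly.
function accuracy({ tp, fp, tn, fn }) {
  return (tp + tn) / (tp + fp + tn + fn);
}

// Error cost when false positives and false negatives are weighted differently.
function weightedCost({ fp, fn }, fpWeight, fnWeight) {
  return fp * fpWeight + fn * fnWeight;
}

// Invented population: 1000 images, of which 10 are unsafe.
const blocksNothing = { tp: 0, fp: 0, tn: 990, fn: 10 };
const cautious = { tp: 9, fp: 40, tn: 950, fn: 1 };

console.log(accuracy(blocksNothing)); // 0.99
console.log(accuracy(cautious)); // 0.959

// Assume (hypothetically) a missed unsafe image is 20x worse than a false alarm.
console.log(weightedCost(blocksNothing, 1, 20)); // 200
console.log(weightedCost(cautious, 1, 20)); // 60
```

The filter that blocks nothing wins on accuracy yet misses every unsafe image, which is why accuracy alone says little when unsafe images are rare and the two kinds of error are not equally costly.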

bmaupin commented 11 months ago

First of all, thanks for your patience! I am slow to process things, so I'm still trying to figure out what it is I'm actually looking for.

While I'm hesitant to provide sliders for cutoffs, I do think a "temperature" gauge of some sort to show what Automatic is doing could be helpful. Thoughts?

For my particular use case I don't think I would get much benefit from this. But it might be helpful for others if it's not too much effort.

I think whitelist/blacklist will definitely be a huge improvement, so I'll move conversation related to that to #184.

I think improving the model would be a huge win for everyone, but that doesn't seem like something that's easily done so I'll let that go.

Beyond that, I have 3 ideas:

Idea 1

I think a simple change to the Automatic description could help people like me where the current description just creates more questions. Even your description above helped me understand it better:

Regarding Automatic - it's doing its job; what it helps do is clamp down relatively quickly if you stumble on an extra bad site

So based on that, I came up with this:

Automatically change zones if a large number of unsafe images are detected in a very short amount of time

Or as an alternative, I just appended your comment to the existing description:

Automatically change zones based on how many unsafe images have been detected recently. The purpose is to clamp down relatively quickly if you stumble on an extra bad site.

Idea 2

Right now Automatic is all-or-nothing; either I can get that behaviour or I can choose a zone. I can't choose a zone and still get the automatic behaviour.

Maybe it would be nice if these two concepts could be split? So for example, Automatic would be an on/off choice. And beyond that, the user could choose the zone. With Automatic off, the current behaviour doesn't change. With Automatic on, it would keep the current behaviour with the only difference being that the starting zone would be the zone the user selected.
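
Here is a minimal sketch of what that split could look like, assuming Automatic works by counting blocked images in a recent time window. The thread does not describe the actual algorithm, so `AutoZone`, `windowMs`, and `escalateAt` are hypothetical names and values, not the addon's implementation.

```javascript
// Zones ordered from least to most strict.
const ZONES = ["trusted", "neutral", "untrusted"];

class AutoZone {
  // baseZone:   user-selected starting zone
  // windowMs:   how far back "recently" reaches (hypothetical)
  // escalateAt: blocked images per window needed to move up one zone (hypothetical)
  constructor(baseZone, windowMs = 60000, escalateAt = 5) {
    this.baseIndex = ZONES.indexOf(baseZone);
    if (this.baseIndex === -1) {
      throw new Error(`Unknown zone: ${baseZone}`);
    }
    this.windowMs = windowMs;
    this.escalateAt = escalateAt;
    this.blockedAt = []; // timestamps (ms) of recently blocked images
  }

  // Record that an image was blocked at time nowMs.
  recordBlocked(nowMs) {
    this.blockedAt.push(nowMs);
  }

  // Zone in effect at nowMs: the base zone plus one step per `escalateAt`
  // blocked images still inside the window, capped at the strictest zone.
  currentZone(nowMs) {
    this.blockedAt = this.blockedAt.filter((t) => nowMs - t <= this.windowMs);
    const steps = Math.floor(this.blockedAt.length / this.escalateAt);
    const index = Math.min(this.baseIndex + steps, ZONES.length - 1);
    return ZONES[index];
  }
}
```

In this sketch, turning Automatic off would amount to always using the base zone, while turning it on lets the zone tighten temporarily and relax back to the user's choice once the window empties.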

Idea 3

I originally opened this issue because I felt there was a huge gap between Neutral and Untrusted zones. You'd mentioned adding more gradations. It feels like that would help out quite a bit, at least in my case.

I understand not wanting to encourage users to tweak, but given that the definition of "undesired" is going to naturally vary from person to person, I think some initial tweaking will be normal. But while initial tweaking might be helpful, ongoing tweaking would not. Having these options buried in the settings (as is currently the case) as opposed to being in a quick menu (like the nsfw-filter plugin) might naturally discourage users from going back and tweaking them later.

Having more gradations between Neutral and Untrusted feels like it would be the biggest win out of the three ideas for me since it's the reason I originally opened this issue.

Having said all of that, maybe as you suggested I should play with the values in roc.js first. I did go and read the model documentation you referenced. While most of it is admittedly over my head, I did get the sense that by lowering the number of false positives I'd simply be increasing the number of false negatives, in which case I'd probably rather live with false positives than the opposite.
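
To check my understanding of that trade-off, here is a toy sketch that sweeps a cutoff over a handful of labeled scores. The scores and labels in `SAMPLES` are invented and do not come from the model, and `countErrors` is a hypothetical helper.

```javascript
// Invented model scores with ground-truth labels, for illustration only.
const SAMPLES = [
  { score: 0.05, unsafe: false },
  { score: 0.2, unsafe: false },
  { score: 0.35, unsafe: false },
  { score: 0.4, unsafe: true },
  { score: 0.55, unsafe: false },
  { score: 0.6, unsafe: true },
  { score: 0.8, unsafe: true },
  { score: 0.95, unsafe: true },
];

// Count both kinds of error when blocking every image scoring at or above threshold.
function countErrors(samples, threshold) {
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const { score, unsafe } of samples) {
    const blocked = score >= threshold;
    if (blocked && !unsafe) falsePositives++;
    if (!blocked && unsafe) falseNegatives++;
  }
  return { falsePositives, falseNegatives };
}

for (const threshold of [0.3, 0.5, 0.7]) {
  console.log(threshold, countErrors(SAMPLES, threshold));
}
```

On this toy data, raising the cutoff from 0.3 to 0.7 takes false positives from 2 down to 0 while false negatives go from 0 up to 2, which matches the impression that one is traded for the other.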

Given that the number of sites that I find problematic and that I also have to interact with on a regular basis is relatively small, the current behaviour plus blacklist/whitelist functionality might be enough for me.

Thanks!

bmaupin commented 10 months ago

I finally got around to testing nsfw-filter for myself. It was very nice to have a slider for filtering strictness, but the model seems to be trained differently. While I saw barely any false positives even when setting the filtering strictness to 100%, it also caught less than Wingman Jr's untrusted setting.

That further reinforces my impression that reducing false positives would likely only serve to increase false negatives.

I still think there could be some room for improvement, and in particular it would be nice to have more gradations between neutral and untrusted. Just the other day, this extension was blocking some map tiles in Google Maps 😆 Untrusted has ended up being so extreme that I've had to stop using it. But I think the biggest payoff will probably be whitelist/blacklist functionality.

If you are open to any of my ideas above I'd be more than happy to work on pull requests as I'm able.

Thanks!

wingman-jr-addon commented 10 months ago

Say, thanks for the feedback on the user experience, including how it compares to nsfw-filter. Yes, the model is trained quite differently, with the focus being on categories. If you'd like to discuss the model itself, I'm glad to chat, but it's probably best to post in the relevant discussion over at https://github.com/wingman-jr-addon/model. I agree there is definitely room for improvement, and particularly on untrusted you have a false positive rate of nearly 10% - it's not good to use for normal browsing.

Regarding the whitelist/blacklist functionality, I see you've commented over on #184 again so I'll go pick up the conversation there.

Returning a bit closer to the original topic of discussion, I've been mulling over something you mentioned earlier: the switching between zones. I think this is directly tied into the strictness feature, and as you noted doesn't have particularly good visibility. Particularly in your case it sounded like you had expected it to transition more quickly between the zones. What do you think would be a good way to give visibility to and provide a mechanism for adjusting this transition sensitivity?

bmaupin commented 10 months ago

Regarding the automatic zone switching, I think I misunderstood what it's for and so my expectations didn't match the functionality. What you said above makes perfect sense:

what it helps do is clamp down relatively quickly if you stumble on an extra bad site

Before that, I was expecting that if the filter blocked a certain number of images on a site, even over an extended period of time, that it would change the site's zone. But I see now it isn't meant for that.

As far as visibility is concerned, it looks like there's a popover on the extension icon that will show the number of images blocked out of the total. Maybe showing which zone the site is in would make it more clear the automatic functionality is working?

But as I mentioned above, simply updating the description in the preferences would have helped me set my expectations.

I have other ideas that would make the automatic behaviour more useful to me personally, but I'm not sure how helpful they'd be to others, and as you mentioned they would probably encourage tinkering.

... having said that, one idea I had would improve both the visibility and the configurability of the automatic behaviour. Even now it's not clear to me what the default zone of the automatic behaviour is. If this were a dropdown, then not only would it be clear what the default zone is, it would also allow users to change it. But again, it may not be worth the effort for this one small use case when I suspect whitelist behaviour should get me what I'm looking for in the end.

bmaupin commented 10 months ago

While I was working on a POC for the whitelist, I noticed the coloured app icons in one of the directories. I've never seen those before because Firefox hid the app icon by default and I never bothered to show it (also I prefer a less cluttered UX). But that piqued my curiosity and I now see that the extension does indeed give immediate feedback regarding zones when automatic is set.

Given that behaviour, I'd say visibility is already pretty good as-is. It seems to be clear just from using it to get a sense of how it works, provided the app icon isn't hidden 😅

I can see now I was missing a major part of the UI and based on that, I think most of my suggestions are no longer valid. Sorry for the confusion.

wingman-jr-addon commented 10 months ago

No worries! I started developing this back when they showed all the icons by default and then they changed to the current behavior. Hide-by-default is NOT my favorite choice; I'd prefer to opt-in to the hiding for the exact reason you've encountered.
