Closed graphiclunarkid closed 10 years ago
So this is from a drop-down menu for people to choose from when they submit a site? Maybe also: News, Social Media
Will this produce useful info? Wouldn't it be better to specify categories that might help us understand why filters are being misapplied to them? E.g.:
Blog (lots of these get blocked), Sex education, Teenage, Forum, Erotica (non-porn), Alcohol, Tobacco, Campaign
On 22 May 2014, at 16:08, RuthC notifications@github.com wrote:
So this is from a drop menu for people to choose from when they submit a site? Maybe also: News Social Media
But doesn't that make an assumption about why the site is blocked, one that might be easy for the filtering companies to defend themselves against? I mean, if a site is 'alcohol' then they can say 'yes, it's for over 18s'. I don't think we should be using the same categories as the filtering companies, because that inherently justifies categorising sites like that. We need to frame these sites in a different way, not just using their language. Tobacco is 'business', for instance.
I think for the purposes of presenting the stats it's going to be much more powerful to say that x businesses were censored, as opposed to x alcohol-related sites were banned.
I agree with @RuthC and @webal about the nature of the categories but I think we need a few more. Some practical examples from this week that I found difficult to categorise with the existing list:
Maybe to the existing list we could add:
And possibly "other" with a free-text box for people to enter their own?
I guess there will always be some overlap, we could allow people to select multiple categories, but I don't know if that would make things more or less clear :/
Is it possible for the blocked.org.uk to suggest a category for the URL?
On 22 May 2014, at 16:25, webal notifications@github.com wrote:
I guess there will always be some overlap, we could allow people to select multiple categories, but I don't know if that would make things more or less clear :/
OK, assuming that, as @RuthC and @webal seem to think, this is to produce statistics about what incorrect blocks affect, then I think a number of the suggestions I made are very important. I don't think this is about framing; this is about documenting harm as accurately as possible.
For instance, I found it very useful to point out that alcohol-related sites are being blocked: nobody seriously thinks teenagers might visit a pub as the result of the presence of a website. Nor are they drinking over the Internet.
Sex education blocks are very important to know about. It is vital to know if campaign sites are blocked. Blocking teenage sites would harm those who are supposed to be helped. Forums are disgracefully blocked, for no good reason, and blocking them is a "community harm".
So most of the categories, to me, should be there and don't cause a framing problem. Tobacco is a potential exception, and I'm happy to lose it. So just to resuggest:
Blog (lots of these get blocked), Sex education, Teenage, Forum, Erotica (non-porn), Alcohol, Tobacco, Campaign
One other suggestion: this is something where Javier ought to be asked, in case he has a particular need for data.
Should there be an option to choose 'no reason to censor this' to highlight sites that should not have been blocked? This would allow ORG to keep a list of collateral damage that could be used to show the harm to people/companies.
I think that the categories we have kind of fall under two headings: the type of content owner (blog, personal/business/charity/govt) and the content type (alcohol, porn, sex ed, etc). Perhaps having inputs for the two sets of categorisation might clarify this.
We could easily have multiple checkboxes to allow the users to 'tick all that apply' which may mean better categorisation, but might make analysis a little trickier as sites could overlap categories.
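To make the analysis concern concrete, here is a minimal sketch of counting 'tick all that apply' submissions. The URLs and the data layout are made up for illustration; nothing here reflects how the site actually stores submissions.

```python
from collections import Counter

# Hypothetical submissions: each URL maps to the set of categories
# the submitter ticked. These are illustrative values only.
submissions = {
    "example-pub.test": {"Business", "Alcohol"},
    "example-clinic.test": {"Charity", "Sex education"},
    "example-forum.test": {"Forum"},
}


def count_by_category(subs):
    """Count how many sites were tagged with each category."""
    counts = Counter()
    for categories in subs.values():
        counts.update(categories)
    return counts


counts = count_by_category(submissions)
total_sites = len(submissions)
total_tags = sum(counts.values())

# With multi-select, the per-category counts add up to more than the
# number of sites, so percentages must be reported per category
# rather than as shares of a single total.
print(total_sites, total_tags)
```

So the overlap is manageable as long as we report "x sites tagged Business" and never add the category counts together.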
Yes, two questions is certainly an idea (i.e., who are you; what content do you have); but "no reason to censor" is very subjective. I don't see that alcohol should be censored, and sex education certainly shouldn't be, except maybe for u12s (the filters are generally u18). Porn isn't really an incorrect block, and I would have thought it is unlikely to be reported to us.
There might be some benefit in asking submitters to say whether they consider results to be under- or overblocked. @JimKillock is right to point out this is subjective. However, if one of our aims is to argue that distinguishing between "good" and "bad" websites is impossible, I think receiving a range of opinions about controversial URLs (or categories) would serve to illustrate that point. Sites that divide opinion might also prove worthy of closer examination, so this would be one way of discovering them.
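As a rough sketch of how sites that divide opinion could be surfaced, assuming each report carries a verdict of "overblocked" or "underblocked" (the field values, the sample URLs and the 25% threshold are all assumptions for illustration, not part of the current form):

```python
from collections import Counter, defaultdict

# Hypothetical reports as (url, verdict) pairs.
reports = [
    ("a.test", "overblocked"),
    ("a.test", "overblocked"),
    ("a.test", "underblocked"),
    ("b.test", "overblocked"),
    ("b.test", "overblocked"),
]


def divided_urls(reports, threshold=0.25):
    """Return URLs where the minority verdict makes up at least
    `threshold` of all verdicts received for that URL."""
    tallies = defaultdict(Counter)
    for url, verdict in reports:
        tallies[url][verdict] += 1

    divided = []
    for url, tally in tallies.items():
        total = sum(tally.values())
        minority = min(tally["overblocked"], tally["underblocked"])
        if total and minority / total >= threshold:
            divided.append(url)
    return sorted(divided)


print(divided_urls(reports))
```

Here a.test has one dissenting verdict out of three and is flagged, while b.test has unanimous verdicts and is not.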
@jimkillock We need to arrive at a decision swiftly so that any changes can be implemented in time for launch. I know Javier is very busy at the moment, but perhaps you could speak with him and then write up whatever you agree, so we can move forward?
Alternatively the three of us could talk by Skype and I can then write it up afterwards. I'm mostly free for the rest of the week.
Use two categories, as Javier suggested. Use one to categorise who they are, as per the present set up, and one more to categorise the kind of content (as above) so we get an idea why the site might be blocked.
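A minimal sketch of what the two-field categorisation might look like as a data structure. The option lists are placeholders borrowed from suggestions earlier in this thread, and the class and field names are hypothetical; none of this is the agreed list or the site's actual code.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder option lists (assumptions, taken from suggestions in this thread).
OWNER_TYPES = {"Personal", "Business", "Charity", "Government"}
CONTENT_TYPES = {
    "Blog", "Sex education", "Teenage", "Forum",
    "Erotica (non-porn)", "Alcohol", "Tobacco", "Campaign",
}


@dataclass
class Submission:
    """One submitted URL with two independent, optional categories."""
    url: str
    owner_type: Optional[str] = None    # who runs the site
    content_type: Optional[str] = None  # what kind of content it carries

    def __post_init__(self):
        # Reject values that are not in the corresponding option list.
        if self.owner_type is not None and self.owner_type not in OWNER_TYPES:
            raise ValueError(f"unknown owner type: {self.owner_type!r}")
        if self.content_type is not None and self.content_type not in CONTENT_TYPES:
            raise ValueError(f"unknown content type: {self.content_type!r}")


pub = Submission("https://example-pub.test", "Business", "Alcohol")
print(pub.owner_type, pub.content_type)
```

Keeping the two fields separate means the same submission can be counted as "a business" in one report and as "alcohol-related" in another, which covers both of the uses discussed above.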
I have one more request for the form which I will raise on another ticket.
OpenDNS has a pretty good list of categories that could be used to categorise both under- and overblocking. There are a couple of omissions that I would add: personal sites (static homepages, not blogs); and education (wider than "educational institutions", to include how-to sites, community learning, code clubs, etc.)
(Aside: OpenDNS don't seem ready to license this data for reuse or to expose an API yet: their terms of service prevent scripted access. We could contact them and ask permission though.)
The Wikipedia article on UK web blocking has a list of categories curated from those used by home ISPs. It's less comprehensive than the OpenDNS list and is more useful for categorising underblocked rather than overblocked sites IMO.
I have had a go at improving the site classification options. There are now three drop-down form elements:
You can see this in action on the http://stage.blocked.org.uk/ front page and the code is available in pull request #60. Please let me know what you think.
It does seem a bit focussed on types of site more likely to be blocked (on the third drop down)
@webal I tried to target the types of content ISPs are themselves targeting with their filters, but allowing for the possibility that they've got it wrong, which we're anticipating.
The trouble with trying to be more comprehensive is that we risk ending up with a list of all human activity - a long list! If we use broader categories to avoid exhaustive detail we risk the list becoming so vague as to be meaningless. "Entertainment" is already a meaninglessly large category, for example, but I left it there to cover media sites (film, music, books, etc) that might be blocked for copyright infringement.
It's good that the field is optional IMO - if none of the categories apply people can just leave it blank.
I'm very happy to entertain suggestions for changes to the content list in particular. Also the others if anyone thinks we could improve them.
We have a short list of fairly generic categories of site at the moment:
What other categories should we list? These need to be added.