openrightsgroup / cmp-issues

Centralised issue-tracking for the Blocked backend

Copyright blocks may not be registering #24

Closed JimKillock closed 10 years ago

JimKillock commented 10 years ago

When I try URLs such as http://thepiratebay.se the probes do not always seem to be picking up blocks:

BT | ok | 2014-06-06 10:33:15 | No record of prior block
JISC Collections And Janet Limited | ok | 2014-05-29 15:18:13 | No record of prior block
O2 | ok | 2014-05-28 18:37:49 | No record of prior block
Pro-Net Internet Services Ltd | ok | 2014-06-05 20:52:03 | No record of prior block
Sky | blocked | 2014-06-06 10:33:15 | 2014-06-06 10:33:15
T-Mobile | ok | 2014-05-28 19:09:47 | No record of prior block
TalkTalk | ok | 2014-06-06 10:33:16 | No record of prior block
TeliaSonera Norge AS | ok | 2014-06-05 13:19:31 | No record of prior block
Three | blocked | 2014-05-28 18:37:52 | 2014-05-28 18:37:52
VirginMedia | blocked | 2014-06-06 10:33:16 | 2014-06-06 10:33:16
VirginMobile | ok | 2014-05-17 17:39:54 | No record of prior block
Vodafone | ok | 2014-05-28 19:40:46 | No record of prior block

JimKillock commented 10 years ago

http://fenopy.se gives similar results:

BT | ok | 2014-06-06 10:49:27 | No record of prior block
Sky | blocked | 2014-06-06 10:49:27 | 2014-06-06 10:49:27
TalkTalk | ok | 2014-06-06 10:49:27 | No record of prior block
VirginMedia | blocked | 2014-06-06 10:49:27 | 2014-06-06 10:49:27
VirginMobile | ok | 2014-05-17 20:01:24 | No record of prior block

Sorry if this is a known issue

graphiclunarkid commented 10 years ago

This might be due to the way the pyprobe and android probe detect blocks. The configuration files look for specific responses that indicate the site has been filtered. If ISPs respond differently for copyright censorship than they do for "parental controls" style filtering we'll potentially record that as ok.
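To illustrate the failure mode, here is a minimal sketch of signature-based detection. It is not the actual pyprobe rule format: the `Response` type, the `RULES` table, the ISP name and the patterns are all hypothetical. A response that matches no configured signature falls through to "ok", which is how a court-ordered block page could be missed.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Response:
    """Hypothetical container for what a probe saw when fetching a URL."""
    status: int
    headers: dict = field(default_factory=dict)  # lower-cased header names
    body: str = ""


# Hypothetical per-ISP signatures; the real config files may look quite different.
RULES = {
    "ExampleISP": [
        ("header:location", re.compile(r"^https?://blockpage\.example\.net/")),
        ("body", re.compile(r"parental controls", re.IGNORECASE)),
    ],
}


def classify(isp: str, resp: Response) -> str:
    """Return "blocked" if any signature for this ISP matches, else "ok"."""
    for target, pattern in RULES.get(isp, []):
        if target == "body":
            value = resp.body
        else:
            # "header:<name>" selects a response header by name
            value = resp.headers.get(target.split(":", 1)[1], "")
        if pattern.search(value):
            return "blocked"
    # Anything unrecognised, including an unfamiliar block page, is reported as ok.
    return "ok"
```

With these rules, a bare "Forbidden" page served for a copyright block would be classified as "ok" because no signature mentions it.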

If an ooniprobe shows these sites as blocked that would tend to confirm my theory since it grabs two copies, one direct and one through Tor, and compares the results.

In fact we could probably distinguish between (c) blocks and parental controls by looking for URLs that ooniprobe says are blocked but pyprobe says are ok...
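A rough sketch of that comparison, assuming each probe type's results are available as a plain mapping from URL to status string. The function name and data shapes are hypothetical, not part of either probe.

```python
def likely_copyright_blocks(ooni_results: dict, pyprobe_results: dict) -> list:
    """URLs that ooniprobe reports as blocked while pyprobe reports ok.

    Both arguments map URL -> "ok" or "blocked" (an assumed shape).
    URLs that pyprobe has not checked at all are ignored.
    """
    return sorted(
        url
        for url, status in ooni_results.items()
        if status == "blocked" and pyprobe_results.get(url) == "ok"
    )
```

The output would only be a list of candidates: ooniprobe can also flag differences that are not deliberate blocks, so each URL would still need checking.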

@dantheta: is it possible to update the config files so that probes treat as blocked the ISP responses to URLs like the above?

dantheta commented 10 years ago

The rules in the config files are very much tailored towards the parental control blocking rather than court-ordered blocks.

For some ISPs it is possible to configure the probes to detect court-ordered blocks. Virgin and Sky show a minimal HTML forbidden message when the user tries to access a blocked site. For some other ISPs the sites blocked by court order are simply null-routed, resulting in a connection timeout. For these, the only way to differentiate a routing error from deliberate blocking is to compare the result with AAISP (or perhaps Tor).

Ooniprobe will flag these as a difference, but it is subject to a higher rate of false positives for the same reason.
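The control comparison could be sketched roughly as follows. The outcome labels and the function name are made up for illustration; the control is assumed to be an unfiltered line such as AAISP.

```python
def classify_with_control(isp_outcome: str, control_outcome: str) -> str:
    """Combine an ISP probe outcome with the outcome on an unfiltered control line.

    Each outcome is one of "ok", "forbidden" or "timeout" (hypothetical labels).
    """
    if isp_outcome == "forbidden":
        # An explicit forbidden page from the ISP is treated as a block
        return "blocked"
    if isp_outcome == "timeout":
        if control_outcome == "ok":
            # Reachable from the control but not from the ISP: possibly null-routed
            return "suspected block"
        # The control fails too: more likely a site or routing problem
        return "unreachable"
    return "ok"
```

A timeout that is only seen on the ISP line remains a suspicion rather than proof, which is the source of the false positives.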

NetworksAreMadeOfString commented 10 years ago

Interestingly, neither piratebay nor fenopy is blocked by the EE default PAYG-level filter.

The results for JISC Collections And Janet Limited, O2 and Pro-Net Internet Services Ltd are most likely from me and are unlikely to show blocks: they are JA.net, MyNow.co.uk and O2 business (no filters).

graphiclunarkid commented 10 years ago

Do we need a FAQ page that explains why our tool might say a site isn't blocked even though someone might not be able to access it (or vice versa)?

dantheta commented 10 years ago

I think we can adapt the rules to spot copyright blocks (where those sites aren't simply null-routed) for most networks. A FAQ page may be handy.

JimKillock commented 10 years ago

Null routing is beyond poor practice for legal blocks. If we spot this practice, it'd be really helpful for us to identify it and complain to the relevant ISPs about it.

dantheta commented 10 years ago

On BT, making an HTTPS request for a site which is blocked by their parental controls results in a timeout. The target IP that they direct traffic to drops incoming HTTPS packets.

So on BT, getting a blocked HTTPS URL is indistinguishable from a timeout (unless you are also checking the target IP returned by their DNS hijack, and then you can infer that it was blocked).
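A minimal sketch of that inference, assuming the probe records the IP address its DNS lookup returned and that we hold a list of the addresses the ISP redirects blocked sites to. The address below is a placeholder from the documentation range, not a real BT target, and the function name is hypothetical.

```python
import ipaddress

# Placeholder block-page targets (192.0.2.0/24 is reserved for documentation)
BLOCK_TARGETS = {ipaddress.ip_address("192.0.2.10")}


def infer_https_status(resolved_ip: str, timed_out: bool) -> str:
    """Infer a status for an HTTPS request from the resolved IP and the outcome."""
    ip = ipaddress.ip_address(resolved_ip)
    if timed_out and ip in BLOCK_TARGETS:
        # DNS pointed at a known block target that drops HTTPS traffic
        return "blocked"
    if timed_out:
        # Ordinary timeout: nothing ties it to the filter
        return "timeout"
    return "ok"
```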

dantheta commented 10 years ago

I'm testing out rules for BT, TalkTalk and Plusnet to catch court-ordered blocking. VirginMedia's court block page was already in config, and Sky's parental controls catch thepiratebay before the court-ordered filtering kicks in. They'd categorized TPB and fenopy as "ANONYMIZERS" and blocked them.

JimKillock commented 10 years ago

Thanks Dantheta. When they are in the database, will they be flagged (to the user) differently?

dantheta commented 10 years ago

Not at the moment, though it would be a good thing to record. I've got a feeling it would require a non-backwards-compatible change to the rules format.

I was also going to ask if we would want to be able to display the type of block to users as well (when we're recording it)?

dantheta commented 10 years ago

BT, TalkTalk and Plusnet now have copyright block rules in Beta. I'll need to co-ordinate a rollout with Android-Probe, since it uses a new rule type that the android probe may not have support for yet. The pyprobes on the A&A VMs are using this config for the moment. VirginMedia had a copyright rule from the start. I've added a rule for Sky based on the ooni log data, but the parental blocking trumps the court-ordered blocking, so we may not see many of those.

graphiclunarkid commented 10 years ago

This would be a good reason for volunteers to run probes even if they are on unfiltered lines: we can use their results to detect other kinds of (mandatory) censorship e.g. copyright orders.

dantheta commented 10 years ago

I've now added a copyright rule for O2. Vodafone doesn't block TPB. Three does, but it emits a simple 403 error. I don't think we'll be able to process that into an OK/Blocked response. EE was already mentioned above. That just leaves VirginMobile, out of the regular set.

dantheta commented 10 years ago

It looks like it's safe to roll this version of the config out. It won't impact the android probe.

dantheta commented 10 years ago

I've tested and deployed the copyright block rules. Are you happy for this ticket to be closed, or would you like to hold on until we can get confirmation and config for VirginMobile?

graphiclunarkid commented 10 years ago

Close it if you're happy. We can always open a new one if we find further problems.

dantheta commented 10 years ago

That's cool.

JimKillock commented 10 years ago

Still seems to be an issue:

https://new.blocked.org.uk/results?url=http://www.piratebay.se

This is still showing “ok” results for BT, but BT detection seems to be functioning elsewhere:

https://new.blocked.org.uk/results?url=http%3A%2F%2Ffenopy.se

graphiclunarkid commented 10 years ago

Did you resubmit the site for checking or just visit the results page?

When I checked the result via the link you posted it said the last check for http://www.piratebay.se/ was on 2014-06-03 01:17:15.

I just resubmitted the URL via the front page and it's now showing as blocked...

dantheta commented 10 years ago

I'm pretty sure it was showing an old result from before this fix was rolled out. At the time that the most recent set of status requests came in (17:37), the most recent result was from 2014-06-03.

At the moment, the recheck period on the list of URLs we have is around 50 days. We'll be able to improve this considerably once the mobile probes are online. Unless we keep a separate recheck date for each ISP (or group of ISPs), we need the pyprobes to have roughly the same capacity, otherwise the per-ISP recheck queues get full. While the mobile probes were running off PAYG, the recheck capacity was very limited indeed!
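As a back-of-the-envelope sketch of why the slowest probe sets the pace when all ISPs share one recheck date: the function name is hypothetical and the figures in the usage note are invented, not our real list size or probe capacity.

```python
def recheck_period_days(url_count: int, checks_per_day_by_isp: dict) -> float:
    """Days needed to recheck every URL when all ISPs share one recheck date.

    With a shared date, every URL is queued for every ISP at the same time,
    so the ISP with the lowest daily capacity determines the overall period.
    """
    slowest = min(checks_per_day_by_isp.values())
    return url_count / slowest
```

For example, with an invented list of 100,000 URLs and a slowest probe managing 2,000 checks a day, `recheck_period_days(100_000, {"A": 2_000, "B": 5_000})` gives 50 days, however fast the other probes are.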