openrightsgroup / cmp-issues

Centralised issue-tracking for the Blocked backend

Interesting results for IWF/CTIRU canary domains #261

Closed alexhaydock closed 3 years ago

alexhaydock commented 4 years ago

Both the IWF and CTIRU seem to maintain canary domains on their filter lists for use with this project, which I found via the UK Safer Internet Centre guidelines on appropriate filtering and monitoring.

When I run them through Blocked, though, neither domain is reported as filtered, apart from the CTIRU one, and that only on BT-Light.

https://www.blocked.org.uk/site/http://iwf.testfiltering.com
https://www.blocked.org.uk/site/http://ctiru.testfiltering.com

What's going on here? Surely we could expect all the ISPs to be using the IWF/CTIRU lists? Or did you specifically ignore canary domains somewhere?

gwire commented 4 years ago

I think the most likely reason is that the "block" identifying code doesn't include patterns for IWF and CTIRU blocks, since neither has been a focus of the project. If a request responds with a 200 status, it looks OK unless it matches the properties of a known block page.
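
As a rough illustration of that logic (not the project's actual code), the check could look something like the sketch below. The pattern names, the regexes, and the `classify_response` function are all hypothetical, and treating every non-200 status as a generic "error" is an assumption rather than something stated in this thread.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical block-page signatures; the real rule set is not shown in this thread.
BLOCK_PATTERNS = {
    "example-isp-parental": re.compile(r"blocked by parental controls", re.I),
    "example-isp-malware": re.compile(r"blocked because it may contain malware", re.I),
}


@dataclass
class Verdict:
    status: str                        # "ok", "blocked", or "error"
    matched_rule: Optional[str] = None


def classify_response(http_status: int, body: str) -> Verdict:
    """Classify a probe response from its status code and page body."""
    if http_status == 200:
        # A 200 is only reported as a block if the body matches a known block page.
        for name, pattern in BLOCK_PATTERNS.items():
            if pattern.search(body):
                return Verdict("blocked", name)
        return Verdict("ok")
    # Assumption: anything else is recorded as a generic error, not as a block.
    return Verdict("error")
```

With rules like these, a filter page that matches no known pattern is reported as "ok", which would be consistent with the canary results above.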

alexhaydock commented 4 years ago

"I think the most likely reason is that the "block" identifying code doesn't include patterns for IWF and CTIRU blocks, since neither has been a focus of the project. If a request responds with a 200 status, it looks OK unless it matches the properties of a known block page."

This makes sense, though my understanding was that if the domain were blocked the way I thought the IWF/CTIRU filters worked, the request would never get as far as receiving an HTTP 200. Even if the Blocked tool can't recognise a block when no block page is returned, I'd still expect a timeout or similar rather than a success?
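
To make the distinction I have in mind concrete, here is a minimal sketch of how a probe might separate transport-level failures from ordinary HTTP responses. The `probe_outcome` function and its outcome labels are hypothetical and say nothing about how the Blocked probes are actually written; `fetch` stands in for whatever performs the request and returns a status code.

```python
import socket
import urllib.error
from typing import Callable


def probe_outcome(fetch: Callable[[], int]) -> str:
    """Run `fetch` (which returns an HTTP status code) and name the outcome."""
    try:
        status = fetch()
    except socket.timeout:
        return "timeout"
    except urllib.error.HTTPError as exc:
        # urllib raises HTTPError for 4xx/5xx responses; keep the status code.
        return f"http-{exc.code}"
    except urllib.error.URLError as exc:
        # URLError wraps lower-level failures such as DNS errors or refusals.
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "connection-failed"
    except ConnectionResetError:
        return "connection-reset"
    return f"http-{status}"
```

A filter that silently drops traffic would show up here as "timeout" or "connection-reset", whereas one that serves a page of its own would come back as an HTTP status.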

dantheta commented 4 years ago

Our test lines don't seem to have the CTIRU filter active. The IWF test page returns a 403 where IWF filtering is active. We'd previously decided against adding a detection pattern for IWF-filtered pages, and given the subject material I think that's definitely right. Should we detect and record CTIRU responses?

alexhaydock commented 4 years ago

I mostly opened this issue to find out what was going on in the backend, so thanks @dantheta, that makes sense!

Totally understand the position Re: IWF, though I personally think it'd be interesting to detect and record the CTIRU responses, even if only done internally for now. I guess that's @JimKillock's call though.

JimKillock commented 4 years ago

The issue with CTIRU is that blocks resulting from its list are not distinguishable from other filter blocks. As you can see from the CTIRU canary links, you just get the standard filter block pages.

AIUI the CTIRU blocks are not applied to non-filtered networks.

alexhaydock commented 4 years ago

The source is slightly dated now, but yes, I was forgetting that the CTIRU filter list isn't applied as widely as the IWF one:

"The filtering list is provided to companies who supply filtering products across the public estate, including schools and libraries. This means that the URLs on the list can still be accessed on private computers or devices outside the public estate." (from FOI)

I guess we don't have any network lines which will have those kinds of filters active.

JimKillock commented 4 years ago

Even where we do have access to a line with CTIRU lists added, you cannot tell what is blocked due to the CTIRU list. There is no transparency anywhere AIUI.

For instance, our results suggest that the main ISP filters incorporate the CTIRU list.

The canary presumably just serves to show a user that the CTIRU list has been incorporated, or not.

No result, even on the government estate, will say “This is blocked due to CTIRU”. It might say “Classified as extremism / hate”, but that would be indistinguishable from other list sources, such as a Symantec classification that the school or hospital is using.
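
A small sketch of what the canary can and cannot tell us, under those assumptions. The network names and statuses are made up for illustration, and `describe_block` is a hypothetical helper, not part of the Blocked backend.

```python
from typing import Dict


def describe_block(network: str, url_status: str, canary_status: Dict[str, str]) -> str:
    """Say what can be inferred about a result, given that network's canary result."""
    if url_status != "blocked":
        return "not blocked"
    if canary_status.get(network) == "blocked":
        # The canary shows the list is in use, not that it caused this block.
        return "blocked; list incorporated on this network, source of this block unknown"
    return "blocked; canary not blocked here, source of this block unknown"


# Made-up canary results for two hypothetical networks.
canary = {"network-a": "blocked", "network-b": "ok"}
print(describe_block("network-a", "blocked", canary))
```

In both blocked cases the source stays unknown: the canary result only says whether the list is present on a network.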