When TLS helps or does not help

teirdes commented 4 years ago

During last call, a comment proposed distinguishing between cases where the techniques employed for censorship are thwarted by the use of TLS and cases where they may work in spite of TLS. This requires more work.

teirdes commented 4 years ago

I have been thinking about how to resolve this comment, and I think the mentions of whether TLS helps or does not help might need to go into the Trade-off subparagraphs. But Chelsea Komlo also proposed renaming these paragraphs to something like "Cost to implementor". Incorporating this change into the draft would allow the addition of more text in section 1 that could justify adding specific information about TLS effects on circumvention.

Proposal:

Replace all occurrences of "Tradeoffs:" with "Cost to implementor:"
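
For illustration only, the rename could be scripted roughly as below. This is a minimal sketch, not a tool from this repository: the function name `rename_tradeoff_labels` is hypothetical, and the assumption that the draft spells the label as "Tradeoffs:", "Trade-offs:" or "Trade-off:" should be checked against the actual source before running anything like it.

```python
import re

# Matches "Tradeoffs:", "Trade-offs:" and "Trade-off:" in any capitalisation.
# ASSUMPTION: these are the only spelling variants used in the draft.
TRADEOFF_LABEL = re.compile(r"\btrade-?offs?:", re.IGNORECASE)


def rename_tradeoff_labels(text: str, new_label: str = "Cost to implementor:") -> str:
    """Return text with every trade-off label replaced by new_label."""
    return TRADEOFF_LABEL.sub(new_label, text)


if __name__ == "__main__":
    sample = "Tradeoffs: cheap to deploy.\nTrade-offs: easy to evade."
    print(rename_tradeoff_labels(sample))
```

Matching on the trailing colon keeps the substitution limited to the subparagraph labels, so ordinary prose that mentions tradeoffs is left alone.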

Proposal:

In section 1 introduction, second paragraph, replace: "There is also a growing field of academic study of censorship circumvention (see the review article of [Tschantz-2016]), results from which we seek to make relevant here for protocol designers and implementers." with "There is also a growing field of academic study of censorship circumvention (see the review article of [Tschantz-2016]), results from which are listed here when circumvention has implications for internet protocols. Censorship circumvention also impacts the cost of implementation of a censorship measure, and we include mentions of tradeoffs in relation to such costs in conjunction with each technical method identified. Additionally, with more and more application layer features leveraging encryption as a means of bypassing protocol-dependent security mechanisms in middleware [1], we have tried to incorporate observations on when encryption techniques like TLS (HTTPS) are effective not only as vehicles for technological improvements but also as counter-measures against the censorship described in this document."

[1] This could reference many things, like Stenberg at FOSDEM 2020, the MPTCP drafts (which are like 1/3 content on ensuring middleware doesn't block progress), the original QUIC concept note from 2015, etc. Possibly all of them?

Proposal:

In section 3.2.2, replace Trade-offs section with the following:

"As with HTTP Request Header Identification, the techniques used to identify HTTP traffic are well-known, cheap, and relatively easy to implement. ~~However, they are made useless by HTTPS because HTTPS encrypts the response and its headers.~~

The response fields are ~~also~~ less helpful for identifying content than request fields, as "Server" could easily be identified using HTTP Request Header Identification, and "Via" is rarely relevant. HTTP Response censorship mechanisms normally let the first n packets through while the mirrored traffic is being processed; this may allow some content through and the user may be able to detect that the censor is actively interfering with undesirable content. Similar to HTTP Request Header Identification, ancillary techniques need to be deployed in the presence of HTTPS, since HTTPS encrypts the response and its headers."
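
To make the HTTPS point concrete, here is a rough sketch of what an on-path observer can and cannot read from the first bytes of a response. It is an illustration under stated assumptions, not a description of any real censorship system: the function name `visible_response_headers` and the sample payloads are hypothetical, and the TLS bytes are a made-up application-data record used only to show that no header text is exposed.

```python
from typing import Optional


def visible_response_headers(payload: bytes) -> Optional[dict]:
    """Return the HTTP response headers readable in payload, or None.

    Plaintext HTTP responses start with b"HTTP/", so fields such as
    "Server" and "Via" can be parsed directly. Anything else (for example
    a TLS record, whose first byte is a content type such as 0x17) exposes
    no header text to this kind of inspection.
    """
    if not payload.startswith(b"HTTP/"):
        return None
    head, _, _ = payload.partition(b"\r\n\r\n")
    headers = {}
    # Skip the status line, then split each "Name: value" header line.
    for line in head.split(b"\r\n")[1:]:
        name, sep, value = line.partition(b":")
        if sep:
            headers[name.decode("latin-1").strip().lower()] = (
                value.decode("latin-1").strip()
            )
    return headers


if __name__ == "__main__":
    plaintext = b"HTTP/1.1 200 OK\r\nServer: nginx\r\nVia: 1.1 proxy\r\n\r\nbody"
    # Hypothetical TLS application-data record: type 0x17, version 0x0303.
    encrypted = b"\x17\x03\x03\x00\x05hello"
    print(visible_response_headers(plaintext))
    print(visible_response_headers(encrypted))
```

The sketch only covers the header-matching step; the ancillary techniques mentioned in the proposed text are out of scope here.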

Proposal:

In section 3.2.3, make the following substitution in the second paragraph. Replace "For example, the censor can gain insight about the content of encrypted traffic by coercing web sites to identify restricted content." with "TLS or HTTPS is not necessarily a robust protection mechanism against this form of censorship. For example, a censor can gain insight about the content of encrypted traffic by coercing web sites to identify restricted content."

Proposal:

In section 3.2.4, move the following sentence from the "Cost to implementor" section to the "Empirical examples" section: "The Great Firewall of China (GFW), the largest censorship system in the world, uses DPI to identify restricted content over HTTP and DNS and inject TCP RSTs and bad DNS responses, respectively, into connections [Crandall-2010] [Clayton-2006] [Anonymous-2014]."

In section 3.2.4, add the following sentence at the end of the "Cost to implementor" section: "TLS or HTTPS has not always ensured a robust protection against DPI-based censorship. A sufficiently resourced adversary may, for instance, use a number of techniques to circumvent TLS[2]. Transport Layer Security (TLS) Protocol Version 1.3 [RFC8446] introduces more robust security, but also sparked extensive discussions on the merits of content visibility to third parties such as censors [draft-rhrd-tls-tls13-visibility-01]."

[2] https://blog.cryptographyengineering.com/2013/12/03/how-does-nsa-break-ssl/

Basically the last two paragraphs of 3.2.4 would then read:

"Despite these problems, DPI is the most powerful identification method and is widely used in practice. TLS or HTTPS has not always ensured a robust protection against DPI-based censorship. A sufficiently resourced adversary may, for instance, use a number of techniques to circumvent TLS[2]. Transport Layer Security (TLS) Protocol Version 1.3 [RFC8446] introduces more robust security, but also sparked extensive discussions on the merits of content visibility to third parties such as censors [draft-rhrd-tls-tls13-visibility-01].

Empirical Examples: Several studies have found evidence of censors using DPI for censoring content and tools. The Great Firewall of China (GFW), the largest censorship system in the world, uses DPI to identify restricted content over HTTP and DNS and inject TCP RSTs and bad DNS responses, respectively, into connections [Crandall-2010] [Clayton-2006] [Anonymous-2014]."

josephlhall commented 4 years ago

This is big enough that I'm going to wait for -05 version rather than rush through it now.

JoGSal commented 3 years ago

Hi @develra could you take a look at the proposed text above for clarity and accuracy? Hopefully the edit can be as is!

mallory commented 2 years ago

Won't fix: Left this for another document but said so in the introduction.

IRTF-PEARG / rfc-censorship-tech