Blocking of fully encrypted protocols (Shadowsocks, VMess) in Russia, targeting HTTPS traffic fingerprints

net4people / bbs

Forum for discussing Internet censorship circumvention


Blocking of fully encrypted protocols (Shadowsocks, VMess) in Russia, targeting HTTPS traffic fingerprints #363

Open wkrp opened 1 month ago

wkrp commented 1 month ago

There's a recent thread on NTC about Shadowsocks and VMess being blocked in Russia, on the basis of encapsulated HTTPS exchanges.

Неработоспособность шифрованных протоколов (ShadowSocks/VMESS) (25.04.2024 +) [Inoperability of encrypted protocols (ShadowSocks/VMESS) (2024-04-25 +)]

There's additional discussion in another thread starting about here.

The blocking appears to be based, not on characteristics of the cover protocol, but on HTTPS traffic patterns inside the tunnel, like in "Fingerprinting Obfuscated Proxy Traffic with Encapsulated TLS Handshakes". One blocking pattern seems to be: the client sends at least 3 packets, each of which is 411 bytes or larger, and the server sends packets more frequently than the client (the server's packet sizes don't matter). Plain HTTP connections inside the tunnel are not blocked, because with them, the client sends fewer than 3 packets. curl with mbedtls does not trigger a block, because its ClientHello packet is smaller than the 411 byte threshold.
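To make the reported pattern concrete, here is a minimal sketch of it as a predicate over a packet trace. It is not the censor's actual classifier. The trace format (direction, payload length), the function name, and reading "more frequently" as "more server packets than client packets over the observed flow" are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

MIN_LARGE_CLIENT_PACKETS = 3   # "at least 3 packets"
MIN_CLIENT_PACKET_SIZE = 411   # "411 bytes or larger"


def matches_reported_pattern(packets: Iterable[Tuple[str, int]]) -> bool:
    """Return True if a trace fits the pattern described in the thread.

    packets: (direction, payload_len) pairs, where direction is "c" for
    client-to-server and "s" for server-to-client (hypothetical format).
    """
    large_client = 0
    client_total = 0
    server_total = 0
    for direction, size in packets:
        if direction == "c":
            client_total += 1
            if size >= MIN_CLIENT_PACKET_SIZE:
                large_client += 1
        elif direction == "s":
            # Server packet sizes are ignored, only their count matters.
            server_total += 1
    # Assumption: "more frequently" means more server packets than client packets.
    return large_client >= MIN_LARGE_CLIENT_PACKETS and server_total > client_total


# Illustrative traces (sizes are made up, not measurements).
tunneled_https = [("c", 517), ("s", 1400), ("s", 1400), ("c", 450),
                  ("s", 200), ("c", 600), ("s", 1400), ("s", 900)]
plain_http = [("c", 500), ("s", 1400), ("s", 1400)]
small_client_hello = [("c", 300), ("s", 1400), ("s", 1400), ("c", 450),
                      ("s", 100), ("c", 600), ("s", 50)]
```

Under these assumptions only the first trace matches. The plain HTTP trace has a single large client packet, and the third trace has a first client packet below the 411-byte threshold, which mirrors the curl with mbedtls observation.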

us254 commented 1 month ago

It seems the censors are using the following pattern to detect and block obfuscated proxy traffic like Shadowsocks and VMess:

1. The client sends at least 3 packets, each of which is 411 bytes or larger.

2. The server sends packets more frequently than the client (the server's packet sizes don't matter).

The blocking appears to be based on the number and size of the client's packets and on how often the server sends packets relative to the client, rather than on packet contents.

Some key points:

The blocking was observed on major Russian mobile providers like Tele2, Megafon, MTS, Beeline and Yota in St. Petersburg. It affected traffic to foreign destinations.
HTTP traffic may not trigger the block because it typically has only one large client packet (the HTTP request), while the pattern requires at least three large client packets.
Adding a prefix to the Shadowsocks traffic to make the first packet look like a TLS ClientHello (over 411 bytes) helped avoid the blocking in some cases.
The blocking seems to only apply to TCP traffic. UDP traffic using QUIC did not appear to be blocked.
There are indications the blocking stopped in some regions like Bashkortostan and Tatarstan after April 30th, while it persisted in others.

The censors appear to be fingerprinting the traffic pattern of obfuscated proxy protocols, specifically looking for multiple large client packets and more frequent packets from the server. This allows blocking without needing to decrypt the traffic. Varying the traffic pattern, such as by adding dummy packets or varying packet sizes, may help circumvent this detection.

fortuna commented 1 month ago

I'm assuming you mean 3 packets after the TCP handshake.

The packet size signature depends on the Shadowsocks implementation. It would be helpful to distinguish them.

Many Shadowsocks implementations will send the IV and the connect request before the application data. Those are smaller than 411 bytes. Does it mean they won't be blocked?

With Outline, we merge the IV, the connect request and the initial data in one packet. Fewer packets, but the first one will be larger.

Do they all get blocked?

Also, because it needs 3 packets, does it mean only TLS 1.2 gets blocked, but not TLS 1.3? I guess it depends on the SS implementation?

Thanks for the reports, but this is still quite confusing, we need some more clarity.

fortuna commented 1 month ago

By the way, the opt-in traffic metrics from Outline Servers show a slow drop in traffic from Russia after April 21, stabilizing after ~April 19.

wkrp commented 1 month ago

@fortuna has done some investigation and observed that the blocking of Shadowsocks-like protocols can depend on the remote server address range and the server port number.

https://ntc.party/t/7776/27

I ran some tests with the Outline SDK, which may work differently than other implementations. It looks like the blocking depends on the location of the server. It also depends on the port number. I was also able to confirm that the initial packet size makes a difference, but only in some ISPs.

Bee Line seems to be the only one considering the packet size, but only for the Vultr server, not DigitalOcean.

MTS blocked Shadowsocks access to a server on DigitalOcean, but not Vultr. They are likely using the IP address.

Blocking on DigitalOcean only happened for the key on port 443. No blocking for the key on the same server, but on port 5555.

MegaFon blocked Shadowsocks access to a server on Vultr, but not on DigitalOcean. They are likely using the IP address as well.

The packet size didn’t make a difference for MegaFon and MTS.

Tele2 is not blocking.

There are details of specific tests in the linked NTC post.

Detection only affecting certain server address ranges is similar to what happened with the blocking of fully encrypted protocols in China, first observed in November 2021:

https://gfw.report/publications/usenixsecurity23/en/#6-2-not-all-subnets-ases-are-affected-equally

6.2 Not All Subnets/ASes are Affected Equally

Of the 5.5 million processed IPs, 98% of them are unaffected by the GFW’s blocking, suggesting that China is fairly conservative in employing this new censorship.

Figure 4 shows the top affected ASes. While this is skewed toward larger ASes (which have more IPs in our scan), it shows both ASes that are heavily affected (e.g., Alibaba US, Constant) and ones that are not (Akamai, Cloudflare). In addition, some ASes have a mix of affected and not affected prefixes (Amazon, Digital Ocean, Linode). All of the affected or partly-affected ASes we see are popular VPS providers that could be used to host proxy servers while large unaffected ASes do not typically sell VPS hosting to individual customers (e.g. CDNs).

irgfw commented 1 month ago

We have observed the same blocking in Iran, except for the "port" part. IRGFW doesn't care about the port number in most cases. But AS whitelisting is happening in Iran. Most protocols on data centers like Azure or AWS won't get blocked, but the same configuration will be blocked on well-known ones like Hetzner, DigitalOcean, and Linode.

However, the blocking is not limited to Shadowsocks and VMess: all VLESS combinations (with or without TLS) are affected, too.

fortuna commented 1 month ago

I've done further investigation. Please find the results on this Gist, which allows you to filter by client ISP, server network, HTTPS, ...

Each file is a different transport I used. $key.tsv means a direct connection to the Outline Server. $key?prefix=....tsv uses the corresponding prefix. split:..|$key.tsv uses TCP stream splitting at the corresponding position (the number is the length of the first segment).
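For readers unfamiliar with TCP stream splitting, the idea can be sketched as below. This is an illustration only, not the Outline SDK's implementation. The function names are hypothetical, and separate writes with TCP_NODELAY make separate segments likely but do not guarantee them.

```python
import socket
from typing import List


def split_first_segment(payload: bytes, first_len: int) -> List[bytes]:
    """Split payload so the first piece has first_len bytes.

    first_len plays the role of the number in "split:N": the length of
    the first segment. Out-of-range values leave the payload unsplit.
    """
    if first_len <= 0 or first_len >= len(payload):
        return [payload]
    return [payload[:first_len], payload[first_len:]]


def send_split(sock: socket.socket, payload: bytes, first_len: int) -> None:
    """Write the pieces separately on an already connected TCP socket."""
    # Disable Nagle's algorithm so the kernel does not coalesce the writes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    for piece in split_first_segment(payload, first_len):
        sock.sendall(piece)
```

With a split position of 5, an initial 500-byte payload would go out as a 5-byte piece followed by a 495-byte piece, so the first packet the censor sees carries very little data.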

There are some remarkable findings:

The blocking behaves differently based on the tunneled application layer traffic. Tunneled HTTPS seems targeted.
TCP stream splitting affected the blocking differently in different cloud providers, and depending on the split position. Combining 5 and 300 splits did not improve evasion.
The POST%20 prefix and a TLS prefix whose handshake message length is greater than the record length (TLS Record Fragmentation) bypass almost all blocking. The FOOBAR%20 prefix helped in some cases and made things worse in others. This suggests that the prefix should look like a known protocol, since merely not looking random is not as effective.
There seem to be multiple blocking mechanisms, given the different errors and how they react to different strategies. It would be helpful if the community could help characterize them all.
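To see why the TLS prefix counts as "message length greater than the record length", it helps to decode it. The sketch below assumes that each percent-decoded code point in the range 0-255 stands for one prefix byte (so %C2%A8 is the single byte 0xA8); that reading, and the helper names, are assumptions for illustration.

```python
import struct
from urllib.parse import unquote


def decode_prefix(encoded: str) -> bytes:
    """Turn a percent-encoded prefix into raw bytes.

    Assumption: the decoded string holds code points 0-255, one per byte.
    """
    return unquote(encoded).encode("latin-1")


def describe_tls_prefix(prefix: bytes) -> dict:
    """Read the TLS record header and the start of the handshake header."""
    content_type, version, record_len = struct.unpack("!BHH", prefix[:5])
    info = {"content_type": content_type, "version": version,
            "record_len": record_len}
    if len(prefix) > 5:
        info["handshake_type"] = prefix[5]
    if len(prefix) > 6:
        # Only the top byte of the 3-byte handshake length is present,
        # so this is a lower bound on the declared message length.
        info["min_handshake_len"] = prefix[6] << 16
    return info


tls_prefix = decode_prefix("%16%03%01%00%C2%A8%01%01")
info = describe_tls_prefix(tls_prefix)
```

Read this way, the prefix is a handshake record (type 0x16, version 0x0301) declaring 168 bytes, followed by a ClientHello header whose length field starts with 0x01, that is, at least 65536 bytes. The handshake message therefore cannot fit in the record, which is what a fragmented ClientHello looks like.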

For now, Outline service providers should use one of the working prefixes. It can also help to provide servers on different cloud providers, and on a high port number in addition to 443.

fortuna commented 1 month ago

Here is a new dataset I generated where I put all in one single table: https://gist.github.com/fortuna/41848697f0be93b2c2e222cd83096fcb

With the new dataset, I was able to generate this binary tree that characterizes the blocking in Russia:

Orange is no blocking ("no error"), blue is blocking ("error").

fortuna commented 1 month ago

It seems that tree training was putting aside some training data.

Here is a tree with the full dataset and in SVG: decision_tree_corrected (1)

The classes are the curl exit codes.

fortuna commented 1 month ago

Alternative view with only OK, TIMEDOUT and ERROR. classifier_tree (1)

wkrp commented 1 month ago

Alternative view with only OK, TIMEDOUT and ERROR.

Ok, so if I interpret this, the root node has the condition isp_Tele2 Russia ≤ 0.5. So if isp_Tele2 Russia = 1 (the ISP is Tele2), then we go right and hit a leaf with class = OK. In other words, there is no blocking on Tele2, which agrees with the table. If isp_Tele2 Russia = 0 (the ISP is not Tele2), then we go left.

From there, the condition is server_port_5555 ≤ 0.5, so if the server port is 5555, we go right; otherwise we go left, and so on.
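The reading rule above (condition true goes left, false goes right, with one-hot features that are 0 or 1) can be written as a small traversal. This is a hand-built sketch, not the trained tree: only the two conditions named above are real, and the leaves below the second split are placeholders because the rest of the tree is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: str
    threshold: float
    left: "Node"    # taken when features[feature] <= threshold
    right: "Node"   # taken otherwise


Node = Union[Leaf, Split]


def classify(node: Node, features: Dict[str, int]) -> str:
    """Walk the tree; features absent from the dict count as 0."""
    while isinstance(node, Split):
        value = features.get(node.feature, 0)
        node = node.left if value <= node.threshold else node.right
    return node.label


# Top of the tree as described in the comment above.
tree = Split(
    "isp_Tele2 Russia", 0.5,
    left=Split(
        "server_port_5555", 0.5,
        left=Leaf("(subtree for other ports, not reproduced)"),
        right=Leaf("(subtree for port 5555, not reproduced)"),
    ),
    right=Leaf("OK"),
)
```

A Tele2 measurement has isp_Tele2 Russia = 1, fails the ≤ 0.5 test, and lands on the OK leaf on the right. Any other ISP goes left and is then split on whether the server port is 5555.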

It looks like every ISP then has a mini decision tree, something along the lines of Ex1–Ex5 in China:

Allow a connection to continue if the first TCP payload (pkt) sent by the client satisfies any of the following exemptions:

Ex1: popcount(pkt)/len(pkt)≤3.4 or popcount(pkt)/len(pkt)≥4.6.

Ex2: The first six (or more) bytes of pkt are [0x20,0x7e].

Ex3: More than 50% of pkt’s bytes are [0x20,0x7e].

Ex4: More than 20 contiguous bytes of pkt are [0x20,0x7e].

Ex5: It matches the protocol fingerprint for TLS or HTTP.

Block if none of the above hold.
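As a rough illustration of how such a rule list evaluates a first packet, here is a sketch of Ex1 through Ex4 exactly as quoted above. Ex5 is left out because the quoted rule does not spell out the TLS and HTTP fingerprints. The function names and the handling of empty payloads are assumptions.

```python
PRINTABLE = range(0x20, 0x7F)  # bytes in [0x20, 0x7e]


def ex1(pkt: bytes) -> bool:
    """Average number of set bits per byte is <= 3.4 or >= 4.6."""
    ratio = sum(bin(b).count("1") for b in pkt) / len(pkt)
    return ratio <= 3.4 or ratio >= 4.6


def ex2(pkt: bytes) -> bool:
    """The first six bytes are all printable ASCII."""
    return len(pkt) >= 6 and all(b in PRINTABLE for b in pkt[:6])


def ex3(pkt: bytes) -> bool:
    """More than half of the bytes are printable ASCII."""
    return sum(b in PRINTABLE for b in pkt) > len(pkt) / 2


def ex4(pkt: bytes) -> bool:
    """There is a run of more than 20 contiguous printable bytes."""
    best = run = 0
    for b in pkt:
        run = run + 1 if b in PRINTABLE else 0
        best = max(best, run)
    return best > 20


def would_be_blocked(pkt: bytes) -> bool:
    """True if none of Ex1-Ex4 exempt the payload (Ex5 not modeled)."""
    if not pkt:
        return False  # assumption: nothing to inspect, nothing to block
    return not (ex1(pkt) or ex2(pkt) or ex3(pkt) or ex4(pkt))
```

For example, a payload alternating 0x0F and 0xF0 has exactly 4 set bits per byte and no printable bytes, so no exemption applies, whereas anything starting with "GET / " is exempted by Ex2.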

For example, by inspection, it looks like the only failure cases for MTS PJSC are when the server is on Digital Ocean and the port is 80 or 443. It's independent of the strategy column. So the tree for MTS PJSC would be:

if (server_net == "Digital Ocean")
    if (port == 80 || port == 443)
        return TIMEOUT;
    else
        return OK;
else
    return OK;

fortuna commented 1 month ago

@wkrp your interpretation of the tree is correct, but the conclusion for MTS is not 100% correct. There are slight variations in some cases.

I was able to create an optimized decision tree that is a lot easier to understand. It clarifies the classification for MTS:

decision_tree_comparison_optimized_multiline_no_prefix

Of note, the EPHEMERAL_PORT, which corresponds to "Hetzner Online | 58987", fails for HTTPS.

You are right that if we use port 80 or 443 (I only had Digital Ocean with those), we can't find a strategy that works for both http and https. But we can find strategies for each of them individually.

fortuna commented 1 month ago

I keep iterating on this visualization. I found this new tree the best to figure out how to fully evade the block:

decision_tree (2)

Now you just need to find a path to the green nodes.

How to bypass the blocking: With that, it's easy to see that using Digital Ocean and a high port (we used 5555) fully bypasses blocking on Tele2, Bee Line and MTS, and that adding the POST%20 or %16%03%01%00%C2%A8%01%01 prefix also bypasses the blocking on MegaFon.
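For anyone wanting to try the prefixes, the sketch below shows how the percent-encoded strings above can be produced from raw bytes and appended to a key in the $key?prefix=... form used in the dataset file names. The access key shown is a placeholder, the helper name is hypothetical, and the byte-to-code-point encoding is an assumption inferred from the %C2%A8 sequence.

```python
from urllib.parse import quote


def with_prefix(access_key: str, prefix: bytes) -> str:
    """Append a percent-encoded prefix parameter to an access key.

    Assumption: each prefix byte is written as the code point of the same
    value, then UTF-8 percent-encoded (so 0xA8 becomes %C2%A8).
    """
    encoded = quote("".join(chr(b) for b in prefix), safe="")
    separator = "&" if "?" in access_key else "?"
    return f"{access_key}{separator}prefix={encoded}"


# Placeholder key on a documentation IP address and the high port from the tests.
key = "ss://EXAMPLE@192.0.2.1:5555"
http_like = with_prefix(key, b"POST ")
tls_like = with_prefix(key, bytes([0x16, 0x03, 0x01, 0x00, 0xA8, 0x01, 0x01]))
```

The two results end in prefix=POST%20 and prefix=%16%03%01%00%C2%A8%01%01, matching the prefixes named above.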