net4people / bbs

Forum for discussing Internet censorship circumvention

An Eye for an Eye: Countering GFW Active Probing with a Whitelist Firewall #246

Open UjuiUjuMandan opened 1 year ago

UjuiUjuMandan commented 1 year ago

The article is translated from https://wallesspku.com/misc/2020/04/13/eye4eye.html (archived) (in Chinese)


Abstract

The GFW identifies servers running Shadowsocks through active probing. We verified this again through experiments and provide a simple, effective countermeasure: a whitelist firewall on the server side. Our experiments show that this strategy effectively prolongs the survival time of the server without affecting the user experience.

Introduction

The working principle of the GFW is very complicated, but it can broadly be divided into two steps: passive sniffing and active probing. The former captures packets crossing the wall and analyzes their traffic characteristics in order to identify circumvention traffic. The latter actively sends probe packets to suspected circumvention servers to capture their features and thereby identify them. Currently popular circumvention software (such as Shadowsocks and V2Ray) mainly defends against passive sniffing and cannot defend against active probing. Some circumvention software (such as ShadowsocksR) includes a module against replay attacks (for example, using a nonce), but in practice servers running ShadowsocksR are blocked as well.

We guess that the GFW's working strategy is to use passive sniffing to screen servers and then use the result of active probing as the basis for blocking. If so, a server can survive as long as it resists the wall's active probing. Rather than designing a defense against each active probing method, we filter on the source IP address: drop all inbound packets except those from users, that is, a whitelist firewall. Compared with servers that do not use it, the whitelist strategy allowed our servers to escape GFW blocking. The strategy is easy to deploy, applies to any server and circumvention protocol (in the experiment, we tried SS, SSR and MTProto), and, with careful design, need not increase the burden on users.

Whitelist

The whitelist policy means that a packet-filtering firewall allows the server to receive inbound traffic only from specific source IPs and drops all other traffic. Specifically, we set up two kinds of servers, a master node and sub-nodes: the master node is responsible for contacting all sub-nodes, and the sub-nodes provide the circumvention service to users. Before circumventing the wall, a user registers their IP with the master node, and the master node broadcasts that IP to all sub-nodes, which add it to their whitelists. In this way, we can avoid the GFW's probing.
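The broadcast step could be sketched as follows. This is a minimal illustration, not the authors' actual implementation: the function names `whitelist_rule` and `broadcast_whitelist`, the use of SSH between the master node and the sub-nodes, and the node names are all assumptions. The functions only print the commands they would run, so the output can be inspected before anything is executed.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the SSH transport are assumptions,
# not taken from the article.

# Print the iptables command that whitelists one source IP.
whitelist_rule() {
    printf 'iptables -A INPUT -s %s -j ACCEPT\n' "$1"
}

# Print one ssh command per sub-node that would apply the rule there.
# Usage: broadcast_whitelist USER_IP NODE...
broadcast_whitelist() {
    ip=$1
    shift
    for node in "$@"; do
        printf 'ssh root@%s "%s"\n' "$node" "$(whitelist_rule "$ip")"
    done
}
```

For example, `broadcast_whitelist 1.2.3.4 node-a node-b` prints two `ssh` lines, one per sub-node. Registering the same IP repeatedly would append duplicate rules, so a real deployment would probably want to check for an existing rule first.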

But this operation places an unnecessary burden on the user. The user's network environment may be complex: every broadband redial, switch from WiFi to a cellular network, or handover between cellular base stations may change the user's IP, after which the server no longer recognizes it. To reduce this burden, we bind IP registration to the subscription update function of SSR. Every time a user requests the subscription list from the SSR server, we extract the user's IP and broadcast it to all sub-nodes. This greatly reduces the operational burden on the user.

Simply dropping all other inbound packets is too crude, however, and more flexible firewall policies are possible. We found that all the servers the GFW uses for active probing run Linux, while many users run Windows.

Packets sent by Windows and Linux differ in their initial TTL: 128 for Windows and 64 for Linux. A simple idea is therefore to unconditionally trust packets whose TTL is greater than 64 (the user is unlikely to be more than 64 hops away from the server). This way, Windows users are not affected by the firewall. In practice, we also found that opening port 22 and accepting ICMP packets does not cause the server to be blocked. To avoid losing our own connection to the server when the firewall is enabled, we can leave ICMP and port 22 open.
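The TTL reasoning can be illustrated with a small shell sketch. It relies on the initial TTL values of 64 (Linux) and 128 (Windows) mentioned above; the function names, the optional threshold parameter, and the inclusion of 255 as another commonly seen initial value are assumptions added for illustration. Each router hop decrements the TTL by one, so a packet that arrives with a TTL above 64 cannot have started at 64.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the TTL heuristic; not part of the article.

# Guess the initial TTL from the TTL observed on arrival: the smallest
# common initial value (64, 128, 255) that is >= the observed one.
guess_initial_ttl() {
    if [ "$1" -le 64 ]; then echo 64
    elif [ "$1" -le 128 ]; then echo 128
    else echo 255
    fi
}

# Decide whether a packet would be trusted on TTL alone.
# Usage: ttl_verdict OBSERVED_TTL [THRESHOLD]   (threshold defaults to 64)
ttl_verdict() {
    threshold=${2:-64}
    if [ "$1" -gt "$threshold" ]; then
        echo ACCEPT
    else
        echo CHECK_WHITELIST
    fi
}
```

A Windows client 13 hops away arrives with TTL 115, so `ttl_verdict 115` prints `ACCEPT`. A Linux host 12 hops away arrives with TTL 52, so `ttl_verdict 52` prints `CHECK_WHITELIST` and the packet passes only if its source IP is whitelisted.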

Result

Server Survival Challenge

We deployed firewalls on all the servers of the SSR "airport" (proxy service) we operate and updated their whitelists using the policies above. Over an 18-day experiment, none of the servers was blocked (except in one incident where the firewall was accidentally removed). These servers are on the 163, CN2 GT and CN2 GIA lines. For our own safety, we cannot disclose the exact scale of the experiment. For comparison, we removed the firewalls from 4 servers; these servers were blocked within a few minutes to an hour. This further supports the conclusion that the whitelist firewall is the main factor protecting the servers.

In addition to SSR, we tested MTProto, a protocol long known to be identifiable by its traffic characteristics. The experiments show that the strategy protects MTProto as well. We deployed 4 servers: the two protected servers have survived so far, while the two without a firewall were blocked in the 20th minute and in the first hour, respectively. This again confirms that active probing is a necessary condition for the wall to block a server, even when the protocol has already been identified by its signature.

Analysis of probe packets

Regarding the analysis of probe packets, this article already gives a relatively thorough analysis. Here we confirm some of its conclusions and add a small number of new findings:

  1. The source addresses of the GFW's probing servers are in the China Telecom and China Unicom networks, mainly the China Unicom backbone.
  2. All probe packets have a TTL less than 64.
  3. As long as there is a record of Shadowsocks traffic, the GFW probes for all possible circumvention protocols, including old protocols that are no longer in use.
  4. The GFW does not probe ports with no traffic record, even if Shadowsocks is running on them; that is, it does not scan all ports exhaustively.
  5. The GFW probes the server from many addresses, but each address probes only one port.
  6. The GFW probes a Shadowsocks port about once every 10 minutes. The probability of being probed is roughly proportional to the traffic on the port, but some ports are never probed; we do not know why.
  7. Each probe lasts several minutes, after which the GFW does not initiate the same probe again.
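
Findings 2 and 5 can be checked against one's own packet log with a short script. The sketch below assumes a hypothetical log format of one packet per line, `source_ip destination_port ttl`, which is not from the article; such a log would have to be produced separately, for example from a packet capture. For each source address seen with a TTL below 64, it counts how many distinct ports that address touched.

```shell
#!/bin/sh
# Hypothetical analysis helper; the "src dport ttl" log format is an assumption.

# Read "src dport ttl" lines on stdin. For packets with TTL < 64,
# print "src distinct_port_count", sorted by source address.
ports_per_low_ttl_source() {
    awk '$3 < 64 && !seen[$1 FS $2]++ { ports[$1]++ }
         END { for (s in ports) print s, ports[s] }' | sort
}
```

If finding 5 holds for the logged traffic, every count in the output should be 1; a source with a higher count touched more than one port.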

Configurations

The tool we use is iptables. Assuming the user's IP is 1.2.3.4, the specific configuration is:

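 # Trust packets arriving with a TTL above 80 (e.g. Windows clients, initial TTL 128)
 iptables -A INPUT -m ttl --ttl-gt 80 -j ACCEPT
 # Keep ICMP and SSH (port 22) reachable so the server stays manageable
 iptables -A INPUT -p icmp -j ACCEPT
 iptables -A INPUT -p tcp --dport 22 -j ACCEPT
 # Allow loopback traffic and packets of established or related connections
 iptables -A INPUT -i lo -j ACCEPT
 iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
 # Whitelist the registered user IP
 iptables -A INPUT -s 1.2.3.4 -j ACCEPT
 # Drop everything else
 iptables -P INPUT DROP

UjuiUjuMandan commented 1 year ago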

With this strategy, even a plain SOCKS5 proxy server won't be blocked for at least 3 weeks, and maybe longer, as I'm still testing.

The operator of WallessPKU (google it to find its current website) has deployed this method for over 2 years. I once heard they provide Shadowsocks as well as plain HTTP proxy servers to their users. If that's true, it means the GFW won't block even very characteristic HTTP proxies until it has probed them to confirm.

sh4run commented 1 year ago

Based on my practice, I am afraid this strategy alone doesn't work anymore.