[One Idea] IP Geolocation Based Filtering

sh4run commented 1 year ago

Probes and replays from GFW are threats a Shadowsocks instance has to face every day. This feature introduces a new mechanism to filter out those malicious packets. Tests show that GFW uses source IP addresses from all around China (overseas addresses have been spotted as well) for those probes and replays. This is an effective way to undermine any blacklist on the Shadowsocks side. But it also gives a good chance to screen out most of them based on IP geolocation, provided the geolocation of the actual client is known for certain. This feature is not applicable to large business use cases where the Shadowsocks service is provided to unspecified individuals.

A draft implementation can be found here: https://github.com/sh4run/ss. Tests show it can filter out more than 95% of probes and replays.
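
To illustrate the idea only (this is not the code in the draft implementation), a minimal sketch of source-address screening could look like the following. The network prefixes are documentation-only example ranges standing in for wherever the real client is known to be, and `ALLOWED_NETWORKS` and `is_allowed` are hypothetical names.

```python
import ipaddress

# Hypothetical allowlist. These are reserved documentation prefixes used as
# stand-ins for the address ranges of the region where the real client lives.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(source_ip: str) -> bool:
    """Return True if source_ip falls inside any allowed network."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        return False  # malformed address: treat as not allowed
    # An IPv4 address is never "in" an IPv6 network and vice versa,
    # so mixed-version comparisons simply evaluate to False.
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "198.51.100.1", "2001:db8::1"):
        print(ip, "accept" if is_allowed(ip) else "drop")
```

In practice the prefixes would have to come from a geolocation database, and the accuracy of that database decides how well the filter works.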

Comments and thoughts are welcome.

database64128 commented 1 year ago

Or just switch to the new Shadowsocks 2022 protocol (#196). Shadowsocks 2022 provides full protection against replay and probes. My implementation also allows you to configure how invalid requests are handled.

dev4u commented 1 year ago

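Two things:

@database64128's ReplyWithGibberish setting is great. It would be even better if the returned header data could be customized.
As for what the original poster described, I ran an experiment on it. It may not have been rigorous, but I'll share it as a starting point for discussion:
- Conclusion first: it worked right after deployment, but stopped working after a while. What I observed was that my requests were relayed out at random through East, Central, and South China, so the client address the server received was random too. Even after allowlisting the IP addresses of my own province, I did not get the expected result.
- Suggestion: unless you have a real public IP (and even public IPs can be fake), there is no way to know in advance what the connecting client's IP will be.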

database64128 commented 1 year ago

@dev4u What kind of custom header do you have in mind? Should subsequent reply still be random gibberish? Maybe instead of adding an option to allow ReplyWithGibberish to have a custom header, we add a "fallback" reject policy to forward the connection to another target. What do you think?

sh4run commented 1 year ago

Thanks to both of you for your comments. @database64128, I haven't had a chance to study Shadowsocks 2022 yet. I don't think IP geolocation based filtering conflicts with Shadowsocks 2022. This feature could strengthen the security of many similar tools, SS2022 included.

@dev4u, my testbed has been running for 2 weeks and I haven't noticed anything like what you described. Actually, I have doubts about it. Such behavior would definitely break any normal service that relies on IP geolocation, and I cannot see what is in it for those ISPs, other than causing trouble for themselves. Perhaps the IP geolocation data you used is not accurate enough?

Thanks

dev4u commented 1 year ago

> @dev4u What kind of custom header do you have in mind? Should subsequent reply still be random gibberish? Maybe instead of adding an option to allow ReplyWithGibberish to have a custom header, we add a "fallback" reject policy to forward the connection to another target. What do you think?

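@database64128 My stance on resisting censorship is that channeling works better than blocking. By "channeling" I mean both diverting and guiding.

Censors want to know what service a port provides. If probe requests are merely dropped or answered with random data, that does nothing to help them identify the service, and it does not make them give up, so they keep probing... Returning data with a specific header, for example the characteristic header of VoIP, RTSP, LDAP, or SSH, could effectively reduce the censor's curiosity about the port and the amount of probing.

With fallback implemented, specific header data can be returned, which would also help a lot in disguising the service on the port.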

dev4u commented 1 year ago

> @dev4u, my testbed has been running for 2 weeks and I haven't noticed anything like what you described. Actually, I have doubts about it. Such behavior would definitely break any normal service that relies on IP geolocation, and I cannot see what is in it for those ISPs, other than causing trouble for themselves. Perhaps the IP geolocation data you used is not accurate enough?
>
> Thanks

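That may be survivorship bias. Either way, anyone using this scheme will find the experience affected by the accuracy of the IP geolocation data. I have collected the addresses of clients that connected successfully, and some did come from East, Central, and South China, even though I have stayed in only one of those regions the whole time and have never been to the others.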

dev4u commented 1 year ago

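@sh4run I have seen someone build a scheme that is somewhat similar to yours. It may be worth a look for reference:

- Use iptables to allow only allowlisted addresses to connect to the ss port.
- If an ICMP request with a large (MTU-sized) packet arrives, automatically add the client to the allowlist.
- If the ICMP request received is not large, ignore it.
- Before connecting, the ss client first sends a large ICMP request to the ss server, and only then establishes the connection.
- ...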
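
The decision logic of the steps above can be sketched roughly as follows. This models only the allowlist bookkeeping, not iptables or raw ICMP handling; `KnockAllowlist` and the 1400-byte threshold are hypothetical choices made for illustration.

```python
# Assumed threshold for what counts as a "large" ICMP echo request.
# The original scheme only says "MTU-sized"; 1400 is an illustrative value.
LARGE_ICMP_BYTES = 1400


class KnockAllowlist:
    """Tracks which source addresses have sent a large ICMP echo request."""

    def __init__(self):
        self._allowed = set()

    def on_icmp_echo(self, src: str, payload_len: int) -> None:
        # Only a large echo request counts as a knock; small ones are ignored.
        if payload_len >= LARGE_ICMP_BYTES:
            self._allowed.add(src)

    def allows_tcp(self, src: str) -> bool:
        # A TCP connection to the ss port is accepted only after a knock.
        return src in self._allowed


if __name__ == "__main__":
    allowlist = KnockAllowlist()
    client = "198.51.100.9"
    allowlist.on_icmp_echo(client, 64)      # ordinary ping: ignored
    print(allowlist.allows_tcp(client))     # False
    allowlist.on_icmp_echo(client, 1400)    # large ping: knock accepted
    print(allowlist.allows_tcp(client))     # True
```

A real deployment would probably also expire entries after some time, which this sketch leaves out.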

database64128 commented 1 year ago

@dev4u Counterpoint: The censor can also block the port when the actual traffic does not match probing results. For example, the traffic over a Shadowsocks TCP port looks nothing like TLS. But if you configure your server to fallback to a TLS server, probing would determine the service to be a TLS server. The censor may then decide to block the port on the basis of protocol mismatch.

Nonetheless, fallback might still be useful in some scenarios, and it’s trivial to implement.

sh4run commented 1 year ago

@dev4u Thanks for sharing. Everyone is racking their brains and coming up with their own tricks to resist the blocking :-)

> @dev4u Counterpoint: The censor can also block the port when the actual traffic does not match probing results. For example, the traffic over a Shadowsocks TCP port looks nothing like TLS. But if you configure your server to fallback to a TLS server, probing would determine the service to be a TLS server. The censor may then decide to block the port on the basis of protocol mismatch.
>
> Nonetheless, fallback might still be useful in some scenarios, and it’s trivial to implement.

@database64128 @dev4u My gut feeling on this is, when there are too many unknowns, doing nothing is somehow better than doing something.

database64128 commented 1 year ago

> when there are too many unknowns, doing nothing is somehow better than doing something.

The current default is quite passive too. I just want to give users more options, just in case you know. And it’s quite a bit of fun to brainstorm, debate, and come up with these options. 😄

dev4u commented 1 year ago

> For example, the traffic over a Shadowsocks TCP port looks nothing like TLS. But if you configure your server to fallback to a TLS server, probing would determine the service to be a TLS server. The censor may then decide to block the port on the basis of protocol mismatch.

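@database64128 Yes, but there is another important reason: some "evangelists" offer schemes copied from who knows where, and many followers with only a partial understanding follow them blindly. That is fatal too.

Regarding what you said, someone has implemented a scheme that might avoid it, and I find it quite interesting. It is called sh*dow tls. It forwards the handshake packets of the connection request to an HTTPS site for handling. After the HTTPS handshake completes, the server takes back over the real payload that follows...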

sh4run commented 1 year ago

An update: my testbed was blocked by GFW today. There seems to be a large-scale operation under way.

I didn't find any leaked probes in the log, so it appears that how probes and replays are blocked or handled is not the key issue. Once a server receives probes or replays, GFW has already put it on its watch list. The probes and replays are effectively an alert that you are being watched. How to avoid attracting them may be the key. A proprietary protocol may be one option.

database64128 commented 1 year ago

@dev4u Fallback has been implemented in database64128/shadowsocks-go@bfe027008ad42b2c1434ad13a85a7d544cd5b302, and can be enabled by specifying an "unsafeFallbackAddress" in server config.

dev4u commented 1 year ago

> @dev4u Fallback has been implemented in database64128/shadowsocks-go@bfe027008ad42b2c1434ad13a85a7d544cd5b302, and can be enabled by specifying an "unsafeFallbackAddress" in server config.

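Got it! I also noticed that right afterwards you committed the feature for returning custom header data 💪.

On this, I have one question and one suggestion:

- Does your ss server work well with the Rust client (2022 protocol)?
- Suggestion: if unsafeFallbackAddress is set, would it be worth making a test request at startup to verify that the configured address is valid and usable as a fallback (for example, that it still works with a shorter timeout, or that it must be a loopback address... to avoid slow replay)?

The prototype of the feature is out. I hope those who are following along don't accidentally go off track again.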

database64128 commented 1 year ago

Both shadowsocks-go and shadowsocks-rust adhere to the spec, so they are fully compatible with each other.
Sounds totally unnecessary.
These "unsafe" features are intended for very specific situations. They are "unsafe" because they break the basic promises of the Shadowsocks 2022 protocol, in other words, broken by design. They are not "prototypes" and do not reflect the future development of the protocol.

sh4run commented 1 year ago

Hello guys,

As I found earlier, GFW probes are likely an indication that this IP (VPS) is being watched. The key to getting through GFW is to avoid attracting its attention in the first place, rather than to handle probes properly. Based on that, I changed the Shadowsocks stream format to the one below. I am trying to create a stream without any obvious pattern.

client --> Server
--------------------------------------------------------------------------
| Pad-1 | session header(encrypted) | Pad-tail | TLV-1 | TLV-2 ...
--------------------------------------------------------------------------                            

Server --> Client
--------------------------------------------------------------------------
| Pad-2 | Shadowsocks data ...
--------------------------------------------------------------------------

More details can be found here: https://github.com/sh4run/sss#protocol

The test result is pretty good. My testbed has been online for 2 weeks with 40G+ of traffic. Compared with SS, far fewer GFW probes are received. No probes were received in the last 5 days, as long as I didn't play YouTube 4K video.

Just to share a new idea. Comments and thoughts are welcome.

Thanks

dev4u commented 1 year ago

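You added certificates as a security measure in your scheme, which I think is a good approach, but it still needs time to show whether it has weaknesses. If we calm down and think about it, it is not hard to see that the problem you raised has no solution in the current network environment: what is going to be interfered with will still be interfered with. Judging from your test feedback, I think this is survivorship bias. GFW treated your port as an ss service and probed it accordingly, while your code does not respond to ss requests and so sidestepped the probes. On top of that, your data sample is still too small to attract GFW's attention, and GFW is not yet interested in learning your traffic.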

sh4run commented 1 year ago

Hi dev4u,

Thanks a lot for taking a look into this and giving me your feedback.

I guess my description was not clear enough. What SSS does is:

- Divide the original SS data into multiple pieces, each of a random length.
- Mix those SS data pieces into multiple segments of random bytes. In other words, the SS data is split into pieces and mixed into a stream of random bytes.
- Use a separate mixing scheme for every TCP connection.
- Add an encrypted header that tells the peer how to decode the SS data.
- Pad this encrypted header with random bytes at both head and tail, so that it is not easy for an outside observer to locate the actual content.
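
As a rough illustration of the splitting and mixing steps above (not the actual SSS wire format, which is described at the protocol link earlier in this thread), a sketch could look like this. The `scramble` and `unscramble` names, the piece and filler size limits, and the plain `layout` list are all assumptions; in SSS the decoding instructions travel in the encrypted header instead.

```python
import os
import random


def scramble(data: bytes, rng: random.Random, max_piece: int = 16, max_filler: int = 16):
    """Split data into random-length pieces and interleave them with random filler.

    Returns (stream, layout), where layout is a list of (filler_len, piece_len)
    pairs. The layout is returned in the clear here for illustration only.
    """
    stream = bytearray()
    layout = []
    pos = 0
    while pos < len(data):
        piece = data[pos:pos + rng.randint(1, max_piece)]
        filler_len = rng.randint(0, max_filler)
        stream += os.urandom(filler_len)  # random bytes that carry no data
        stream += piece                   # real payload piece
        layout.append((filler_len, len(piece)))
        pos += len(piece)
    return bytes(stream), layout


def unscramble(stream: bytes, layout) -> bytes:
    """Recover the original data by skipping filler as described by layout."""
    out = bytearray()
    pos = 0
    for filler_len, piece_len in layout:
        pos += filler_len
        out += stream[pos:pos + piece_len]
        pos += piece_len
    return bytes(out)


if __name__ == "__main__":
    # random.Random is not cryptographically secure; a real implementation
    # would need a CSPRNG and a fresh layout for every connection.
    rng = random.Random()
    payload = b"example shadowsocks payload"
    mixed, layout = scramble(payload, rng)
    assert unscramble(mixed, layout) == payload
    print(len(payload), "payload bytes carried in", len(mixed), "stream bytes")
```

Because both the piece lengths and the filler lengths are drawn at random per call, two connections carrying the same payload produce streams of different length and content.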

By doing so, SSS tries to hide itself from any pattern matching in GFW:

- Packet length patterns. (Tests with SS reveal that GFW does match on these.)
- Byte patterns or content patterns.

SSS tries to give its frames no obvious pattern, in either length or content, so that it is difficult to detect SSS with the existing pattern matching in GFW.

And the test results show:

- GFW cannot detect SSS at this moment. (Very few GFW probes were received. No probes were received in the first two days.)
- Heavy traffic may attract GFW's attention. (Probes were received when I played YouTube 4K video.)
- GFW didn't think this SSS instance was an SS service. The probes received were 600+ or even 1300+ bytes long. This is different from what SS receives.

Of course, due to the lack of samples, it is hard to tell how effective this protocol is at this moment. But I think it is also too soon to call it survivorship bias.

Thanks again.

dev4u commented 1 year ago

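Let's look at this from another angle.

You have described many advantages, but what I really want to know is: if you were GFW, what means could you use to interfere with your own data? Your protocol actually no longer has anything to do with ss, so why would you expect the existing strategies to still work against your SSS?

I'll also share some of my own views on GFW. This may be a bit off topic; if it doesn't sound right, just treat it as my rambling.

GFW has moved from analyzing data signatures alone to a mechanism that analyzes and classifies traffic through multi-dimensional behavior.

For example, suppose a service generates traffic periodically, with consistent and small lengths. Such traffic is usually heartbeat or health-check packets, and GFW does not need to care at all what data is inside. As another example, take a port where the requesting IPs are mostly native, their geographic distribution is fairly concentrated and fixed, and the size of each transfer falls within a narrow range. That is more likely to be judged an application internal to an enterprise.

These are only examples; no need to take them personally.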
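
To make the first example concrete, here is a toy heuristic that flags a flow as heartbeat-like when its packets are small, evenly spaced, and uniform in size. It says nothing about how GFW actually works; `looks_like_heartbeat` and its thresholds are purely illustrative assumptions.

```python
from statistics import mean, pstdev


def looks_like_heartbeat(timestamps, sizes, max_cv=0.1, max_size=256):
    """Guess whether a flow is periodic, small, uniform-length traffic.

    timestamps: packet arrival times in seconds, in ascending order.
    sizes: payload sizes in bytes, one per timestamp.
    max_cv and max_size are illustrative thresholds, not measured values.
    """
    if len(timestamps) < 3 or len(timestamps) != len(sizes):
        return False  # too little data to judge
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    avg_gap = mean(gaps)
    if avg_gap <= 0:
        return False
    # Coefficient of variation: spread relative to the mean.
    gap_cv = pstdev(gaps) / avg_gap
    avg_size = mean(sizes)
    size_cv = pstdev(sizes) / avg_size if avg_size > 0 else 0.0
    return gap_cv <= max_cv and size_cv <= max_cv and avg_size <= max_size


if __name__ == "__main__":
    steady = looks_like_heartbeat([0, 30, 60, 90, 120], [64] * 5)
    bursty = looks_like_heartbeat([0, 1, 7, 8, 40], [64, 1400, 200, 900, 64])
    print("steady flow:", steady, "bursty flow:", bursty)
```

The point of the sketch is that such a classifier never looks at packet contents, only at timing and size.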

sh4run commented 1 year ago

Thanks for your reply.

It is true that the upstream (client->server) format is changed a lot in SSS. The downstream format is changed too, but much less. Yet from the software perspective, the overall architecture of SS is unchanged. All changes are limited to the processing of upstream/downstream traffic: adding and removing the scramble code. That's why I call it scrambled Shadowsocks.

The purpose of SSS is to hide itself from the GFW detection that works against SS. So if what works to detect SS doesn't work now, SSS has achieved its design goal.

The GFW detection methods that I can think of include:
1) Packet length pattern matching. This should be the easiest method to enforce, as it doesn't require DPI.
2) Byte pattern matching. This requires DPI to scan the entire packet content. However, DPI is expensive, and deploying a new DPI rule is difficult and slow.
3) Traffic measurement. This is more complex. I don't know how much capacity GFW has to deploy such monitoring. I guess it might be limited.
4) Some kind of AI or HI (human intelligence) analysis. This might include the scenarios you mentioned. I know basically nothing about this area.

SSS is able to handle 1) and 2) now. It is possible for SSS to handle 3), but that requires some architecture changes. I will leave 4) out, as I don't have a clear picture of that part.

SSS was never designed to be a once-and-for-all solution. But what I believe is that the more difficult and expensive a technology is to detect, the better its chance of surviving.

Thanks again.

shadowsocks / shadowsocks-org

[One Idea] IP Geolocation Based Filtering #206