shadowsocks / shadowsocks-android

A shadowsocks client for Android
Other
35.11k stars 11.58k forks source link

Considering alternative coroutine sslocal impls #2452

Closed Mygod closed 4 years ago

Mygod commented 4 years ago

Moving discussion from https://github.com/shadowsocks/shadowsocks-org/issues/154.

We are trying to make ACL handling of shadowsocks-android less hacky (for example https://github.com/shadowsocks/shadowsocks-libev/issues/2627), however, things seem challenging.

  1. While shadowsocks-libev more or less takes the approach of doing async I/O via libev, libev requires you to manually keep track of your execution point to resume your connection processing. This makes extending functionalities error-prone and overall the code is hard to read. In fact, there are still traces of synchronous I/O in shadowsocks-libev (e.g. get_sockaddr), which could block the entire thread if network conditions are bad.
  2. Modern programming language more or less solves this problem with coroutines, examples include golang/rust/c++20/Kotlin. However, go-shadowsocks2 (missed opportunity for go2shadowsocks btw) and shadowsocks-rust do not have support for ACL files yet. As we use really large ACL files, we need to integrate re2 or look for a regex library with similar performance. (golang stdlib does not suffice: https://github.com/golang/go/issues/11646)
  3. Implementing everything in JVM is a bad idea since every byte array is going to be an object in GC heap, and there are no good Kotlin async I/O libraries either. It seems like there are similar issues in golang (objects escaping to heap) but I am not an expert in golang.
  4. Doing PAC is going to be even more expensive than ACL since PAC is actually Javascript.

Thoughts are appreciated.

CC @madeye @riobard @zonyitoo

riobard commented 4 years ago

For point 2, Go's RE engine is pretty robust.

For point 3, Go's GC is really good now and it's relatively easy to write memory-efficient code. But compared to languages without GC and runtime (Rust/C/C++) there's still a memory penalty. However recent discussion on HN makes me more confident in Go compared to Rust.

@Mygod I like the go2shadowsocks name! 😂

Mygod commented 4 years ago

I don't know that article but it seems ancient (it even predates re2). We are talking about 100KB of regex pattern matching domain names so I am not sure if it is as performant as you claimed.

Some benchmarks would be nice...

riobard commented 4 years ago

The point of Cox's article is that Go uses finite automata-based regex engine so the performance should be good enough. Just checked re2 is similar so I guess you should expect similar performance.

Mygod commented 4 years ago

As mentioned in https://github.com/golang/go/issues/11646, Go uses NFA so its matching time would be at least linear in pattern length for our use case. We want sublinear matching time which could be achieved via DFA, but DFA is harder to implement.

riobard commented 4 years ago

Ah good to know there's still room for improvement! 😄

riobard commented 4 years ago

Out of curiosity, what do you do with ACL that makes it so large?

Mygod commented 4 years ago

GFW lists and China lists.

While in theory DFA can take exponential size, almost all (except <5 maybe) of the rules are simply matching domain suffixes so a good DFA implementation should take almost linear size and only exponential in the number of exceptions. Of course, if no such good implementation exists, maybe another way is to handle those domain suffix matching differently using a DFA-esque algorithm (say, using Aho-Corasick with an alphabet over all the possible parts of the domain) and handle the rest using a regexp.

riobard commented 4 years ago

If it’s almost suffix marching, wouldn’t trie-based data structure more efficient (both space and time)?

Mygod commented 4 years ago

Yeah. Trie is another DFA-esque algorithm that I was suggesting previously. :)

riobard commented 4 years ago

Ah sorry I missed that part. Anyway, in this case I don’t think regex is the best solution.

madeye commented 4 years ago

I think we all agree that shadowsocks-rust is the best choice here. it looks not much work to implement ACL in shadowsocks-rust, at least for ss-local only.

@zonyitoo What do you think?

Mygod commented 4 years ago

Actually I think after discussion with @riobard I think the only drawback of using golang is GC. I am really unfamiliar with rust so I really don't know which one to use now.

madeye commented 4 years ago

Last time we faced similar issues in overture due to the memory leak in golang's regex routine: https://github.com/shadowsocks/shadowsocks-android/issues/1639

Not sure if this is still a problem now.

madeye commented 4 years ago

@riobard What about implementing the ACL in ss-go first, and let's compare the performance between implementations.

zonyitoo commented 4 years ago

I think we all agree that shadowsocks-rust is the best choice here. it looks not much work to implement ACL in shadowsocks-rust, at least for ss-local only.

@zonyitoo What do you think?

Sure. But I haven't looked deep into the ACL implementation in libev yet.

madeye commented 4 years ago

@zonyitoo The ACL function in ss-libev is very naive. I think you can build new ACL rules to support more features, like forwarding different domains to different upstreams.

Some example ACL for shadowsocks-libev can be found here: https://github.com/shadowsocks/shadowsocks-libev/blob/master/acl/

madeye commented 4 years ago

Here're the missing features in ss-go and ss-rust for android integration:

  1. protect() call back. This is used to set flags on the raw socket to bypass Android system's NAT rules. It's straightforward in go with official APIs. For rust, we can achieve this with AsRawFd()
  2. Traffic update call back. This is straightforward. Rust implementation should already have this for ss-manager.
  3. Local reverse DNS lookup. This is used to lookup the domain name associate with an IP address for ACL rules. Given we already have a Kotlin based local resolver, this lookup can also be achieved with a callback to Android application.
  4. ACL logic. We can take this opportunity to refine this part. For example, given ss-rust already supports multiple upstreams, we can add new rules to support fine-grained policy routing.

BTW, the above logic can also be shared with an iOS implementation.

riobard commented 4 years ago

I have no idea how the current ACL works and what features it supports, but judging from https://github.com/shadowsocks/shadowsocks-libev/blob/master/acl/ there are two major categories: 1) CIDR-matching, which can be efficiently calculated using a radix tree, and 2) domain-matching, as discussed before, is also mostly suffix matching.

Why did we end up with regex in the first place? 😂

Mygod commented 4 years ago

There is also an ACL implementation in this repo written in Kotlin and C++ (re2), which might be more readable... I would love to write them myself (and learn golang/rust) but unfortunately time is a luxury I don't have, so let me post a write-up explaining things sometime soon.

Mygod commented 4 years ago

ACL

Let's first start with ACL. ACL is used to decide whether a given IPv4/IPv6/hostname should be bypassed. Basically there are two parts of ACL, one is hostname matching and the other is IP matching. The matching behavior (for a standard socks5 request) is described by the following algorithm:

  1. On input IP, set ips = [input] and skip to 4;
  2. If hostname matches some rule in proxy list, the request is proxied; if hostname matches some rule in bypass list, the request is bypassed; if it matches both list, the behavior is undefined.
  3. Hostname matching is inconclusive and we resolve the hostname locally and let ips be the resolving results;
  4. Check the default behavior for unmatched IPs and match IPs against the other list. For example, if the default behavior is to proxy the request, bypass the request if any IP in ips matches the bypass IP list, otherwise proxy the request.

For hostname matching, we can concatenate all regexps using |. Of course, the speedup mentioned above (https://github.com/shadowsocks/shadowsocks-android/issues/2452#issuecomment-589255846) could be employed. For IP matching, we can either use a exhaustive search (shadowsocks-libev maybe, last time I checked), a radix tree, or a binary search (shadowsocks-android).

Why ACL

Here is my assumed reasoning for using ACL. With Chrome, you can configure proxy with extensive rulesets using extensions. Android does not have a good interface for socks5 proxy except for a global VpnService (or a transproxy with root), so a socksifier is employed. As mentioned above, PAC is Javascript and we want to avoid having to compile/interpret hundreds of KBs of Javascript on a phone, therefore a reduced version (ACL) is employed. The socksifying process is illustrated like this:

client applications <-> linux tun device <-> linux kernel <-> linux tun fd <-> tun2socks <-> sslocal

Android will set up iptables routes so that all traffic will go to the tun device by default. To bind shadowsocks traffic to the underlying network (there is a DefaultNetworkListener that decides the underlying network to pass shadowsocks traffic through), pass the socket fd through an ancillary message to the socket at ./protect_path and wait for response. Similarly, traffic update is done through ./stat_path. (this is besides the point of doing ACL but I am mentioning these wrt https://github.com/shadowsocks/shadowsocks-android/issues/2452#issuecomment-589484390 by @madeye)

Complications of doing ACL with socksified VPN

  1. Linux does not support per-interface DNS very well and Android needs to accommodate dynamic network environment. In our context, we need to do local DNS resolving in step 3 through Android's DNS resolving API. Part of that logic is currently in LocalDnsServer but I am thinking that we can provide a local socket (maybe ./local_dns_path) to provide sslocal this interface.
  2. Different from browser environment, all hostnames are resolved by client before connecting to IPs directly, instead of connecting to hostnames. This is because the clients are oblivious to how the tun device actually operate. To make ACL matching work correctly under this case, we need to resolve DNS based on ACL rules, and also remember our DNS requests and responses for future reverse lookups.

More rigorously, we want to have a local DNS relay doing the following: (currently implemented by LocalDnsServer)

  1. On input DNS packet, attempt to parse it as an A/AAAA request. Forward the request to remote on failure.
  2. We assume both ACL matching and the remote DNS request will be slow and we will send the request asynchronously in this step, and cancel it later if it is not needed.
  3. Match the hostname as in ACL step 2. If hostname matches something, resolve the DNS request locally/remotely as the outcome and skip to step 5.
  4. If nothing is matched, resolve hostname locally and attempt to match IPs as in ACL step 4 and decide the outcome.
  5. Look up each IP as in ACL step 4. If the current outcome does not match each IP's matched block, add (IP -> outcome) to the exceptionIpPool.

With the DNS server set up, we can do the ACL matching under VPN/transproxy mode:

  1. On input IP, check that if IP is in exceptionIpPool. If so, use the outcome there.
  2. Otherwise, use the old ACL matching behavior.

Outlook

I think with this change we will be able to do a lot more in the future, e.g. policy routing (mentioned above by @madeye), #2087, #2301, etc.

riobard commented 4 years ago

@Mygod That's… quite some work 😂

Mygod commented 4 years ago

Yeah the complicated part is local DNS resolving. If we are only doing ACL in the new impl then we have not ventured far from keep using libev impl. :)

zonyitoo commented 4 years ago

I have just finished reading acl.c of libev. It is quite a simple and clean implementation. But where is the regex matching thing? I haven't seen anything in this project.

zonyitoo commented 4 years ago

BTW, supporting ACL like libev is quite a simple job. I will try to implement that in the following weekends.

zonyitoo commented 4 years ago

I have just finished reading acl.c of libev. It is quite a simple and clean implementation. But where is the regex matching thing? I haven't seen anything in this project.

Ah .. I see. https://github.com/shadowsocks/shadowsocks-libev/blob/master/acl/gfwlist.acl

They are in the rule.c.

zonyitoo commented 4 years ago

Privoxy uses very similar rules: https://www.privoxy.org/user-manual/actions-file.html#HOST-PATTERN . I am curious about what they did for improving performance.

madeye commented 4 years ago

@zonyitoo A best practice is concatenating rules together like rule1|rule2|rule3, then the DFA engine in RE2/regex will do the optimization for us.

Here are some sample codes in shadowsocks-android:

  1. https://github.com/shadowsocks/shadowsocks-android/blob/master/core/src/main/jni/jni-helper.cpp#L41
  2. https://github.com/shadowsocks/shadowsocks-android/blob/master/core/src/main/jni/jni-helper.cpp#L93

Some search shows the regex engine in rust should be even faster than RE2: https://github.com/rust-lang/regex/blob/master/bench/log/05/re2-vs-rust

zonyitoo commented 4 years ago

protect() call back. This is used to set flags on the raw socket to bypass Android system's NAT rules.

Does Android have a C API for that? I can only find a Java API.

It's straightforward in go with official APIs.

Hmm? How to do that?

madeye commented 4 years ago

Yes, only Java API. It means we need a callback through JNI. An example can be found here: https://github.com/madeye/BaoLianDeng/blob/master/app/src/main/java/io/github/baoliandeng/core/LocalVpnService.java#L215

In golang, we have a callback in Dial interface: https://golang.org/src/net/dial.go#L97

zonyitoo commented 4 years ago

shadowsocks-rust have just finished supporting ACL in TCP relays.

Mygod commented 4 years ago

Thanks guys!

Re protect: As mentioned above, shadowsocks-android uses ./protect_path. JVM implementation: https://github.com/shadowsocks/shadowsocks-android/blob/9f15ab52d06cdaa8d5446c6eb200737f8c2a7050/core/src/main/java/com/github/shadowsocks/bg/VpnService.kt#L65-L92

Re 2-DNS issue: I think all the DNS probes are necessary? (unless I am missing something) In addition to my explanations above, let me recall the part that involves DNS in VPN mode:

  1. In SOCKS5, if we receive a HOSTNAME request, we resolve it once locally and apply ACL accordingly.
  2. In a normal VPN connection, we first receive a DNS request from client, which results in up to two DNS resolutions (once locally, once remotely). After that, the client might issue a request directly to the resolved IP, so sslocal does an IP reverse lookup in exceptionIpPool and route the traffic accordingly. No DNS requests are done in this step.

Re-IP merging: I think it is safe to assume the input does not have duplicated blocks unless you are using a binary search, since there the algorithm will not work correctly if there are duplicates.

Mygod commented 4 years ago

Also none of these are set in stone so feel free to propose changes when appropriate.

madeye commented 4 years ago

I just got shadowsocks-rust successfully integrated into shadowsocks-android.

For the next step, @zonyitoo can help implement those two RPCs in shadowsocks-rust.

  1. https://github.com/shadowsocks/shadowsocks-libev/blob/master/src/android.c#L49
  2. https://github.com/shadowsocks/shadowsocks-libev/blob/master/src/android.c#L98

It's a long term project, so take your time, no rush.

zonyitoo commented 4 years ago

Ok. I can only work on it in weekends.

ohsorry commented 4 years ago

I prefer separate filtering modules. An in-memory index is worth considering, basically a balanced tree.

Mygod commented 4 years ago

Yeah we can definitely put ACL part directly in this repo if it is doable.

ohsorry commented 4 years ago

https://github.com/shadowsocks/shadowsocks-libev/tree/master/acl Can we merge these into two files? One is called "IP.list" and the other is "domain.list". Each item in the list can be prioritized. If a conflict occurs, let the user choose whether to bypass or connect directly. IP matching may use a segment tree/B tree, domain name matching uses a regular binary search tree.

Mygod commented 4 years ago

@madeye https://github.com/shadowsocks/shadowsocks-android/issues/2452#issuecomment-589503820 is what I was talking about in PR. I think this implementation is much easier and does not involve RPC for reverse lookup. Let me know if you want to change it.

Mygod commented 4 years ago

An interesting write-up at Fuchsia: https://fuchsia.googlesource.com/fuchsia/+/refs/heads/master/docs/project/policy/programming_languages.md

madeye commented 4 years ago

@Mygod After several weeks of developing with Rust, I'm quite sure that it's the best language for system programming.

We're also discussing the adoption of Rust internally. However, one concern is that there's no ISO 26262 standard compiler for Rust so far. It means no safe certificate for the production software written by Rust.

Mygod commented 4 years ago

I see. I guess that's why you are making time to contribute to the rust repo. 🤣

ohsorry commented 4 years ago

Consider my proposal, advanced routing options can be placed in the "Advanced" menu, or even make the ACL editable. Most SS users only need an easy-to-use configuration: should I proxy this IP / URL?