SIP001: Header obfuscating

madeye commented 7 years ago

Shadowsocks Improvement Proposal 001

SIP001 - Allow header obfuscating to cheat on QoS.

Recently, the QoS of some ISPs has become unreasonable. A cheap way to solve this problem is header obfuscating, which inserts fake headers before the shadowsocks handshake packets.

For example, before a shadowsocks request, we insert this HTTP POST header:

    POST / HTTP/1.1\r\n
    Host: www.baidu.com:8388\r\n
    User-Agent: curl/7.45.1\r\n
    Accept: */*\r\n
    Content-Type: application/octet-stream\r\n
    Content-Length: 176\r\n
    \r\n

Similarly, we insert this HTTP header before a shadowsocks response.

    HTTP/1.1 200 OK\r\n
    Server: nginx/1.0.2\r\n
    Date: Tue, 13 Dec 2016 13:25:12 GMT\r\n
    Content-Type: application/octet-stream\r\n
    Content-Length: 176\r\n
    Connection: keep-alive\r\n
    Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform\r\n
    Pragma: no-cache\r\n
    \r\n

With this SIP, we may cheat most QoS mechanisms, avoiding QoS-related packet dropping or bandwidth limits.
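As a rough illustration of the idea, the fake request header can be prepended to the first shadowsocks packet like this. This is only a sketch, not the shadowsocks-libev implementation: the function names are made up, and it assumes Content-Length is set to the size of the first payload, which the proposal does not state.

```python
def build_request_header(host: str, port: int, body_len: int) -> bytes:
    """Build a fake HTTP POST header modelled on the example above."""
    lines = [
        "POST / HTTP/1.1",
        f"Host: {host}:{port}",
        "User-Agent: curl/7.45.1",
        "Accept: */*",
        "Content-Type: application/octet-stream",
        f"Content-Length: {body_len}",  # assumption: length of the first payload
        "",  # the two empty strings produce the terminating \r\n\r\n
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def obfuscate_first_packet(payload: bytes,
                           host: str = "www.baidu.com",
                           port: int = 8388) -> bytes:
    """Prepend the fake header to the (already encrypted) handshake payload."""
    return build_request_header(host, port, len(payload)) + payload
```

The payload itself is untouched; only the bytes in front of it change, which is why this stays cheap.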

A demonstration can be found here: https://github.com/shadowsocks/shadowsocks-libev/tree/obfs

Any suggestion is welcome.

Mygod commented 7 years ago

1. This feature is optional and configurable, right?
2. Why does it use \r\n instead of \n?
3. May I suggest using POST and adding Content-Length to the request, since we need to post data to the server?
4. Content-Type: text/html and Content-Encoding: gzip don't match the content the server returns, which would be suspicious. How about application/octet-stream and removing Content-Encoding (which means anything is valid)?

madeye commented 7 years ago

Yes, it would introduce additional features of the traffic. We may refine the implementation to make it closer to real HTTP traffic.
It should be a problem. For now, we should warn the user about the risk and make this feature disabled by default.
Do you mean we should fake the header like a CDN header?

madeye commented 7 years ago

@Mygod

1. Right, optional and configurable.
2. From the RFC, it seems to be \r\n. Correct me if I'm wrong.

   HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all
   protocol elements except the entity-body (see appendix 19.3 for
   tolerant applications). The end-of-line marker within an entity-body
   is defined by its associated media type, as described in section 3.7.

       CRLF           = CR LF

3. Yes, it looks like a good idea.
4. Ditto.
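Because every header line ends with CRLF and the header block ends with an empty line, the receiving side can find the end of the fake header by searching for the first \r\n\r\n. A minimal sketch, with a hypothetical helper name and no claim to match how the obfs branch parses it:

```python
from typing import Optional

HEADER_END = b"\r\n\r\n"  # CRLF ending the last header line + CRLF of the empty line


def strip_fake_header(buf: bytes) -> Optional[bytes]:
    """Return the bytes after the fake header, or None if the header is incomplete."""
    end = buf.find(HEADER_END)
    if end == -1:
        return None  # need to read more data before the header can be removed
    return buf[end + len(HEADER_END):]
```

Searching for the first occurrence is enough here: header lines never contain an empty line, so any \r\n\r\n inside the encrypted payload comes later in the buffer.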

nekolab commented 7 years ago

I suggest letting users define the request and response headers themselves rather than using a fixed template.

Mygod commented 7 years ago

Yes. It should be as flexible as possible but we should supply a good default value/template.

nekolab commented 7 years ago

Fine. Another question is whether this is a connection-level header or a conversation-level header.

A connection-level header appears only once, when the TCP connection is established; after that it is never sent again. A conversation-level header appears throughout the TCP stream: every call to send adds a fake header to the stream.

Neither the POST nor the GET method in HTTP can semantically represent a connection-level header. After one send-recv round, an ordinary HTTP client either closes the TCP connection or reuses it for another HTTP request (with another header), whereas our TCP connection keeps sending and receiving data.

I'm not familiar with the libev version of SS. After a quick look, I believe this implementation uses the connection-level header; correct me if I'm wrong.

A conversation-level header may look more like an ordinary HTTP client using the POST method and multiplexing the connection, but will it decrease performance, add complexity for finding and removing the fake headers, or add even more characteristics to the protocol?
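To make the two options concrete, here is a small sketch of both framings. The function names and the shortened header are illustrative assumptions, not taken from any implementation.

```python
from typing import Iterable, Iterator


def fake_header(body_len: int) -> bytes:
    """Shortened fake POST header, just for illustration."""
    return (
        "POST / HTTP/1.1\r\n"
        "Host: www.baidu.com:8388\r\n"
        f"Content-Length: {body_len}\r\n"
        "\r\n"
    ).encode("ascii")


def connection_level(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Fake header only in front of the first chunk of the TCP stream."""
    first = True
    for chunk in chunks:
        if first:
            yield fake_header(len(chunk)) + chunk
            first = False
        else:
            yield chunk


def conversation_level(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Fake header in front of every chunk passed to send."""
    for chunk in chunks:
        yield fake_header(len(chunk)) + chunk
```

With the conversation-level variant the receiver has to locate and strip a header before every chunk, and each send carries the extra header bytes, which is the overhead the question above is about.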

v3aqb commented 7 years ago

How about using a WebSocket header?

Mygod commented 7 years ago

Hmm. Maybe we can support both HTTP mode and WebSocket mode?

madeye commented 7 years ago

WebSocket looks like a great idea. It helps to avoid the conversation-level headers mentioned by @nekolab.

I'm not a big fan of fully customized headers, which may introduce illegal usage of this feature.

nekolab commented 7 years ago

We may run some tests to confirm whether the WebSocket header can actually cheat QoS. I'm not quite sure, since it's a new protocol and may be ignored by QoS. If it works, I vote yes for it.

ayanamist commented 7 years ago

I don't think a WebSocket header will cheat QoS, since the cheat that has been proven to work appears to be very badly implemented.

SSR with simple_http has been proven to cheat QoS successfully on Hangzhou Telecom. SSR with simple_http uses the GET method with a request body, which is definitely a malformed HTTP request.

Do you plan to move some data, such as the IV, from the request body to the request path as SSR does? This would make the request URL differ from request to request, which I think would make detection harder.

ayanamist commented 7 years ago

@wongsyrone I don't understand what you said. If a request is invalid, it can't pass shadowsocks' existing verification mechanism, so where would a correct response come from? In fact, I think it will decrease the risk of exposing the server side, since the server can behave like a normal HTTP server.

madeye commented 7 years ago

Update the websocket obfuscating via https://github.com/shadowsocks/shadowsocks-libev/commit/61769031b55b2f91803bb1adc6577d943ecf5e8c

Request:

    GET / HTTP/1.1\r\n
    Host: www.baidu.com:8388\r\n
    User-Agent: curl/7.18.1\r\n
    Upgrade: websocket\r\n
    Connection: Upgrade\r\n
    Sec-WebSocket-Key: XVOfcm44bdPb0+xNrmf4tg==\r\n
    \r\n

Response:

    HTTP/1.1 101 Switching Protocols\r\n
    Server: nginx/1.2.2\r\n
    Date: Wed, 14 Dec 2016 13:42:07 GMT\r\n
    Upgrade: websocket\r\n
    Connection: Upgrade\r\n
    Sec-WebSocket-Accept: byeMGrcAr+bKUtt+i2Thaw==\r\n
    \r\n

Basically, it's still HTTP GET obfuscating. However, the WebSocket protocol lets the whole traffic stream look more normal.
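For reference, a real WebSocket server derives Sec-WebSocket-Accept from the client's Sec-WebSocket-Key as defined in RFC 6455: append a fixed GUID, take the SHA-1 digest, and base64-encode it. The sketch below shows that computation; whether the obfuscating code should do this or keep sending a fixed string is not settled in this thread.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

A value computed this way is always 28 base64 characters (a 20-byte SHA-1 digest), while the Sec-WebSocket-Accept shown in the response above is 24 characters, so it does not appear to be derived from the key. A middlebox that validates the handshake could notice the difference.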

Mygod commented 7 years ago

"illegal usage of this feature."

Hmm I thought that was the point of this feature.

simonsmh commented 7 years ago

@wongsyrone That's why it should be disabled by default if necessary. @mygod In another project, shadowsocksr, these IPs/domains can be banned on the server side for illegal usage. That's not the major issue.

madeye commented 7 years ago

Actually, I don't think we need to worry about adding new features.

The soul of shadowsocks is to solve a stupid problem (you know what I mean) with as little effort as possible. If any small change works well, we just add it. If not, we drop it.

As an optional protocol extension, even if this proposal introduces new problems, we can continue to refine it or just drop it.

As the next step, I suggest doing more tests in real environments and seeing what happens.

v2ray commented 7 years ago

Assuming the proposal applies to TCP connections only, this feature is equivalent to a customized HTTP proxy (say, ShadowHTTP).

The only difference is that ShadowHTTP transfers only encrypted content, whereas a normal HTTP proxy allows both plain and encrypted payloads. The HTTP method may be different but configurable (discussed above). Going further, ShadowHTTP may be able to proxy out (or deny) invalid requests in order to avoid detection/probing. This is one step closer to being a normal HTTP proxy.

An HTTP proxy is fine, but it doesn't fit the needs of a SOCKS proxy. If your end goal is to cover UDP or provide other types of obfuscation, I would suggest making the design more fundamental and extensible, to accommodate potential growth in the future.

ayanamist commented 7 years ago

@v2ray No, it is not an HTTP proxy, but a SOCKS proxy obfuscated as an HTTP proxy, which definitely fits the needs of a SOCKS proxy.

madeye commented 7 years ago

@v2ray The proposal here is header obfuscation and the goal is to find a cheap way to cheat on QoS. In other words, it just does some simple obfuscation, no plan to implement full HTTP protocol.

pexcn commented 7 years ago

Good idea.

librehat commented 7 years ago

Will it be possible to enable it separately on the client side and the server side? (I.e., as a server that receives an obfuscated request, am I allowed to respond with a non-obfuscated response?) Or would it be similar to OTA, where an obfuscated request ensures that the response is obfuscated too?

v3aqb commented 7 years ago

with URI like this?

ss://method:password@hostname:port/?obfs=http[&hostname=www.baidu.com]

or

ss://method:password@hostname:port/?obfs=http[&header=BASE64-ENCODED-HEADER-DATA]

madeye commented 7 years ago

@librehat Right, it's totally optional. Both client and server should enable the same obfuscation. On the server side, when obfuscation is enabled, it can still handle the normal protocol without obfuscating. So:

-------------------------------------------
| Client-Obfs |   Server-Obfs  |  Working |
| Yes         |   Yes          |  Yes     |
| Yes         |   No           |  No      |
| No          |   Yes          |  Yes     |
| No          |   No           |  Yes     |
-------------------------------------------
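The table boils down to a single rule: a connection fails only when the client obfuscates and the server does not. A one-line sketch of that rule (the function name is made up for illustration):

```python
def connection_works(client_obfs: bool, server_obfs: bool) -> bool:
    """An obfs-enabled server still accepts plain traffic; a plain server
    cannot make sense of a fake header sent by an obfs-enabled client."""
    return server_obfs or not client_obfs
```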

@v3aqb The first one looks better. As the hostname should be ASCII, no need to do base64 encoding.
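A client could read the first URI form roughly as follows. This is a sketch under assumptions: it takes the URI exactly as written in this thread (plain method:password, not base64-encoded user info), and the result keys are illustrative names only.

```python
from urllib.parse import parse_qs, urlsplit


def parse_ss_uri(uri: str) -> dict:
    """Parse ss://method:password@hostname:port/?obfs=http&hostname=... ."""
    parts = urlsplit(uri)
    if parts.scheme != "ss":
        raise ValueError("not an ss:// URI")
    query = parse_qs(parts.query)
    return {
        "method": parts.username,
        "password": parts.password,
        "server": parts.hostname,
        "port": parts.port,
        # Both obfs parameters are optional; None means "not given".
        "obfs": query.get("obfs", [None])[0],
        "obfs_host": query.get("hostname", [None])[0],
    }
```

For example, parse_ss_uri("ss://aes-256-cfb:secret@example.com:8388/?obfs=http&hostname=www.baidu.com") yields obfs "http" and obfs_host "www.baidu.com", while a URI without a query string leaves both as None.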

librehat commented 7 years ago

@madeye Actually, I don't think the server needs to be able to disable obfs if it supports it, since it should be fully backward compatible. We don't have to add one more config option on the server side each time a new feature is proposed (but that can also be up to each implementation).

madeye commented 7 years ago

@librehat I think there are two reasons why we need to provide an option on the server side:

1. Prevent potential security issues. If any security issue is found in the future, users can easily disable obfuscating support on their servers. And if a user doesn't want to take the risk of enabling obfuscating, they can still keep updating to the latest software, with obfuscating disabled by default.
2. Support different kinds of obfuscating. Currently, we only have HTTP obfuscating, but someday we may have more. So, it's necessary to provide an option for switching between different obfuscating implementations.

ghost commented 7 years ago

Is there a reproducible test that shows the problem in the first place, namely that an ISP will favor an HTTP request over a shadowsocks TCP request? Because I am not observing it.

ghost commented 7 years ago

@nekolab I don't believe the HTTP/1.1 spec rules out multiplexing; in other words, strict request/response semantics are only a convention. A single obfuscation at the start of the TCP stream should be sufficient.

madeye commented 7 years ago

@nfjinjing If you have a link with China Telecom, you may try experiments between 9:00 PM and 11:00 PM every day. Actually, according to some internal sources at Cisco, they deployed a similar QoS mechanism on the ASR 1000 series for China Telecom years ago.

ghost commented 7 years ago

@madeye That's very interesting. Unfortunately because of a different ISP, I can't verify it myself.

I tried the obfs branch at 3d71c2. How do I know if obfuscation is turned on? There seem to be no options to enable it, and I didn't find any HTTP headers with tcpdump.

madeye commented 7 years ago

Try --obfs http --obfs-arg www.baidu.com on the client and --obfs http on the server.

ghost commented 7 years ago

OK, some preliminary results from fast.com, measured between China Unicom and an AWS EC2 nano instance in the Tokyo region: 6 Mbit with obfs and 8 Mbit without. Note how the setup with obfs is slower? I only ran each test twice, so the difference could be insignificant.

simonsmh commented 7 years ago

ISP: China Telecom, at 4:37 pm. Without obfs: 2.9 Mbps. With obfs: 8.7 Mbps. Cool!

madeye commented 7 years ago

@yjqiang Both of these two projects share some similar ideas. Nothing is reinvented. They are just different implementations.

madeye commented 7 years ago

To end the comments like those from @yjqiang, I wrote a blog post to explain the relationship between shadowsocks and its forks.

You can find it here: https://maxlv.net/open-source-and-forking/

manjuprajna commented 7 years ago

@madeye What is the proper way to use obfs? I modified the shadowsocks-libev.service file like this:

    ExecStart=/usr/bin/ss-server -a $USER -c $CONFFILE $DAEMON_ARGS --obfs http

and modified my OpenWrt init.d file like this:

    service_start /usr/bin/ss-redir -c $CONFIG -b 0.0.0.0 -u --obfs http --obfs-host www.baidu.com

Then I restarted both the server side and the client side. It seems to be working, with no error messages, but there is no obvious improvement, maybe because I am using China Unicom.

So I have several questions:

1. What's the difference between --obfs http and --obfs tls?
2. Do I have to enable obfs for ss-tunnel? When I run ss-tunnel --help, it also lists the --obfs args. If not, I think you had better delete these misleading args from ss-tunnel.
3. Will obfs be added as an option in the JSON file?
4. Is --obfs-host required or not? Just now I did not pass --obfs-host, and it still worked. If we must pass a hostname to --obfs-host, which hostname is better? Baidu, since the ISP will never QoS Baidu?

thank you for your amazing work!

goodbest commented 7 years ago

Although there is no evidence, the port-based QoS by ISPs should also be taken into consideration in the obfs performance comparison.

So I think it's better to compare obfs performances with various ports, or at least provide your remote server port in the report.

For example, Port 80 and/or Port 443 may have higher priorities than other ports (e.g. 8388) in port-based QoS policies.

madeye commented 7 years ago

@manjuprajna

1. They are just different obfuscating implementations.
2. ss-tunnel also supports obfuscating. It's up to you whether to enable it or not.
3. It's already supported in the config file, with the options obfs and obfs_host.
4. It's optional, and you can try any hostname you like.
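As an illustration of the config file options, the snippet below generates a JSON config with obfs and obfs_host set. Only those two option names come from this thread; the other keys are the usual shadowsocks-libev ones, and the values and the helper name are placeholders.

```python
import json
from typing import Optional


def make_config(server: str, server_port: int, password: str, method: str,
                obfs: Optional[str] = None,
                obfs_host: Optional[str] = None) -> str:
    """Return a JSON config string; the obfs keys are only written when enabled."""
    cfg = {
        "server": server,
        "server_port": server_port,
        "password": password,
        "method": method,
    }
    if obfs is not None:
        cfg["obfs"] = obfs  # e.g. "http" or "tls"
        if obfs_host is not None:
            cfg["obfs_host"] = obfs_host  # optional, per the answer above
    return json.dumps(cfg, indent=4)


print(make_config("example.com", 8388, "secret", "aes-256-cfb",
                  obfs="http", obfs_host="www.baidu.com"))
```

Leaving obfs unset produces a config without either key, which matches the feature being disabled by default.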

madeye commented 7 years ago

@goodbest Good point! If you only observe port based QoS, it's not necessary to enable this feature. And it's true that many ISPs only perform port based QoS.

ghost commented 7 years ago

I ran some more tests on China Unicom, and the result is intriguing.

Local: 50Mbps ADSL
Server: AWS tokyo, nano, kernel 4.9
Test: fast.com, 4:30 pm

no obfs: 9.1 Mbps
http:    7.6 Mbps
tls:     8.5 Mbps

I was inclined to conclude that China Unicom doesn't use QoS, or at least not to a degree that is noticeable in normal usage.

The following is a bit off topic, but still relevant in terms of QoS. I did another test with the new tcp-bbr algorithm, the result is so different:

no obfs: 36 Mbps
http:    44 Mbps
tls:     32 Mbps

It still confirms that there's not much difference with and without header obfuscation, but as to whether there is QoS on China Unicom, I think the result speaks for itself.

pexcn commented 7 years ago

@nfjinjing fast.com is sometimes not very accurate; http://www.speedtest.net is more accurate.

ghost commented 7 years ago

Thanks @pexcn. Here's the result of beta.speedtest.net, with tcp-bbr:

no obfs: 71       74       71.6
http:    77.92    82.87    78.34
tls:     79.23    77.72    84.03

There is significant improvement with either http or tls, great!

Edit: Just for completeness, here's the result without tcp-bbr:

no obfs: 8.49     12.19    10.17
http:    10.69    8.48     10.26
tls:     9.54     7.99     5.67

I'm not sure how much header-based QoS is in effect here, given that the bandwidth has been reduced so much.
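With only three runs per mode, it helps to look at the mean and the spread rather than individual numbers. The sketch below summarizes the two speedtest tables above; the numbers are copied from this comment, while the function and variable names are just for illustration.

```python
from statistics import mean, stdev

# Three runs per mode, copied from the tables above.
RUNS_WITH_BBR = {
    "no obfs": [71, 74, 71.6],
    "http": [77.92, 82.87, 78.34],
    "tls": [79.23, 77.72, 84.03],
}
RUNS_WITHOUT_BBR = {
    "no obfs": [8.49, 12.19, 10.17],
    "http": [10.69, 8.48, 10.26],
    "tls": [9.54, 7.99, 5.67],
}


def summarize(runs: dict) -> dict:
    """Mean, sample standard deviation and change relative to 'no obfs'."""
    baseline = mean(runs["no obfs"])
    return {
        mode: {
            "mean": mean(values),
            "stdev": stdev(values),
            "vs_baseline": mean(values) / baseline - 1,
        }
        for mode, values in runs.items()
    }


for label, runs in (("with bbr", RUNS_WITH_BBR), ("without bbr", RUNS_WITHOUT_BBR)):
    for mode, s in summarize(runs).items():
        print(f"{label:12} {mode:8} mean={s['mean']:.2f} "
              f"stdev={s['stdev']:.2f} change={s['vs_baseline']:+.1%}")
```

If the difference between two modes is not clearly larger than the run-to-run spread, three samples are not enough to attribute it to header-based QoS.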

ualtinok commented 7 years ago

@nfjinjing Do you need tcp-bbr on both end or only server side?

ghost commented 7 years ago

@ualtinok I was conducting the tests with only the server side enabled.

Please note, in general, this is not the thread to discuss bbr.

simonsmh commented 7 years ago

How about supporting it in ss-android? Users of Surge reported better connection stability with obfs enabled. Other clients, like luci-app-shadowsocks, already support it too.

madeye commented 7 years ago

@simonsmh It's still an experimental feature. Let's have more tests before merging to other projects.

Grayon commented 7 years ago

It sounds like another ShadowsocksR; see more at ssr-obfs.

nicholascw commented 7 years ago

ShadowsocksR, though, seems much less official, and the project itself seems to lack authority. In general, shadowsocks itself is much more user-friendly.

nicholascw commented 7 years ago

@xi1024 When the application scenario, or you could say the purpose, is similar or the same, there are only so many ways to achieve the goal. ShadowsocksR itself has no patent on obfuscation and obviously would not get one. Moreover, the shadowsocks repos would not directly copy ShadowsocksR's code. What do you want to express by saying " without any acknowledge and respect"? By the way, ShadowsocksR itself did not observe the Apache 2.0 License under which the shadowsocks protocol was originally released; I now even doubt whether ShadowsocksR itself has any "acknowledge and respect".

As for creative ideas, I have to say that sharing ideas between projects is the essence of the open source spirit, and wider adoption is the best respect for the original inventor/implementer. It is never "stealing".

Mygod commented 7 years ago

Please stop commenting anything unrelated to technical details of this proposal from now on. If there's anything else you'd like to add, go somewhere else (like open a new issue) to discuss. By the way, may I say that you are here just because somebody is whining and ripping on other people makes you feel like a hero.

madeye commented 7 years ago

Please stop this kind of meaningless debate.

I really appreciate her effort in forking and improving the shadowsocks protocol, and I think her project is very cool. ShadowsocksR tried this header obfuscation first and proved that it works really well for some ISPs. That's why we borrowed her idea and built our own implementation. You can find more details in the comments above:

https://github.com/shadowsocks/shadowsocks-org/issues/26#issuecomment-269334985

Forking and merging, that's how open source works.
