XTLS / Xray-core

Xray, Penetrates Everything. Also the best v2ray-core, with XTLS support. Fully compatible configuration.
https://t.me/projectXray
Mozilla Public License 2.0

[Feature suggestion & draft] Transport Session ID #3609

Open PoneyClairDeLune opened 4 months ago

PoneyClairDeLune commented 4 months ago

Extended from #1546. I'd like to gather further opinions from others before ironing everything out. I'll likely work on this myself once it's finalized, so hopefully there won't be much complaint about unnecessarily burdening the developers, and it could serve as a gag to get some elitists shutting their [redacted] up. "No PR, no say", LMAO.

Design

Transport Session ID (TSID for short) is a customizable, randomly generated token applicable to all outbound transports that allow embedding additional headers (e.g. HTTP/2, WebSocket, HTTPUpgrade and SplitHTTP) (no one knows when Ditzy, the fully-decoupled upgrade of xSSE, could be finished, lol). TSIDs are defined globally, then consumed by whichever outbounds need them. A TSID could be initialized once upon start, or rotated at a specified interval.

To define a list of TSIDs, the GlobalTransportObject is extended to include tsid. Example of GlobalTransportObject.tsid[<TSIDMapObject>] below.

[{
    "id": "theDeerIsAlwaysHorny", // required
    "length": 16, // [4, 64], defaults to 8
    "rotate": 1800, // max valid duration for every ID in seconds, defaults to 3600
    "delta": 0 // additional optional randomized duration to make it behave less predictably
}, {
    "id": "freedomForAllOrNone"
}]
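
A minimal sketch of how a token for one TSIDMapObject entry might be generated, assuming web-safe Base64 output and assuming that out-of-range lengths are clamped rather than rejected. The `tokenLength` and `newToken` names are hypothetical and not part of Xray.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// tokenLength applies the default of 8 and the [4, 64] bounds from the proposal.
// Clamping (instead of returning an error) is an assumption of this sketch.
func tokenLength(n int) int {
	switch {
	case n == 0:
		return 8
	case n < 4:
		return 4
	case n > 64:
		return 64
	}
	return n
}

// newToken returns a random web-safe Base64 token of exactly tokenLength(n) characters.
func newToken(n int) (string, error) {
	n = tokenLength(n)
	// Each Base64 character carries 6 bits, so n characters need ceil(n*6/8) bytes.
	buf := make([]byte, (n*6+7)/8)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf)[:n], nil
}

func main() {
	tok, err := newToken(16) // "length": 16 from the example above
	if err != nil {
		panic(err)
	}
	fmt.Println(len(tok), tok)
}
```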

There are two proposals for consuming TSIDs. TSIDs could be filled in with standard Go templates inside header value strings (e.g. Bearer {{ ts.freedomForAllOrNone }}) for maximum flexibility, or an additional property called tsHeader could be used for a potential performance saving. Example of StreamSettingsObject.tsHeader<TSIDConfigObject> below.

{
    "use": "theDeerIsAlwaysHorny",
    "as": "Authorization",
    "prefix": "Bearer ",
    "suffix": null
}

Intended use

TSIDs could be used wherever some measure of persistence is needed for client transports, mainly server-side load balancing. This can benefit both collaboration efforts and for-profit proxy vendors (sadly, but I doubt they would use this anyway).

Some may argue that X-Forwarded-For or X-Real-IP is enough for that, but neither is workable in practice. A single client session could have different ports for egress, resulting in unwanted changes in X-Forwarded-For, and multiple client sessions behind NAT will be confused as one in X-Real-IP. If the proxy maintainer chooses to strip as much identifiable client information as they can, those headers will be absolute no-goes.

Stabilized load balancing & protecting exit IP reputation

Traditional measures of load balancing can cause erratic selection of exits. Since a client switching exit IPs in relatively quick succession could be seen as a bot, such use of LB will cause all affected exit IPs to be flagged, decreasing their IP reputation. Applying TSIDs solves this problem altogether.

Worry-free load-balanced streaming service unblocking

On top of IP reputation protection, one can have load-balanced streaming service proxy detection bypass (it's a mouthful, so I'll stick to "streaming service unblocking") with multiple exits, keeping each exit below the streaming services' thresholds while avoiding erratic exit switching.

Implementing high-availability

On top of stabilized load-balancing granted by TSID, servers with multiple hops can implement a stable client fail-over mechanism when any node inside the cluster fails. Client connections can simply be migrated over to a new available exit, ensuring stable connectivity.

Mini-Tor paired with VLESS encryption

Building on top of HA, when chaining multiple hops with VLESS encryption enabled, one can simply build a miniaturized Tor network: connections are protected at Guard, then securely delivered between Guard and End.

Potential problems to tackle

Most true duplex transports should be fine. However, transports in the Meek family (e.g. SplitHTTP) rely on state reconstruction and can end up sending different TSIDs in edge cases, which could result in some connectivity loss, so a measure of persistence must be applied for Meek-like transports. If a fail-over occurs, problems can also ensue due to non-existent socket IDs. Until these are solved, TSID cannot be rolled out for Meek-likes.

Fangliding commented 4 months ago

I must mention that GlobalTransportObject is deprecated

PoneyClairDeLune commented 4 months ago

Then I guess PolicyObject should be used instead? Just a simple switch to PolicyObject.tsid[<TSIDMapObject>].

mmmray commented 4 months ago

Implementing high-availability

just so I understand this correctly, the header value is tacked onto any existing transport connection, but the intent is not to do any sort of connection migration right?

Stabilized load balancing & protecting exit IP reputation

So this implies TSIDs are rolled once per user, but not per connection, is that right? For example the ID is rolled once when user enables VPN, not for each individual WebSocket connection.

can have different TSIDs sent in edge cases, due to their reliance on state reconstructions

Assuming that these session IDs always flow from client to server, one can send the same set of headers for both upload and download. If you want to do load balancing on splithttp based on consistent hashing of TSID, this seems like the most useful option.
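
One way a middleware could do the consistent selection mentioned here is rendezvous (highest-random-weight) hashing over the TSID. This is a sketch of that technique under assumptions, not something Xray or nginx ships in this form; `score`, `pickBackend` and the backend names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the (TSID, backend) pair; the zero byte separates the two inputs.
func score(tsid, backend string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(tsid))
	h.Write([]byte{0})
	h.Write([]byte(backend))
	return h.Sum64()
}

// pickBackend returns the backend with the highest score for this TSID.
// The same TSID keeps mapping to the same backend for as long as that
// backend stays in the list, regardless of list order.
func pickBackend(tsid string, backends []string) string {
	best, bestScore := "", uint64(0)
	for _, b := range backends {
		if s := score(tsid, b); best == "" || s > bestScore {
			best, bestScore = b, s
		}
	}
	return best
}

func main() {
	backends := []string{"exit-a", "exit-b", "exit-c"} // hypothetical exits
	// Upload and download requests carrying the same TSID land on the same exit.
	fmt.Println(pickBackend("q3Zx81LmPa0-_TyR", backends))
	fmt.Println(pickBackend("q3Zx81LmPa0-_TyR", backends))
}
```

When an exit is removed, only the clients that were pinned to it get reassigned, which is the property that keeps exit switching from becoming erratic.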


Also, where do you want to transport these TSIDs? If it is HTTP headers, then it will only work for some transports (and I think the changes will be very invasive), but if it is something like VLESS extended data, it cannot be read by nginx.

If it's HTTP headers, I think it will also interact poorly with mux as you have to merge multiple unrelated connections into one.

Fangliding commented 4 months ago

@RPRX What do you think?

yuhan6665 commented 4 months ago

The current implementation works with a connection-level (source IP and port) global ID. See https://xtls.github.io/development/protocols/muxcool.html#%E6%96%B0%E5%BB%BA%E5%AD%90%E8%BF%9E%E6%8E%A5-new and search for the logic around "GlobalID".

Maybe we can add it to TCP as well, and make it part of the VLESS protocol rather than the transport? Think of a TCP-level split, rather than SplitHTTP.

PoneyClairDeLune commented 4 months ago

@mmmray

just so I understand this correctly, the header value is tacked onto any existing transport connection, but the intent is not to do any sort of connection migration right?

The intention is to allow connections from a specific client to be migrated by the middleware on the server side without mischaracterizing them as coming from another client. The migration process does not happen in Xray.

So this implies TSIDs are rolled once per user, but not per connection, is that right? For example the ID is rolled once when user enables VPN, not for each individual WebSocket connection.

cough Xray is not a VPN cough

Once per client launch, correct. And the user can optionally enable rolling with a self-defined interval and jitter if they wish.

Also, where do you want to transport these TSIDs? If it is HTTP headers, then it will only work for some transports (and I think the changes will be very invasive), but if it is something like VLESS extended data, it cannot be read by nginx.

Yup, only for transports designed to be capable of reverse proxying. In the case of HTTP-based transports (WS, HTTP/2, HTTPUpgrade and SplitHTTP), it must be present in the headers to allow the middleware to conduct proper backend switching.

If it's HTTP headers, I think it will also interact poorly with mux as you have to merge multiple unrelated connections into one.

Actually, that shouldn't be much of a problem, as TSID is only intended to distinguish clients rather than individual connections. As one of its primary use cases is to minimize erratic exit switching, assigning each client a single backend should be the ideal outcome.

PoneyClairDeLune commented 4 months ago

@yuhan6665 It's not intended to be used in multiplexing, as those are better put into somewhere higher in the hierarchy. TSIDs are to be consumed by the middlewares, not Xray backends.

I had an idea about a USID (universal session ID) a few years back: Xray could add a similar mechanism when matching multiple outbounds, persisting a single outbound to avoid similar erratic behaviour. But there might be multiple other components providing similar functionality, so I eventually scrapped that... Though randomized IPv6 exits could benefit from some session ID mechanism, allowing a persistent dedicated IPv6 address to be assigned to every client.

Fangliding commented 4 months ago

I think you can set some variables during core starting, and then call these variables in the config file like this

// somewhere
"variables": [
{
    "type": "str",
    "randtype": "randint",
    "range": "114-514",
    "name": "MY_TSID",
    "prefix": "foo",
    "suffix": null
}
]

It could be foo123, foo456 or something else, and then be used in the headers

"wsSettings": {
  "headers": {
    "tsid": $MY_TSID
  }
}

They can also have more uses, such as a random path and mux concurrency

PoneyClairDeLune commented 4 months ago

@Fangliding That will incur quite a bit of complexity, perhaps even requiring modifications within the JSON parser itself, but yeah, it would be quite flexible then. Extending your thoughts, there would be quite a few uses for all these randomized values.

{
    "globalVariables": [{
        "key": "theDeerIsAlwaysHorny", // the only required field
        "cast": "text", // "text", "uint", "u8/16/32/64", "int", "i8/16/32/64"; default to "text"
        "type": "b64", // "b64" for web-safe Base64, "int" for integer, only valid when casted to "text"; default to "b64"
        "min": 0, // inclusive, only valid for integers, default to 0
        "max": 65536, // exclusive, only valid for integers, default to 15
        "prefix": "Bearer ", // only valid when casted to "text"
        "suffix": null  // only valid when casted to "text"
    }, {
        "key": "freedomForAllOrNone"
    }]
}

PoneyClairDeLune commented 4 months ago

There are several methods of replacements I can currently think of without modifying the JSON parser:

  1. Via Go string templates, but this is restricted to string values only, and incurs unnecessary calls whenever strings are referenced unless its use is reasonably restricted.
  2. Provide a separate config file for one-shot replacements upon launch, but this cannot provide dynamically rolling random values.
  3. Combine 1 and 2 for maximum flexibility, complexity and resource consumption.

PoneyClairDeLune commented 4 months ago

Maybe there should also be a distribution curve for random integer values? Not sure though.

PoneyClairDeLune commented 3 months ago

Sifted through Xray's documentation. For randomized string values, only these places make sense.

So I'm thinking about adding a global RandomizerConfigObject... The randomized string source config could be applied via Go's template easily with some performance penalty (but not sure how much). Not sure about the randomized integers, but my proposal is here as well.

{
    "randomizer": { // RandomizerConfigObject
        "string": [{ // RandomizedStringObject
            "key": "theDeerIsAlwaysHorny", // required
            "type": "int", // "b64" (default) or "int"
            "min": 0, // only valid as "int" type, default to 0
            "max": 16, // only valid as "int" type, default to 16
            "roll": 1650, // roll duration in seconds, default to 3300; 0 to disable
            "delta": 300 // roll randomizer in seconds, default to 600; 0 to disable
        }, {
            "key": "freedomForAllOrNone"
        }],
        "int": [{ // RandomizedIntegerObject
            "key": "cutenessDenialIsFutile", // required
            "cast": "int", // integer types, shorten when possible (e.g. uint, u8)... may not be needed, but let's see...
            "min": 0, // default to 1024
            "max": 16, // default to 65536
            "roll": 1650, // roll duration in seconds, default to 3300; 0 to disable
            "delta": 300 // roll randomizer in seconds, default to 600; 0 to disable
        }]
    }
}

An example of using the above randomized string source in transports, namely WebSocket.

{
    "host": "deerhorny.com",
    "path": "/example/{{ r.theDeerIsAlwaysHorny }}",
    "headers": {
        "Authorization": "Bearer {{ r.freedomForAllOrNone }}"
    }
}
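
For a rough idea of how such strings could be rendered, here is a minimal sketch using Go's `text/template`. Note the assumption: it writes placeholders in the leading-dot form (`{{ .r.key }}`), which looks values up in the data passed to the template; whether the bare `{{ r.key }}` spelling from the example would be supported is left open. The `render` helper and the token values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render fills randomized values into one config string.
// Unknown keys are reported as errors instead of rendering as "<no value>".
func render(value string, r map[string]string) (string, error) {
	t, err := template.New("value").Option("missingkey=error").Parse(value)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]any{"r": r}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Made-up current values for the two randomized string sources.
	r := map[string]string{
		"theDeerIsAlwaysHorny": "7",
		"freedomForAllOrNone":  "q3Zx81Lm",
	}
	path, err := render("/example/{{ .r.theDeerIsAlwaysHorny }}", r)
	if err != nil {
		panic(err)
	}
	auth, err := render("Bearer {{ .r.freedomForAllOrNone }}", r)
	if err != nil {
		panic(err)
	}
	fmt.Println(path) // prints "/example/7"
	fmt.Println(auth) // prints "Bearer q3Zx81Lm"
}
```

Parsing the template on every call is the simplest form; caching the parsed template per config string and only re-executing it when a value rolls would likely reduce the performance penalty mentioned earlier.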