Open yaroslavros opened 1 month ago
The PvD can certainly list out multiple proxies that are eligible to be used, but it doesn't directly give a priority order for which one to use with preference.
We could either leave it up to clients to decide which order to try, or have some flag or number indicating preference.
SVCB records also do allow providing different services with priority, though... so if we are going via DNS, we can fail over that way.
The PvD well-known url gives information about other related processes. The client can probably trust this source of information, as the proxy is preconfigured.
When the PvD comes from a network, I think the decision should be made by the client. In any case, the client should not trust the advertised proxies, but should instead use this information to choose among its pre-configured proxies.
I know that this PAC feature is used, but I do not know how important the order is, or whether it just means "Here are 3 proxies, use any that you succeed in connecting to".
Could there be a key for "default" meaning "if all else fails...", which could have a value of either a proxy, or a token like "DIRECT"?
There are multiple scenarios in which an enterprise wants to signal preferences to the client from the network:
In my experience, a strict order of proxy preferences is widely relied on in enterprise setups.
Perhaps worth discussing at IETF121.
Looks like our future is going to look like this:
{
  "proxies": [
    {
      "protocol": "http-connect",
      "proxy": "proxy.example.org:80",
      "priority": 10
    },
    {
      "protocol": "connect-udp",
      "proxy": "https://proxy.example.org/masque{?target_host,target_port}",
      "priority": 20
    }
  ]
}
Developers will have to sort based on the priority key.
Can't we rely on the fact that JSON lists are ordered, and just say that the proxies are listed in decreasing priority?
That's a good point. However, does it need to support round robin for load balancing?
That could be specified with priority keys with the same value.
Perhaps the priority keys could be optional. If absent then just use the array order. In more complex cases the priority key can be used.
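To make the "optional priority, otherwise array order" idea concrete, here is a minimal sketch of how a client might turn the list into fallback tiers. It rests on assumptions that the thread has not settled: a lower "priority" number is tried first, entries sharing a priority form one tier (candidates for round robin), and entries without a "priority" key are tried afterwards in array order. The function name `fallback_tiers` is made up for illustration.

```python
def fallback_tiers(proxies):
    """Group proxy entries into an ordered list of fallback tiers.

    Assumptions (not decided in the discussion):
    - a lower "priority" value is preferred;
    - equal priorities share a tier (e.g. for round robin);
    - entries without "priority" follow in array order, one per tier.
    """
    prioritized = {}    # priority value -> entries, in array order
    unprioritized = []  # single-entry tiers, in array order
    for proxy in proxies:
        if "priority" in proxy:
            prioritized.setdefault(proxy["priority"], []).append(proxy)
        else:
            unprioritized.append([proxy])
    tiers = [prioritized[value] for value in sorted(prioritized)]
    return tiers + unprioritized


if __name__ == "__main__":
    example = [
        {"proxy": "a.example.org:80", "priority": 20},
        {"proxy": "b.example.org:80", "priority": 10},
        {"proxy": "c.example.org:80", "priority": 10},
        {"proxy": "d.example.org:80"},
    ]
    for tier in fallback_tiers(example):
        print([entry["proxy"] for entry in tier])
```

With no "priority" keys at all, every entry becomes its own tier and the result is simply the array order.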
I think for a network-provisioned proxy list it would make more sense to provide destination-centric proxy priority lists. Along the lines of:
{
  "proxies": [
    {
      "identifier": "legacy",
      "protocol": "http-connect",
      "proxy": "proxy.example.org:80"
    },
    {
      "identifier": "masque",
      "protocol": "connect-ip",
      "proxy": "https://proxy.example.org/masque{?target_host,target_port}"
    }
  ],
  "proxy_destinations": [
    {
      "matchPorts": [80, 443],
      "proxies": ["legacy", "DIRECT"]
    },
    {
      "matchDomains": ["*.internal", "*.intranet"],
      "proxies": ["masque", "legacy"]
    }
  ]
}
It's not uncommon for enterprise PAC files to carry thousands of match items. Duplicating them across multiple proxy definitions feels like unnecessary bloat; it makes processing more complex and the whole structure more error-prone.
Traditionally, proxy load balancing with PAC files is accomplished with DNS round robin or with separate sets of PAC files randomly distributed to clients. Some people also use hierarchical proxies for load-balancing purposes...
Or use null instead of "DIRECT" to avoid error-prone reserved identifiers.
Am I getting this right? So "proxy_destinations" is a set of rules: a rules engine takes a given URL, runs it through the rules, and gets back an identifier for a given proxy. That's similar to a case statement or an if/then/else chain used in a JS PAC file.
However, what about the protocol? Who really decides which protocol should be used? Is that decided based on the destination URL, or by the app/browser, perhaps hinted by the server?
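As a rough illustration of that reading, the rules-engine idea could be sketched as below. This is not the draft's algorithm. It assumes first-match-wins ordering, that all match keys present in a rule must match, shell-style wildcard matching for "matchDomains", and an empty result when nothing matches. The function name `select_proxies` is invented for the example, and the config is the one proposed earlier in the thread.

```python
from fnmatch import fnmatchcase

# Config copied from the proposal above.
CONFIG = {
    "proxies": [
        {"identifier": "legacy", "protocol": "http-connect",
         "proxy": "proxy.example.org:80"},
        {"identifier": "masque", "protocol": "connect-ip",
         "proxy": "https://proxy.example.org/masque{?target_host,target_port}"},
    ],
    "proxy_destinations": [
        {"matchPorts": [80, 443], "proxies": ["legacy", "DIRECT"]},
        {"matchDomains": ["*.internal", "*.intranet"],
         "proxies": ["masque", "legacy"]},
    ],
}


def select_proxies(config, host, port):
    """Return the ordered fallback list for a destination.

    Assumptions: the first matching rule wins; "DIRECT" is passed
    through as a plain string; no match yields an empty list.
    """
    by_id = {entry["identifier"]: entry for entry in config["proxies"]}
    for rule in config["proxy_destinations"]:
        ports = rule.get("matchPorts")
        domains = rule.get("matchDomains")
        if ports is not None and port not in ports:
            continue
        if domains is not None and not any(
            fnmatchcase(host.lower(), pattern.lower()) for pattern in domains
        ):
            continue
        return [name if name == "DIRECT" else by_id[name]
                for name in rule["proxies"]]
    return []


if __name__ == "__main__":
    for host, port in [("example.com", 443), ("wiki.internal", 8080)]:
        chain = select_proxies(CONFIG, host, port)
        print(host, port,
              [c if c == "DIRECT" else c["identifier"] for c in chain])
```

In this sketch the protocol simply comes along with the selected proxy entry, which is one possible answer to the "who decides the protocol" question, not necessarily the intended one.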
Looks like our future is going to look like this:
{ "proxies": [ { "protocol": "http-connect", "proxy": "proxy.example.org:80", "priority": 10 }, { "protocol": "connect-udp", "proxy": "https://proxy.example.org/masque{?target_host,target_port}", "priority": 20 } ] }
Developers will have to sort based on the priority key.
May I add one more layer of complexity to this idea? Many enterprises use the client's IP to determine which group of proxies to prefer, then set failover proxies in case the primaries are unavailable.
Could we add source subnets to the priority, giving a proxy config a higher priority if the client is in one of those subnets, and a default priority for all other sources? This would allow the client to know which proxy to prefer based on location and still be able to build a failover list.
I think proxy-specific priority is too crude, as priority might differ depending on the destination. It also does not allow us to communicate whether the client may fall back to a direct connection. I plan to submit a PR by the end of this week describing my previous suggestion with a list of proxies per destination group.
I am not a big fan of encoding client IP restrictions in the PvD for a number of reasons:
If client-specific configuration is needed, the PvD contents should be selected by the PvD hosting service based on the client IP it sees, similarly to how it is done today for PAC files. I'll submit corresponding text in a PR to clarify this.
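For what it's worth, the server-side selection could be as small as the sketch below. The subnets, document names, and the `pvd_for_client` function are all hypothetical; how a real PvD hosting service maps clients to documents is deployment-specific.

```python
import ipaddress

# Hypothetical mapping from client subnet to the PvD document to serve.
PVD_BY_SUBNET = [
    (ipaddress.ip_network("10.1.0.0/16"), "pvd-site-a.json"),
    (ipaddress.ip_network("10.2.0.0/16"), "pvd-site-b.json"),
]
DEFAULT_PVD = "pvd-default.json"


def pvd_for_client(client_ip):
    """Pick the PvD document for a client, falling back to a default."""
    address = ipaddress.ip_address(client_ip)
    for network, document in PVD_BY_SUBNET:
        # An IPv6 address is never "in" an IPv4 network, so mixed
        # address families fall through to the default.
        if address in network:
            return document
    return DEFAULT_PVD


if __name__ == "__main__":
    for ip in ["10.1.4.7", "10.2.0.9", "192.0.2.1"]:
        print(ip, pvd_for_client(ip))
```

This keeps client IP logic out of the PvD document itself: each client only ever sees the list that applies to it.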
Opened a new issue to discuss ClientIPs to stay on topic. I agree on the priorities.
How about some kind of specific syntax to allow round robin or load balancing if needed, like:
"proxies": ["proxy1" OR "proxy2", "proxy3"]
The browser or OS should randomly choose one of the two before the comma in order to promote session reuse. If the chosen proxy is unavailable, the other is attempted before moving on to proxy3.
"proxies": ["proxy1" AND "proxy2", "proxy3"]
The browser or OS should round robin between proxy1 and proxy2, use the remaining proxy if one is unavailable, then move to proxy3 if both are unavailable.
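A sketch of how a client might implement those two behaviours follows. Since the OR/AND notation above is not valid JSON, the example assumes a hypothetical encoding where each tier is an object with a "mode" ("or", "and", or "single") and a list of "members". The `TierSelector` class and these key names are illustrative only.

```python
import random


class TierSelector:
    """Produce the order in which proxies are attempted per request.

    "or" tiers: pick a random order once and keep it, so sessions are
    reused. "and" tiers: rotate the starting member on every request
    (round robin). Unavailable proxies are skipped by trying the next
    candidate in the returned list.
    """

    def __init__(self, tiers, rng=None):
        self.tiers = tiers
        self.rng = rng if rng is not None else random.Random()
        self.turn = 0          # number of requests served so far
        self.or_orders = {}    # tier index -> sticky random order

    def candidates(self):
        order = []
        for index, tier in enumerate(self.tiers):
            members = list(tier["members"])
            if tier["mode"] == "or":
                if index not in self.or_orders:
                    self.rng.shuffle(members)
                    self.or_orders[index] = members
                members = self.or_orders[index]
            elif tier["mode"] == "and":
                shift = self.turn % len(members)
                members = members[shift:] + members[:shift]
            order.extend(members)
        self.turn += 1
        return order


if __name__ == "__main__":
    tiers = [
        {"mode": "and", "members": ["proxy1", "proxy2"]},
        {"mode": "single", "members": ["proxy3"]},
    ]
    selector = TierSelector(tiers)
    print(selector.candidates())
    print(selector.candidates())
```

In both modes proxy3 is only reached after every member of the first tier has been tried.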
The result of a proxy lookup in PAC files may include multiple proxies to fall back to, e.g.
PROXY proxy1.example.com; PROXY proxy2.example.com
which instructs the client to try proxy1 first, fall back to proxy2 if proxy1 is not available, and not communicate directly with the destination if both proxies are down. Or it can tell the client to go direct if the proxies are not available, e.g.
PROXY proxy1.example.com; PROXY proxy2.example.com; DIRECT
Potentially, proxy fallback can be achieved with PvD if multiple proxies are happy to take a given destination (though the question of priorities is unclear); however, PvD does not tell the client whether or not it may try direct communication.
It feels to me that this should be in scope so that PvD would be able to replace all reasonable use cases of PAC files.
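For reference, the PAC behaviour described here can be sketched as a small parser that turns a result string into an ordered fallback chain and reports whether direct access is permitted. It only handles the PROXY and DIRECT forms shown in this issue; the function names are illustrative and malformed entries are silently skipped, which is an assumption rather than specified behaviour.

```python
def parse_pac_result(result):
    """Parse a PAC result such as "PROXY a; PROXY b; DIRECT".

    Returns a list of (kind, target) tuples in fallback order, where
    target is None for DIRECT. Malformed entries are skipped.
    """
    chain = []
    for entry in result.split(";"):
        parts = entry.split()
        if not parts:
            continue
        kind = parts[0].upper()
        if kind == "DIRECT":
            chain.append(("DIRECT", None))
        elif len(parts) == 2:
            chain.append((kind, parts[1]))
    return chain


def allows_direct(chain):
    """True if the chain lets the client fall back to a direct connection."""
    return any(kind == "DIRECT" for kind, _ in chain)


if __name__ == "__main__":
    strict = parse_pac_result(
        "PROXY proxy1.example.com; PROXY proxy2.example.com")
    relaxed = parse_pac_result(
        "PROXY proxy1.example.com; PROXY proxy2.example.com; DIRECT")
    print(strict, allows_direct(strict))
    print(relaxed, allows_direct(relaxed))
```

A PvD equivalent would need to carry the same two pieces of information: the order of the chain and whether it ends in a direct attempt.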