Open meysholdt opened 3 years ago
Excellent idea - but we really don't have time for this right now. We'll want to revisit workspace cluster selection once we make a decision on multi-meta.
Prior Art
- Collect 'gcping' data from the dashboard by @jankeromnes .
FYI, that proposal is to temporarily gather ping times to all possible GCP regions, in order to decide "where should we create a brand new cluster next?" (and then stop collecting ping times, make a decision, and create the cluster)
The proposal was not to collect ping times in order to decide "which workspace cluster should be used right now?" -- doesn't GCP's load balancer already do that automatically? How does the US vs EU selection work right now? (I assume it's not some custom code we wrote, but GCP selecting a reasonable cluster automatically -- I would hope this would also work with 3 or more clusters without requiring us to write custom code for this)
I think the selection algorithm is broken. I'm from India, where the nearest location is the EU, but whenever I start a new workspace it gets created in the US region.
I also tried with a VPN from Vienna; that time the workspace was created in the EU region.
🤔
⚠️ Just to re-iterate: This issue suspiciously sounds like we want to re-implement something as standard as a load balancer.
I don't think we want to implement and maintain custom code that measures latency, caches it, and acts upon this data.
If possible, it would be much preferable to let Google Cloud pick the best workspace cluster automatically(!)
Inspiration: Best practices for Compute Engine regions selection > Use Cloud Load Balancing and Cloud CDN:
Cloud Load Balancing, such as HTTP(S) load balancing, TCP, and SSL proxy load balancing, let you automatically redirect users to the closest region where there are backends with available capacity.
The reason we need to build/maintain something ourselves is that the StartWorkspace request, which would need to be regional, does not go through a regional load balancer, because it's issued from server to ws-manager, and not from the (regional) user's browser.
The minimal steps to make automatic cluster choices would be:
- make sure that ws-eu18.gitpod.io does not answer with 404
- add a getAllRegions function to WorkspaceManagerClientProvider which returns a list of ping URLs and names
- change the createWorkspace and startWorkspace calls on server so that they take a cluster preference, which would then be passed in via the ExtendedUser and become an admission preference. Note that this way the cluster preference plays nicely with the score and cluster status.
Offline we discussed the option of making the workspace cluster (or region) choice explicit on the dashboard. By default we'd select the cluster with the lowest RTT (as outlined above).
However, focusing on the individual cluster instead of a region has several drawbacks:
Instead, we could introduce a region to clusters. We'd introduce a new region field as an admission constraint and on the ws-manager-bridge API. New cluster registrations could provide the region when they're registered. We'd assume that from a latency perspective all regional clusters are equivalent, i.e. a measurement to one cluster is equivalent to that of another within the same region.
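The region idea could be sketched roughly as follows. This is only an illustration under assumptions: the `WorkspaceCluster` shape, the `pickRegion` and `clustersInRegion` helpers, the region names and the cluster `eu19` are hypothetical and not taken from the ws-manager-bridge API.

```typescript
// Hypothetical shape of a registered cluster; `region` is the field proposed above.
interface WorkspaceCluster {
  name: string;
  region: string;
}

// Latency reported by the client, keyed by region name, in milliseconds.
type RegionLatencies = Record<string, number>;

// Pick the region with the lowest measured latency, or undefined if there is no data.
function pickRegion(latencies: RegionLatencies): string | undefined {
  let best: string | undefined;
  for (const region of Object.keys(latencies)) {
    if (best === undefined || latencies[region] < latencies[best]) {
      best = region;
    }
  }
  return best;
}

// All clusters of a region are treated as equivalent from a latency perspective,
// so the preference is a set of clusters rather than a single one.
function clustersInRegion(all: WorkspaceCluster[], region: string): WorkspaceCluster[] {
  return all.filter((c) => c.region === region);
}

const registered: WorkspaceCluster[] = [
  { name: "eu18", region: "europe" },
  { name: "eu19", region: "europe" }, // hypothetical second EU cluster
  { name: "us07", region: "north-america" },
];
const preferredRegion = pickRegion({ europe: 45, "north-america": 140 });
// Fall back to all clusters when no measurement is available.
const candidates = preferredRegion ? clustersInRegion(registered, preferredRegion) : registered;
console.log(candidates.map((c) => c.name)); // eu18 and eu19
```

Within the chosen region, the existing score and cluster status could still decide which concrete cluster is used.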
Not sure why this got labeled "platform". The enhancements would mostly need to happen in components owned by the meta team.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is still not fixed 🤔
From India it always chooses US clusters instead of EU ones.
Nah. Just get the coordinates of the user via their IP address and pick the nearest server. Every server should be located in a city. AFAIK Gitpod is running on GCP. Moreover, many cloud providers like Cloudflare Pages/Workers already add the IP and lat/long to the HTTP request headers 🤭.
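For illustration, the "nearest server by coordinates" suggestion could look roughly like the sketch below. It assumes the user's latitude and longitude are already known (how they are obtained is out of scope here); `GeoPoint`, `haversineKm`, `nearestSite` and the coordinates are hypothetical and only approximate, not the actual locations of Gitpod's clusters.

```typescript
interface GeoPoint {
  lat: number; // degrees
  lon: number; // degrees
}

// Great-circle distance in kilometres using the haversine formula.
function haversineKm(a: GeoPoint, b: GeoPoint): number {
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const earthRadiusKm = 6371; // mean Earth radius
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.pow(Math.sin(dLat / 2), 2) +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.pow(Math.sin(dLon / 2), 2);
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(h));
}

// Return the name of the site closest to the user, or undefined if there are no sites.
function nearestSite(user: GeoPoint, sites: Record<string, GeoPoint>): string | undefined {
  let best: string | undefined;
  let bestKm = Infinity;
  for (const name of Object.keys(sites)) {
    const km = haversineKm(user, sites[name]);
    if (km < bestKm) {
      best = name;
      bestKm = km;
    }
  }
  return best;
}

// Approximate, illustrative coordinates only.
const exampleSites: Record<string, GeoPoint> = {
  eu: { lat: 50.4, lon: 3.8 }, // roughly Belgium
  us: { lat: 45.6, lon: -121.2 }, // roughly Oregon
};
const mumbai: GeoPoint = { lat: 19.1, lon: 72.9 };
console.log(nearestSite(mumbai, exampleSites)); // prints eu
```

Geographic distance is only a proxy for network latency, so this would complement rather than replace actual measurements.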
:wave: @geropl reopening, perhaps something we can discuss to see if it can be included in an iteration early next year?
Context: https://github.com/gitpod-io/gitpod/issues/5534#issuecomment-914967098
Problem Statement
We currently have workspace clusters in one region in the EU and one region in the US. To offer service at a good latency (e.g. < 100ms), we will need more clusters, maybe as many as one or two per continent. See https://gcping.com/ for your personal latency to every Google Cloud region. See the GCP network map for available regions and the connections between them.
Prior Art
Proposed Solution
The user's web browser should measure the latency for every available workspace cluster and send the measurements to the gitpod-server, so that the server can make an informed decision about what workspace-cluster is best for the user.
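A minimal sketch of the client-side measurement, under assumptions: `measureLatencies` and the `Ping` type are hypothetical names, the request function is injected so the sketch does not depend on a particular HTTP client, and taking the best of a few attempts is one possible choice rather than something the proposal prescribes.

```typescript
// Performs one request against a ping URL; rejects if the cluster is unreachable.
type Ping = (url: string) => Promise<void>;

// Measure the round-trip time to each cluster's ping URL, in milliseconds.
// The best of `attempts` tries is kept, which reduces the impact of one-off
// spikes such as connection setup on the first request.
async function measureLatencies(
  pingUrls: Record<string, string>, // cluster name -> ping URL
  ping: Ping,
  attempts = 3,
): Promise<Record<string, number>> {
  const result: Record<string, number> = {};
  // Clusters are measured one after another so the requests do not compete.
  for (const name of Object.keys(pingUrls)) {
    let best = Infinity;
    for (let i = 0; i < attempts; i++) {
      const start = Date.now();
      try {
        await ping(pingUrls[name]);
        best = Math.min(best, Date.now() - start);
      } catch {
        // Failed attempt: ignore it and try again.
      }
    }
    // Clusters that never answered are left out of the report.
    if (best !== Infinity) {
      result[name] = best;
    }
  }
  return result;
}
```

In a browser, `ping` could be something like `(url) => fetch(url, { cache: "no-store" }).then(() => undefined)`; the ping endpoints would then need to allow cross-origin requests. The resulting object is what would be sent to the gitpod-server.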
Considerations
Proposed Design Choices:
Example Flow 1:
{"cache-key": "FJJDSKD", "clusters": {"us07": "https://us07.gitpod.io/ping", "sing01": "https://sing01.gitpod.io/ping" } }
{"us07": 230, "sing01": 60}
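Given the two payloads of Example Flow 1, the server-side decision could be sketched as below. The `ClusterList` and `Measurements` types and the `chooseCluster` function are hypothetical names; only the payload shapes come from the example, and ignoring unknown or invalid entries is an assumption made because the data is supplied by the client.

```typescript
// Shape of the cluster list sent to the browser (first payload above).
interface ClusterList {
  "cache-key": string;
  clusters: Record<string, string>; // cluster name -> ping URL
}

// Shape of the measurements sent back by the browser (second payload above).
type Measurements = Record<string, number>; // cluster name -> latency in ms

// Return the offered cluster with the lowest reported latency, or undefined
// if no usable measurement exists.
function chooseCluster(offered: ClusterList, measured: Measurements): string | undefined {
  let best: string | undefined;
  for (const name of Object.keys(measured)) {
    const ms = measured[name];
    const known = Object.prototype.hasOwnProperty.call(offered.clusters, name);
    // Skip clusters that were never offered and values that cannot be latencies.
    if (!known || typeof ms !== "number" || !isFinite(ms) || ms < 0) continue;
    if (best === undefined || ms < measured[best]) best = name;
  }
  return best;
}

const offeredList: ClusterList = JSON.parse(
  '{"cache-key": "FJJDSKD", "clusters": {"us07": "https://us07.gitpod.io/ping", "sing01": "https://sing01.gitpod.io/ping"}}',
);
const reported: Measurements = JSON.parse('{"us07": 230, "sing01": 60}');
console.log(chooseCluster(offeredList, reported)); // prints sing01
```

The result would then be treated as a preference, so that the existing score and cluster status can still override it.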
Example Flow 2:
Example Flow 3: