jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Count repos across federation members in quota #1461

Open minrk opened 2 years ago

minrk commented 2 years ago

This is an alternative to #1460, also for https://github.com/jupyterhub/mybinder.org-deploy/issues/2143

Unlike #1460, it keeps all the definitions of quotas and such, except:

  1. repo counts are added to health check reports (doesn't require any additional k8s API calls, since we already collect them for the count)
  2. each federation member collects repo counts from the health endpoint of each other federation member
  3. repo_quota is measured against the sum of all federation members, instead of only the current instance
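
A minimal sketch of the aggregation described in steps 1-3, not the code in this PR. It assumes each member's health report carries a `repo_counts` mapping from repo URL to running pod count; that field name, both function names, and the `>=` comparison are assumptions made for illustration.

```python
from collections import Counter


def total_repo_counts(local_counts, member_reports):
    """Sum per-repo pod counts from this instance and every other member.

    local_counts: dict of repo URL -> pod count on the current instance.
    member_reports: health-check payloads (dicts) from the other members,
        each assumed to hold a "repo_counts" dict (hypothetical field name).
    """
    total = Counter(local_counts)
    for report in member_reports:
        # A member that reports no counts contributes nothing.
        total.update(report.get("repo_counts", {}))
    return total


def over_quota(repo_url, local_counts, member_reports, repo_quota):
    """Return True if the federation-wide count has reached repo_quota."""
    total = total_repo_counts(local_counts, member_reports)
    # Assumption: the quota counts as reached at >=, not only when exceeded.
    return total[repo_url] >= repo_quota
```

With 3 local pods for a repo and 4 on another member, the count checked against `repo_quota` would be 7 rather than 3.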

This is quite a bit simpler than #1460, but potentially more costly (I'm not sure)

Additionally, I placed the binder repo_url in pod annotations, for easier lookup.
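
To illustrate why the annotation makes lookup easier, here is a rough sketch of counting repos from an already-fetched pod list. The annotation key, the function name, and the plain-dict pod shape are all hypothetical stand-ins, not what this PR actually uses.

```python
from collections import Counter

# Hypothetical annotation key, for illustration only.
REPO_URL_ANNOTATION = "example.org/repo_url"


def count_repos(pods):
    """Count running pods per repo URL from a list of pod dicts.

    Each pod is assumed to look like {"metadata": {"annotations": {...}}}.
    Pods without the annotation are skipped.
    """
    counts = Counter()
    for pod in pods:
        annotations = pod.get("metadata", {}).get("annotations") or {}
        repo_url = annotations.get(REPO_URL_ANNOTATION)
        if repo_url:
            counts[repo_url] += 1
    return counts
```

Because the counts come from a pod list that has already been fetched, a lookup like this needs no extra k8s API calls.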

includes #1459

choldgraf commented 2 years ago

When you say "costly" do you mean in terms of infrastructure costs? My intuition is that in the long term, our people costs are still much higher than our cloud costs, and so I am generally +1 on taking "simpler approaches that are a bit costlier in cloud".

This seems like a good iterative step forward on what we currently have to me.

minrk commented 2 years ago

> When you say "costly" do you mean in terms of infrastructure costs?

I mean performance-wise (making things slower, not costing more money). This approach moves larger amounts of information around and possibly makes more k8s API calls (though I don't think so, since it piggy-backs on the same calls used by the health check, which federation-redirect already calls every 10s).

On the other hand, #1460 results in additional HTTP requests for every build, so that might slow things down itself.

choldgraf commented 2 years ago

Without understanding the technical details that much, my vote is for whatever is conceptually simplest, and has the least amount of maintenance burden over time. It seems like that is this PR?