Closed eseliger closed 2 months ago
@eseliger: I wouldn't store an extra column in the repo table, but rather normalize certain parts of the external_service.config
field into separate columns (i.e. url
, in this case).
Is there a delay between an update to the external_services.config
field and the URL being used in gitserver? If so, how long is it usually? Or would gitserver also read the URL from the DB? And if so, it probably currently reads the external_service_repos.clone_url
field? Is that updated atomically when external_services.config
is updated?
Also, what happens currently when a repo is contained in 3 external services, one specifies https
, one ssh
, and one no value? 😬
Is there a delay between an update to the external_services.config field and the URL being used in gitserver?
Currently, yes. The syncer needs to run and update the external_service_repos.clone_url
column.
If so, how long is it usually?
There's a default in site config, but it's configurable by site admins. repoListUpdateInterval
Or would gitserver also read the URL from the DB? And if so, it probably currently reads the external_service_repos.clone_url field?
gitserver already reads the clone URL from the DB, but it currently comes from the external_service_repos.clone_url
column. #19420 would change that to read it from the external_services
table and construct it at read time with the token.
Is that updated atomically when external_services.config is updated?
Yes, see above, by the syncer (it takes a while).
Also, what happens currently when a repo is contained in 3 external services, one specifies https, one ssh, and one no value? 😬
We currently pick a random one in gitserver, but we could easily change this logic to incorporate some heuristics. What do you think those should be?
Update: this has been worked around and now the deprecation won't break this query anymore. We likely still want to improve the performance of this query, but I think for reasonably large customer instances it's not too bad. (Was 13s on dotcom if I remember correctly, where we have a few million repos).
May be worth adding an index to the clone_url column then.
@eseliger Is this still relevant?
probably but it hasn't fallen on our feet in forever, so let's close it 🙂
Currently, this store method in batch changes requires to be able to read the
external_service_repos.clone_url
to determine whether a repo is cloned using SSH. Since that column may be encrypted in the future, a text search over it is not future-proof. Also, it is slow on Sourcegraph.com. Adding a separate column to therepo
table that stores aboolean
whether it uses SSH should help.