Azure / acr-cli

Command Line Tool for interacting with Azure Container Registry Images
MIT License
54 stars 38 forks source link

Make the number of repository fetched at once configurable to handle large registries #353

Closed JRBANCEL closed 4 weeks ago

JRBANCEL commented 1 month ago

Purpose of the PR From an empirical observation, listing the repositories is O(# of OCI objects) and not O(# of repositories). Therefore, what happens when a registry contains repositories with many objects (like hundred of thousands), listing repositories in batch of 100 breaks with an HTTP 502 because the gateway is taking too long list the repositories.

This PR makes the batch size configurable such that these scenarios can be handled. Setting it to a smaller number like 5 just works as the total time spent in the gateway to list fewer repositories is less than the timeout.

I followed the concurrency flag pattern.

JRBANCEL commented 1 month ago

@microsoft-github-policy-service agree company="UiPath"

JRBANCEL commented 1 month ago

@Wwwsylvia, @estebanreyl, @northtyphoon, @sajayantony, @shizhMSFT, @wangxiaoxuan273, @wju-MSFT can someone PTAL?

Wwwsylvia commented 1 month ago

Code LGTM. But I'm wondering if we really need this.

listing repositories in batch of 100 breaks with an HTTP 502 because the gateway is taking too long list the repositories.

Getting 502 error when listing repositories sounds like a temporary server-side issue. What do you think @northtyphoon @estebanreyl.

JRBANCEL commented 1 month ago

Code LGTM. But I'm wondering if we really need this.

listing repositories in batch of 100 breaks with an HTTP 502 because the gateway is taking too long list the repositories.

Getting 502 error when listing repositories sounds like a temporary server-side issue. What do you think @northtyphoon @estebanreyl.

In our ACR, it is not temporary. It just takes too long and timeouts as it seems proportional to the number of tags in a repository.

estebanreyl commented 1 month ago

This change seems perfectly reasonable to me, I see no reason to not approve it. The reason why repository listing might be slow at times when there are a lot of repos is due to an inefficiency on the server side on how we list repos for which we have a longstanding work item. I won't get too deep into the details here but in essence if there are a lot of repos, we can run into a noticeable delay when querying the repo metadata. It shouldn't be resulting in 502s however (maybe there is some specific network config leading to that, @JRBANCEL are you using a proxy / firewall?). I went over our logs and don't see any 502s returned for our catalog API.

JRBANCEL commented 1 month ago

It shouldn't be resulting in 502s however

To be honest, I can't find that log anymore. But, yes it was some Gateway error regarding a timeout. We don't use a proxy, this was seen from within an ACR Task.

JRBANCEL commented 1 month ago

Also PTAL at https://github.com/Azure/acr-cli/pull/354 when you get a second :pray: