Closed leofang closed 10 months ago
cc: @jakirkham
I'm looking into an issue with one of the Cloudflare caches. It seems that it only caches .tar.bz2 files (we forgot to add .conda when we started putting conda files in the repo for conda-forge), so .conda downloads would likely be much slower.
Ah ok. This makes much more sense. Thanks Carl! 🙏
Recently saw an issue where a package, libnvjitlink, uploaded binaries for Linux and Windows at roughly the same time. However the Windows package mirrored much slower.
Edit: The Windows packages mentioned here took ~1.5hrs to mirror. This was the original build and this is the first CI build to get the package.
@barabo it makes sense to cache .conda but is it faster? Wouldn't the CDN sync usually be the first downloader of brand-new .conda packages from a cold cache? (But, we see plenty of downloads in the screenshot)
Windows is only downloaded once because I clicked the link in the web UI to download it. There were 0 downloads prior and no additional downloads until CDN sync completed ~1.5hrs after upload
Noticing this with cuda-nvcc number 1 on ppc64le (other packages uploaded at the same time are already available):
Package has been up for ~1.75hrs, but is not available from CDN (getting missing package errors when requesting it)
FYI it took >60 mins to reflect a simple channel label change: https://github.com/conda-forge/admin-requests/pull/710#issuecomment-1520547532
I want to chime in here that I am seeing CDN sync times on the conda-forge status page of over 15 minutes on a regular basis now.
@dholth and I are going to sync on this early next week. Something does seem to be going on - we'll get to the bottom of it.
Thanks Carl! 🙏
Please let us know if you need anything 🙂
We've shortened the cron interval so that updates should happen more frequently. Keep an eye on it and we'll see whether any other part of the pipeline is delayed.
The cron interval was every 10 minutes, which was fine when the job reliably ran in under 7 minutes. It recently started going over 10 for some runs, so we shortened it to 2.
I'm still looking at the logs to see if there's a way to speed it up.
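To make the interval reasoning above concrete, here is a rough sketch of the arithmetic. It assumes (this is not confirmed anywhere in the thread) that a scheduled run is skipped while the previous one is still going, so the gap between run starts is the runtime rounded up to a multiple of the cron interval. The function name effective_cycle is made up for illustration.

```shell
# Hypothetical helper: effective minutes between sync starts, ASSUMING a
# scheduled run is skipped while the previous run is still in progress.
# usage: effective_cycle RUNTIME_MIN INTERVAL_MIN
effective_cycle() {
  runtime="$1"
  interval="$2"
  # Round the runtime up to the next multiple of the cron interval.
  echo $(( (runtime + interval - 1) / interval * interval ))
}
```

Under that assumption, a 7-minute run on a 10-minute cron starts every 10 minutes, an 11-minute run on the same cron only starts every 20, and an 11-minute run on a 2-minute cron starts every 12.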
OK, I would be interested to know why the script is taking longer. AIUI there was some work in the past to cut down the script runtime pretty significantly.
Looking at this, I'm uncertain if we've come to a conclusion, @barabo do you think we can close this?
Reading Carl's last comment, my (potentially incorrect) understanding is the cron job used for mirroring is starting to take longer. The cause for this is unknown and being investigated. So not yet fully resolved
I'm not too worried about it yet. The cron job used to take 6-7 minutes, and now it sometimes takes a little longer (which would have caused a 10 minute delay in the past); but sometimes it still runs in < 10 minutes. We should try to vacuum the databases at least.
If it is reliably mirroring at 10-minute intervals, great. The issues mentioned above were when mirroring times of over 1hr were seen.
@jakirkham @leofang we're still seeing these issues with packages that were posted 23 hours ago e.g. cuda-python.
Looking into the nvidia clone worker right now. It appears to have gotten stuck 18 hours ago and needed a restart. I believe it's done updating now.
@barabo Confirmed, I see the packages now.
Is there a way we could check on sync status for a given channel? (for when we hit similar issues in the future)
I believe you can do something like this to get a sense for when a channel subdir was last updated.
curl -Is https://conda.anaconda.org/nvidia/linux-64/repodata.json | grep last-modified
last-modified: Thu, 29 Jun 2023 18:35:12 GMT
It won't work if there are no new packages in linux-64 for that channel, but if you know that's what you're looking for it should be a good test.
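If you want the age of the index rather than the raw header, the check above could be wrapped roughly as follows. This is only a sketch: it assumes GNU date (for the -d option), and the function names age_minutes and repodata_age are made up for illustration.

```shell
# Hypothetical helpers; ASSUMES GNU date (for `date -d`).

# Print how many whole minutes lie between a Last-Modified value and NOW_EPOCH.
# usage: age_minutes "Thu, 29 Jun 2023 18:35:12 GMT" [NOW_EPOCH]
age_minutes() {
  modified=$(date -u -d "$1" +%s) || return 1
  now="${2:-$(date +%s)}"
  echo $(( (now - modified) / 60 ))
}

# Fetch the header for one channel subdir and print its age in minutes.
# usage: repodata_age CHANNEL SUBDIR   (e.g. repodata_age nvidia linux-64)
repodata_age() {
  url="https://conda.anaconda.org/$1/$2/repodata.json"
  # grep -i: header-name case can vary; tr strips the trailing CR curl prints.
  value=$(curl -Is "$url" | tr -d '\r' | grep -i '^last-modified:' | cut -d' ' -f2-)
  [ -n "$value" ] || return 1
  age_minutes "$value"
}
```

Calling repodata_age nvidia linux-64 would then print how many minutes ago that subdir's repodata.json was last modified. The same caveat applies: a large number only means something if you know a new package was uploaded for that subdir.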
Can we assume that the update job takes ~10 minutes and is run every 10 minutes (as referenced earlier in the issue)?
cc @adibbley (for awareness)
conda-forge syncs every 10 minutes, but I think the nvidia channel (and a few others) only sync every 20 minutes. We can look into increasing that cadence, if necessary.
We are seeing this issue with the nvidia channel again. Could someone please take a look?
cc @raydouglass
@jakirkham we're looking into it
This was resolved at the time, closing.
What happened?
CDN sync seems to be slower than usual this week. Taking libcublas as an example: I started monitoring the status via conda search --platform linux-aarch64 libcublas after this PR was merged and the copy to the conda-forge channel was done, and as shown above it took ~47 mins for conda search to find it. IIRC the CDN sync time had previously been reduced significantly, to 15-30 mins, so this is a bit concerning.
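The manual monitoring described above could be scripted roughly like this. It is only a sketch: wait_until is a made-up helper, the 60-second default polling interval is arbitrary, and it assumes conda search exits non-zero while the package cannot be found yet.

```shell
# Hypothetical helper: re-run a command until it succeeds, then print how
# many seconds that took. POLL_SECONDS (default 60) is an arbitrary choice.
# usage: wait_until COMMAND [ARGS...]
wait_until() {
  start=$(date +%s)
  until "$@" >/dev/null 2>&1; do
    sleep "${POLL_SECONDS:-60}"
  done
  echo $(( $(date +%s) - start ))
}

# Example (commented out so that sourcing this file has no side effects):
# wait_until conda search --platform linux-aarch64 libcublas
```

Note that a bare package name already matches older builds on the CDN, so to time a specific upload the search would need to be narrowed to the new version or build.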