nextstrain / nextclade_data

Datasets for https://github.com/nextstrain/nextclade
https://clades.nextstrain.org

While doing a release, it seems the dataset that's getting released is broken (getting 404) #235

Open corneliusroemer opened 1 day ago

corneliusroemer commented 1 day ago

I've noticed that while the release is ongoing for SARS-CoV-2, the new dataset is selectable, but when you click on it you get a 404 due to README and CHANGELOG not being available.

[Two screenshots: Google Chrome, 2024-10-17 18:55:56 and 2024-10-17 18:56:11]

This seems to be only happening while the release action is running.

corneliusroemer commented 1 day ago

Now things work again, so it seems to be an issue for a short time only. Probably not worth investing much time to fix unless it's super easy.

ivan-aksamentov commented 1 day ago

Yep, there might be a window of a few seconds during which the new index.json has already been copied to S3, but the rest of the files are still being copied and their CloudFront cache invalidation has not yet completed.

index.json is currently not cached, so that it never goes stale (it plays the same role here as index.html does for web apps).

There is probably a smarter way to do atomic, zero-downtime deployments. One improvement could be to reorder the copies: copy the data files first, invalidate the cache, and only then copy index.json, once the data it points to is in place.
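
To make the reordering idea concrete, here is a minimal sketch of the ordering only. It is not the actual release action: `deploy_in_safe_order`, the `upload` and `invalidate` callbacks, and the example file paths are all hypothetical stand-ins for the real S3 copy and CloudFront invalidation steps, and the sketch assumes `invalidate` blocks until the invalidation has completed.

```python
from typing import Callable, Iterable, List

INDEX_NAME = "index.json"  # the entry point that clients fetch first


def deploy_in_safe_order(
    files: Iterable[str],
    upload: Callable[[str], None],
    invalidate: Callable[[List[str]], None],
) -> List[str]:
    """Upload data files, invalidate their cache, then publish the index last.

    `upload` and `invalidate` are hypothetical callbacks standing in for the
    real S3 copy and CloudFront invalidation steps (assumed to be blocking).
    Returns a log of the steps in the order they were executed.
    """
    paths = list(files)
    # Split on the file name so a nested ".../index.json" is also held back.
    data_files = sorted(p for p in paths if p.rsplit("/", 1)[-1] != INDEX_NAME)
    index_files = sorted(p for p in paths if p.rsplit("/", 1)[-1] == INDEX_NAME)

    log: List[str] = []

    # 1. Copy everything the index points to.
    for path in data_files:
        upload(path)
        log.append(f"upload {path}")

    # 2. Invalidate cached copies of the data files.
    if data_files:
        invalidate(data_files)
        log.append("invalidate data files")

    # 3. Only now publish the index, so it never points at missing files.
    for path in index_files:
        upload(path)
        log.append(f"upload {path}")

    return log


if __name__ == "__main__":
    # Example paths are made up for illustration.
    steps = deploy_in_safe_order(
        ["index.json", "sars-cov-2/README.md", "sars-cov-2/CHANGELOG.md"],
        upload=lambda path: None,
        invalidate=lambda paths: None,
    )
    print("\n".join(steps))
```

With this ordering, a client that fetches the new index.json should always find the README and CHANGELOG it references, at the cost of the index being published slightly later.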