alimanfoo closed this 1 month ago
Thanks @ahernank and @leehart. Just to check, is all the data available in the vo_afun_release_master_us_central1 bucket now? I.e., OK to merge this PR?
Yes, @alimanfoo. The data from Af1.x has been copied to vo_afun_release_master_us_central1.
Unrelated to this PR, we still need to delete the non-GT data from the release bucket.
> we still need to delete the non-GT data from the release bucket.
@ahernank I think we've now decided to not delete the non-GT data from the Af release bucket, and hence the Zarr metadata and this package will not need updating. With that change in mind, I've actually restored the Af1.4 non-GT data to the release bucket. Obviously, this is inconsistent with our current Ag3.x release bucket and our SNP Data Release process. See https://github.com/malariagen/vector-data-processing/issues/36
Thanks both. Surfacing discussion from this morning: we may decide to deprecate the multi-region buckets, so I will leave this PR open for now.
Looking at this PR again, I think I'd be in favour of merging this anyway, as it includes some improvements to the logic around when we check the client location and how the colab location check works.
If we then deprecate the multi-region buckets we could make a subsequent PR to simplify back down.
This PR adds automatic detection of GCP region, and selects a URL for storage in the same region if available.
This should reduce some network usage costs where we are accessing data from us-central1.
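
For readers unfamiliar with the approach, here is a minimal sketch of how region detection and URL selection could work. It is not the code in this PR. It assumes the client runs on a GCP VM and queries the standard Compute Engine metadata server for its zone. The names `region_from_zone`, `detect_gcp_region`, `select_url`, `REGION_URLS` and the default bucket URL `gs://vo_afun_release/` are hypothetical and used only for illustration; only the us-central1 bucket name comes from the discussion above.

```python
import urllib.request
from typing import Dict, Optional

# Standard Compute Engine metadata endpoint for the instance zone.
METADATA_ZONE_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/zone"
)

# Hypothetical mapping from GCP region to a same-region bucket URL.
REGION_URLS: Dict[str, str] = {
    "us-central1": "gs://vo_afun_release_master_us_central1/",
}

# Assumed fallback URL, used when no same-region bucket is known.
DEFAULT_URL = "gs://vo_afun_release/"


def region_from_zone(zone_path: str) -> str:
    """Convert 'projects/123/zones/us-central1-a' to 'us-central1'."""
    zone = zone_path.strip().rsplit("/", 1)[-1]  # e.g. 'us-central1-a'
    return zone.rsplit("-", 1)[0]  # drop the trailing zone letter


def detect_gcp_region(timeout: float = 1.0) -> Optional[str]:
    """Return the GCP region of this machine, or None if not on GCP."""
    request = urllib.request.Request(
        METADATA_ZONE_URL, headers={"Metadata-Flavor": "Google"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return region_from_zone(response.read().decode("utf-8"))
    except OSError:
        # Covers DNS failures, timeouts and HTTP errors off GCP.
        return None


def select_url(
    region: Optional[str],
    region_urls: Dict[str, str] = REGION_URLS,
    default_url: str = DEFAULT_URL,
) -> str:
    """Pick a same-region URL if one is available, else the default."""
    return region_urls.get(region, default_url)
```

Typical usage would be `url = select_url(detect_gcp_region())`. Falling back to the default URL whenever detection fails keeps behaviour unchanged for clients outside GCP.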