populationgenomics / analysis-runner

MIT License

Add new script to zip GCS file trees and upload them to Zenodo #709

Closed jmarshall closed 3 weeks ago

jmarshall commented 3 weeks ago

The script is run via analysis-runner so that it has direct access to our buckets. It zips each file tree using Python's zipfile library and uploads the archives directly to a (pre-made) Zenodo deposit via Zenodo's API.
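For illustration only, here is a minimal sketch of the zip-then-upload flow, not the script in this PR. It assumes file contents have already been read into memory as `(name, bytes)` pairs (in real use they would come from the bucket), and that the deposit's file bucket URL and an access token are known. The function names `zip_members` and `build_upload_request` are hypothetical, and the request is only constructed, never sent.

```python
import io
import urllib.parse
import urllib.request
import zipfile
from typing import Iterable, Tuple


def zip_members(members: Iterable[Tuple[str, bytes]], out) -> int:
    """Write (archive_name, content) pairs into a zip archive; return the member count."""
    count = 0
    # ZIP_DEFLATED compresses each member; ZIP64 extensions are enabled by default,
    # so archives larger than 4 GiB are allowed.
    with zipfile.ZipFile(out, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in members:
            zf.writestr(name, data)
            count += 1
    return count


def build_upload_request(
    bucket_url: str, filename: str, body: bytes, token: str
) -> urllib.request.Request:
    """Build (but do not send) a PUT request to a deposit's file bucket URL.

    Assumption: bucket_url is the bucket link reported for the pre-made deposit.
    """
    url = f"{bucket_url.rstrip('/')}/{urllib.parse.quote(filename)}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )


if __name__ == "__main__":
    buffer = io.BytesIO()
    n = zip_members([("tree/a.txt", b"alpha"), ("tree/b.txt", b"beta")], buffer)
    print(f"zipped {n} files into {buffer.tell()} bytes")
```

Building the archive in a `BytesIO` keeps the sketch self-contained; for archives of several gigabytes a temporary file on disk would be the safer choice.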

Context: https://centrepopgen.slack.com/archives/C018KFBCR1C/p1730142264294839?thread_ts=1728943125.475979&cid=C018KFBCR1C

jmarshall commented 3 weeks ago

@dancoates It's a good point. In this case, there are ~15,000 files going into each zip archive, for a total of almost 400,000 files (and most files are ~200K apiece). I don't have much test data in a test bucket, so I guess the thing to do will be to merge it and see what happens… 🤞
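As a rough back-of-the-envelope check on those figures, the arithmetic below estimates the scale involved. It treats the quoted numbers as exact and assumes "200K" means 200,000 bytes, so the results are order-of-magnitude estimates only.

```python
import math

FILES_PER_ARCHIVE = 15_000   # "~15,000 files going into each zip archive"
TOTAL_FILES = 400_000        # "almost 400,000" in total
BYTES_PER_FILE = 200_000     # assumption: "~200K" read as 200,000 bytes

# Number of archives needed if every archive holds the same number of files.
archives = math.ceil(TOTAL_FILES / FILES_PER_ARCHIVE)

# Uncompressed input size per archive and overall, in decimal gigabytes.
per_archive_gb = FILES_PER_ARCHIVE * BYTES_PER_FILE / 1e9
total_gb = TOTAL_FILES * BYTES_PER_FILE / 1e9

print(f"{archives} archives, ~{per_archive_gb:.0f} GB each, ~{total_gb:.0f} GB total")
```

These are uncompressed sizes; the actual archive sizes depend on how well the files compress.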