natefoo closed this 3 years ago
But maybe we disable the 'delete' step of the verifier just for a few days to make sure it works? That's the only unsafe portion, and disabling it would be sufficient risk mitigation.
Do you mean the call to cleanup_file in bin/process_urls.py? Or somewhere else?
Thanks for the review!
I forgot to push a commit I made yesterday that re-adds requiring a hash and skips downloading if a hash isn't specified.
yes, precisely. specifically this one: https://github.com/galaxyproject/cargo-port/blob/master/bin/process_urls.py#L82
(or maybe we hide it behind an environment variable CARGO_PORT_SAFE_MODE=true or so.)
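For what it's worth, the environment-variable idea could be sketched roughly as below. This is not the actual code in bin/process_urls.py: safe_mode_enabled and maybe_cleanup are hypothetical helper names, and the set of accepted truthy values is an assumption. Only the variable name CARGO_PORT_SAFE_MODE comes from the suggestion above.

```python
import os


def safe_mode_enabled(environ=None):
    """Return True when CARGO_PORT_SAFE_MODE is set to a truthy value.

    The accepted values ("1", "true", "yes") are an assumption for this sketch.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("CARGO_PORT_SAFE_MODE", "")
    return value.strip().lower() in ("1", "true", "yes")


def maybe_cleanup(path, environ=None):
    """Delete `path` unless safe mode is on.

    Hypothetical stand-in for the delete step; returns True only if a file
    was actually removed.
    """
    if safe_mode_enabled(environ):
        # Safe mode: leave the file in place so nothing is lost by mistake.
        return False
    if os.path.exists(path):
        os.remove(path)
        return True
    return False
```

With this shape, the Jenkins job would only need to export CARGO_PORT_SAFE_MODE=true for a few days, and unsetting it restores the normal delete behavior.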
My only concern there is that it may leave behind stuff I will have to find and clean up (although I am not clear on whether that fetch goes to a staging area rather than the actual live depot, since I didn't dig into it past this step). Also, if a file is fetched but fails hash verification, it will continue to exist, and subsequent runs won't attempt to fetch it again because of the conditional above.
that's fair. I'm not sure either.
Thanks for looking into this; what is the time frame for deployment?
@natefoo is that enough to get it running?
I believe the Jenkins job that runs this will use the merged version so this should hopefully take effect on tomorrow's run.
This change makes it possible to verify against an MD5 hash instead of a SHA256. It also will automatically use MD5 hashes from bioconda recipes if a sha256 hash is not specified in the recipe. It will also refuse to download an archive if a hash is not provided in the TSV.
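To illustrate the behavior described above, here is a minimal sketch of verifying a downloaded file against either hash type. It is not the code in this PR: verify_file and its parameters are hypothetical names, and it assumes the expected hash is a hex string as it would appear in the TSV.

```python
import hashlib


def verify_file(path, expected_hash, algorithm="sha256"):
    """Return True if the file at `path` matches `expected_hash`.

    `algorithm` is "sha256" or "md5". An empty or missing expected hash is
    rejected outright, mirroring the "refuse without a hash" rule.
    """
    if not expected_hash:
        raise ValueError("no hash provided; refusing to verify")
    if algorithm not in ("sha256", "md5"):
        raise ValueError("unsupported hash algorithm: %s" % algorithm)

    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large archives are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hash.strip().lower()
```

Raising on an empty hash, rather than returning False, keeps "no hash given" distinguishable from "hash mismatch" in the logs.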
Automatic mirroring via Bioconda recipes is broken because the stored hash (an empty string) does not match the actual download's hash. At first I thought this was a problem only for Bioconductor packages, because the current Bioconductor package recipes include only an md5 and not a sha256 in their meta.yaml files, but then I discovered that the conda-to-cargo-port scripts never actually write the sha256 hash to check against. So I can't figure out how the bioconda mirroring ever worked for any package. If someone knows what is going on, that would be nice, since I'm not confident my changes here aren't going to break anything.
The hash verification failure is the cause of the issue reported in galaxyproject/galaxy-hub#721. Right now the daily Jenkins job that runs to update cargo-port pulls all of the new versions of Bioconductor packages but fails to store them because of the hash verification failure, then repeats again the next day:
This change should fix that error. Note that non-Bioconductor packages and anything else that does not fetch from Git (the process for which does not do hash verification) are failing similarly. So again, I have no idea how this ever worked for Bioconda packages.
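The sha256-then-md5 fallback for recipes could look roughly like the sketch below. It assumes the recipe's source section has already been rendered and parsed into a dict with optional sha256 and md5 keys; pick_hash is a hypothetical name, not a function from the conda-to-cargo-port scripts.

```python
def pick_hash(source):
    """Choose which hash to record for a recipe's source entry.

    `source` is assumed to be the already-parsed `source` section of a
    meta.yaml. Prefers sha256, falls back to md5, and returns None when
    neither is present so the caller can skip the download.
    """
    for algorithm in ("sha256", "md5"):
        value = source.get(algorithm)
        if value:
            return algorithm, str(value)
    return None
```

Returning the algorithm name alongside the value lets the TSV record which kind of hash is stored, so the verifier never has to guess from the string length.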