PGScatalog / pgscatalog_utils

(superseded by pygscatalog) Utilities for working with PGS Catalog API and scoring files
Apache License 2.0

Add MD5 checksum validation #37

Closed ens-lgil closed 1 year ago

ens-lgil commented 1 year ago

I'm wondering whether we should add an option to skip these MD5 checks.

smlmbrt commented 1 year ago

Because it's slow, or just in case people want to ignore it?

Also, I wonder if we should incorporate the md5 check here too: https://github.com/PGScatalog/pgscatalog_utils/blob/e220f141b8b73d578ebb9906f31d9abca8857790/pgscatalog_utils/download/download_scorefile.py#L76-L78

ens-lgil commented 1 year ago

It shouldn't be slow, but maybe some people won't want to bother with it for the small scoring files. Anyway, this is something that can be added later if needed.

Regarding the MD5 check to see whether the file has been updated, I was also thinking of adding it where you pointed. I just wanted to have the first type of check (i.e. integrity of the downloaded file) done and validated before implementing the second part. For the second part, we might want to ask users whether they want to overwrite their "old" scores if a newer version is available, no?
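To make the two kinds of check concrete, here is a minimal sketch using only the Python standard library. It is not the code in pgscatalog_utils: the function names `md5_of_file`, `is_intact` and `needs_update` are hypothetical, and it assumes the expected checksum is already available as a hex string (how it is fetched is not covered here).

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_md5):
    """First check: does the downloaded file match the published checksum?"""
    return md5_of_file(path) == expected_md5.strip().lower()


def needs_update(local_path, remote_md5):
    """Second check: is the local copy missing or different from the remote one?"""
    local = Path(local_path)
    if not local.exists():
        return True
    return md5_of_file(local) != remote_md5.strip().lower()
```

In this sketch a caller would run `is_intact` right after each download, and use `needs_update` before downloading to decide whether to prompt the user about overwriting an existing score.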

smlmbrt commented 1 year ago

That makes sense to me - no harm in adding a --ignore_md5 flag or something.
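For illustration, an opt-out flag of this kind could be wired up with `argparse` roughly as follows. Only the flag name `--ignore_md5` comes from this thread; the parser description and the helper names `build_parser` and `should_validate` are assumptions made for the sketch, not the tool's actual command-line interface.

```python
import argparse


def build_parser():
    """Build a hypothetical parser exposing an opt-out for checksum validation."""
    parser = argparse.ArgumentParser(
        description="Download scoring files (illustrative sketch only)"
    )
    parser.add_argument(
        "--ignore_md5",
        action="store_true",
        help="skip MD5 checksum validation of downloaded files",
    )
    return parser


def should_validate(args):
    """Validation stays on by default and is disabled only when the flag is given."""
    return not args.ignore_md5
```

Keeping validation on by default means users who never pass the flag still get the integrity check, which matches the idea of the flag being an escape hatch rather than the normal path.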