MIT-LCP / eicu-code

Code and website related to the eICU Collaborative Research Database
https://eicu-crd.mit.edu
MIT License

checksums do not match for .csv.gz files #129

Closed: horwitzr closed this issue 4 years ago

horwitzr commented 4 years ago

The checksums for the following files do not match those listed at https://physionet.org/content/eicu-crd/2.0/SHA256SUMS.txt: apacheApsVar.csv.gz, apachePredVar.csv.gz, carePlanEOL.csv.gz, respiratoryCare.csv.gz, vitalAperiodic.csv.gz, vitalPeriodic.csv.gz

I downloaded all of the data from https://physionet.org/content/eicu-crd/2.0/ using the "Download the Zip file" link. When I computed a checksum on my Mac using

md5 apacheApsVar.csv.gz

I get "MD5 (apacheApsVar.csv.gz) = bac59eb77b3635c0774076e4e5df3506", which does not match the listed value "4dfa6d3d36c3f9a1853027e0ca84668cd6dd4dfae24c330e0ceae324cbfc41b4".

The same occurs for the other five files, as well.

jraffa commented 4 years ago

Hi @horwitzr,

I think you're using MD5 hashes, and the hashes on PhysioNet are SHA-256. Try running:

sha256sum apacheApsVar.csv.gz

I'm not sure whether OS X uses a different command than Linux (maybe sha256?), but that works on Linux. I just checked the file above and it matches.

Jesse
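For readers who would rather not depend on which checksum command a given platform ships, the same check can be sketched in Python. This is a minimal sketch, not part of eicu-code: it assumes SHA256SUMS.txt uses the usual sha256sum layout (hex digest, whitespace, file name) and sits in the same directory as the downloaded files, and the helper names sha256_of, parse_sums, and verify are hypothetical. As a quick sanity check, an MD5 digest is 32 hex characters while a SHA-256 digest is 64, which is the mismatch visible in the values quoted above.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large .csv.gz files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse 'HEXDIGEST  filename' lines (assumed layout) into {filename: digest}."""
    sums = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary mode with a leading '*' on the file name.
            sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def verify(data_dir, sums_file="SHA256SUMS.txt"):
    """Return {filename: bool} for every listed file that exists in data_dir."""
    data_dir = Path(data_dir)
    expected = parse_sums((data_dir / sums_file).read_text())
    return {
        name: sha256_of(data_dir / name) == digest
        for name, digest in expected.items()
        if (data_dir / name).is_file()
    }
```

Calling verify on the directory holding the download returns a dictionary; any file mapped to False has a digest that differs from the listed one.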

horwitzr commented 4 years ago

Hi @jraffa,

Thank you so much; you are correct. They do match now. However, I'm having some problems unzipping those files. When I double-click on a file such as apacheApsVar.csv.gz in Finder, I get an error message that says, "Unable to expand apacheApsVar.csv.gz into eicu-collaborative-research-database-2.0. (Error 79 - Inappropriate file type or format.)"

My original problem could have been summarized by saying that those initial six files were not unzipping, although the others were. The solution was to use gunzip rather than double-clicking on them in Finder.
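For anyone who wants to script the decompression instead of running gunzip by hand, a minimal Python sketch using the standard gzip module is shown here. The function name gunzip_file is hypothetical, and unlike gunzip this sketch leaves the original .gz file in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dest=None):
    """Decompress a .gz file; by default write next to it without the .gz suffix."""
    src = Path(src)
    if dest is None:
        if src.suffix != ".gz":
            raise ValueError(f"expected a .gz file, got {src.name}")
        # apacheApsVar.csv.gz -> apacheApsVar.csv
        dest = src.with_suffix("")
    dest = Path(dest)
    # Stream the decompressed bytes so large tables are not loaded into memory.
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

For example, gunzip_file("apacheApsVar.csv.gz") would write apacheApsVar.csv in the same directory and return its path.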