opensciencegrid / StashCache

https://opensciencegrid.org/docs/data/stashcache/overview/
Apache License 2.0

Checksum files after stashcp #29

Open bbockelm opened 7 years ago

bbockelm commented 7 years ago

If CVMFS is accessible but the transfer from CVMFS fails, we should at least take advantage of the hash it exposes:

# attr -qg hash /cvmfs/nova.osgstorage.org/flux/g4numi/v6/me000z200i/g4numiv6_minervame_me000z200i_18_0006.root
aa1c8b1ef136647a082146a890b7e283f5e6a753
# sha1sum /cvmfs/nova.osgstorage.org/flux/g4numi/v6/me000z200i/g4numiv6_minervame_me000z200i_18_0006.root
aa1c8b1ef136647a082146a890b7e283f5e6a753  /cvmfs/nova.osgstorage.org/flux/g4numi/v6/me000z200i/g4numiv6_minervame_me000z200i_18_0006.root

After transferring with xrdcp, we should validate the checksum if it is available from CVMFS.
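
A minimal sketch of what that validation could look like, not the actual stashcp implementation. It assumes the hash shown by `attr -qg hash` is stored in the `user.hash` extended attribute and is a SHA-1 digest, as in the output above; the function names `read_cvmfs_hash`, `sha1_of_file`, and `verify_download` are hypothetical.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_cvmfs_hash(cvmfs_path):
    """Return the hash CVMFS publishes for a file, or None if unavailable.

    Assumption: `attr -qg hash` corresponds to the `user.hash` xattr.
    """
    try:
        return os.getxattr(cvmfs_path, "user.hash").decode().strip()
    except (OSError, AttributeError):
        # OSError: file missing or attribute not present.
        # AttributeError: os.getxattr is not available on this platform.
        return None


def verify_download(local_path, expected_hash):
    """Compare a downloaded file against the expected hash.

    Returns True on match, False on mismatch, and None when no hash
    is available to compare against.
    """
    if expected_hash is None:
        return None
    return sha1_of_file(local_path) == expected_hash.strip().lower()
```

After a successful xrdcp, a caller would pass the local copy and `read_cvmfs_hash(<path under /cvmfs>)` to `verify_download` and skip the check when the result is `None`.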

djw8605 commented 7 years ago

I actually ran into a corner case for this.

  1. My tests create a new file with the same name as an old file.
  2. The hash in CVMFS was incorrect, which actually caused the transfer from CVMFS to fail (Input/Output Error).
  3. When transferring with xrdcp, the correct file was copied, but the checksum in CVMFS would have been that of the old file.

Therefore, if a user tries to copy a file that has the same name as an old file but new content, this check would fail. As a matter of fact, if the transfer from CVMFS fails but the file is in the CVMFS namespace, we shouldn't trust CVMFS at all: it likely caught the same error.

bbockelm commented 7 years ago

I think this at least merits a warning in stdout!
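
One way to reconcile the two positions above, sketched under assumptions: treat a mismatch as a warning printed to stdout rather than a failed transfer, since the CVMFS hash may describe an older file of the same name. The names `checksum_status` and `report_checksum` are hypothetical and not part of stashcp.

```python
import hashlib


def sha1_hex(path, chunk_size=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_status(local_path, cvmfs_hash):
    """Classify the comparison as 'unavailable', 'match', or 'mismatch'."""
    if not cvmfs_hash:
        return "unavailable"
    if sha1_hex(local_path) == cvmfs_hash.strip().lower():
        return "match"
    return "mismatch"


def report_checksum(local_path, cvmfs_hash):
    """Warn on stdout when the hashes differ, but do not fail the transfer."""
    status = checksum_status(local_path, cvmfs_hash)
    if status == "mismatch":
        # The CVMFS catalog may be stale, so this is advisory only.
        print(
            f"WARNING: checksum of {local_path} does not match the hash "
            "recorded in CVMFS; the CVMFS entry may be out of date."
        )
    return status
```

Returning the status instead of raising keeps the decision with the caller, which could still choose to fail hard once the CVMFS hash is known to be trustworthy.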