simonsobs / librarian

The HERA Librarian.
BSD 2-Clause "Simplified" License

Use xxh3 for checksumming #74

Closed JBorrow closed 2 months ago

JBorrow commented 3 months ago

Our high-speed transfers are actually limited by the speed at which we can checksum files on the downstream end. This is, at the moment, done in serial.

MD5 was designed as a cryptographic hash, not as a high-speed integrity check. Here, I've kept MD5 supported (so we do not need to re-checksum all files ingested into older librarians), but new checksums will now use xxh3 by default, which is significantly faster.
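As a rough illustration of how old MD5 checksums and new xxh3 ones could coexist, here is a minimal sketch. It is not the librarian's actual code: the `algorithm:digest` prefix format, the function names `checksum_file` and `verify_file`, the chunk size, and the rule that an unprefixed digest means legacy MD5 are all assumptions made for the example. It relies on the third-party `xxhash` package for xxh3 and on `hashlib` for MD5.

```python
import hashlib

try:
    import xxhash  # third-party package: pip install xxhash
except ImportError:
    xxhash = None  # xxh3 is unavailable without it; md5 still works

# Read files in chunks so large files never have to fit in memory.
CHUNK_SIZE = 4 * 1024 * 1024


def _new_hasher(algorithm):
    """Return a fresh hasher object for the named algorithm."""
    if algorithm == "md5":
        return hashlib.md5()
    if algorithm == "xxh3":
        if xxhash is None:
            raise RuntimeError("xxh3 requested but xxhash is not installed")
        return xxhash.xxh3_128()
    raise ValueError(f"unknown checksum algorithm: {algorithm}")


def checksum_file(path, algorithm="xxh3"):
    """Checksum a file and return a string of the form 'algorithm:hexdigest'.

    The prefix is a hypothetical convention for this sketch.
    """
    hasher = _new_hasher(algorithm)
    with open(path, "rb") as handle:
        while chunk := handle.read(CHUNK_SIZE):
            hasher.update(chunk)
    return f"{algorithm}:{hasher.hexdigest()}"


def verify_file(path, stored):
    """Check a file against a stored checksum, using whichever algorithm made it.

    Assumption: a stored value with no prefix is a legacy bare MD5 digest.
    """
    algorithm, separator, digest = stored.partition(":")
    if not separator:
        algorithm, digest = "md5", stored
    return checksum_file(path, algorithm) == f"{algorithm}:{digest}"
```

With something like this, files ingested under MD5 keep verifying against their stored digests, while anything newly checksummed picks up xxh3 through the default argument.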

I've also told Globus to do its own checksumming. If a file is corrupted in transit, we may as well get the re-transferred version 'for free' from Globus, without having to go around again in the librarian.