zarr-developers / numcodecs

A Python package providing buffer compression and transformation codecs for use in data storage and communication applications.
http://numcodecs.readthedocs.io
MIT License

Fletcher checksum codec #410

Closed: rabernat closed this issue 1 year ago

rabernat commented 1 year ago

It would be great if numcodecs could implement the Fletcher checksum algorithm in a way that is compatible with how HDF5 does it. In the HDF5 version, the checksum is simply appended to the end of the chunk bytes. See https://github.com/fsspec/kerchunk/pull/274 for a dummy passthrough version.
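To make the request concrete, here is a minimal pure-Python sketch of the "append the checksum to the chunk bytes" idea. The function names (`fletcher32`, `encode_with_checksum`, `decode_with_checksum`) are hypothetical and are not part of numcodecs. The byte-order choices (big-endian 16-bit words, a zero-padded odd trailing byte, and a 4-byte little-endian checksum appended to the chunk) are assumptions that have not been checked against HDF5 output, so they would need to be verified against the HDF5 C source before relying on this for compatibility.

```python
import struct


def fletcher32(data: bytes) -> int:
    """Fletcher-32 over 16-bit words, with end-around-carry folding.

    Assumptions (not verified against HDF5): words are read big-endian,
    and an odd trailing byte is treated as the high byte of a final word.
    """
    if len(data) % 2:
        data = data + b"\x00"  # pad so the last byte becomes the high byte
    sum1 = sum2 = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        sum1 += word
        sum2 += sum1
        # Fold the carry back in so both sums stay small.
        sum1 = (sum1 & 0xFFFF) + (sum1 >> 16)
        sum2 = (sum2 & 0xFFFF) + (sum2 >> 16)
    # Final fold to guarantee each sum fits in 16 bits.
    sum1 = (sum1 & 0xFFFF) + (sum1 >> 16)
    sum2 = (sum2 & 0xFFFF) + (sum2 >> 16)
    return (sum2 << 16) | sum1


def encode_with_checksum(chunk: bytes) -> bytes:
    """Return the chunk with a 4-byte checksum appended (little-endian assumed)."""
    return chunk + struct.pack("<I", fletcher32(chunk))


def decode_with_checksum(buf: bytes) -> bytes:
    """Strip and verify the trailing checksum, returning the original chunk."""
    if len(buf) < 4:
        raise ValueError("buffer too short to hold a checksum")
    chunk, stored = buf[:-4], struct.unpack("<I", buf[-4:])[0]
    if fletcher32(chunk) != stored:
        raise RuntimeError("Fletcher-32 checksum mismatch")
    return chunk
```

A real codec would presumably wrap these two functions in the `encode`/`decode` methods of a numcodecs codec class and move the inner loop to C or Cython for speed; the per-word Python loop here is only meant to show the data layout.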

There are some Python implementations here:

martindurant commented 1 year ago

Some (fast) C implementations are mentioned in https://stackoverflow.com/questions/40270450/correctness-of-fletcher32-checksum-algorithm (they should port well to Cython)

rabernat commented 1 year ago

And just for completeness, here is the hdf5 version as vendored by netcdf4:

https://github.com/Unidata/netcdf-c/blob/8eb71290eb9360dcfd4955ba94759ba8d02c40a9/plugins/H5checksum.c