seung-lab / fpzip

Cython bindings for fpzip, a floating point image compression algorithm.
BSD 3-Clause "New" or "Revised" License

Trailing Zeros in Compression #6

Closed by william-silversmith 5 years ago

william-silversmith commented 5 years ago

The following script, which uses geoscience data, produces an erroneous compressed stream that ends in extensive zero padding. Decompressing that stream does not reproduce the original data.

Original discussion here: https://github.com/zarr-developers/zarr/issues/307

import fpzip
import numpy as np
import gcsfs
import pandas as pd
import xarray as xr

gcs = gcsfs.GCSFileSystem(project='pangeo-181919', token='anon')
# ds_ssh = xr.open_zarr(gcsfs.GCSMap('pangeo-data/dataset-duacs-rep-global-merged-allsat-phy-l4-v3-alt',
#                                gcs=gcs))

ds_llc_sst = xr.open_zarr(gcsfs.GCSMap('pangeo-data/llc4320_surface/SST',
                               gcs=gcs), auto_chunk=False)
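# Load a block of sea surface temperature (5 time steps, face 1) into memory.
data = ds_llc_sst.SST[:5, 1, 800:800+720, 1350:1350+1440].values

# Compress and print the result; the output ends in a long run of zero bytes.
x = fpzip.compress(data)
print(x)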
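
To make the two symptoms described above measurable rather than eyeballed from the printed bytes, something like the following sketch may help. It is not part of fpzip: the helper names `count_trailing_zero_bytes` and `roundtrip_matches` are hypothetical, and `zlib` is used only as a stand-in codec so the example runs without fpzip or the remote dataset.

```python
import struct
import zlib


def count_trailing_zero_bytes(buf: bytes) -> int:
    """Return the number of consecutive 0x00 bytes at the end of buf."""
    return len(buf) - len(buf.rstrip(b"\x00"))


def roundtrip_matches(raw: bytes, compress, decompress) -> bool:
    """Return True if decompress(compress(raw)) reproduces raw exactly.

    compress and decompress are any pair of bytes -> bytes callables
    (hypothetical interface chosen for this sketch).
    """
    return decompress(compress(raw)) == raw


# Stand-in payload: 1000 float32 values packed into raw bytes.
payload = struct.pack("<1000f", *(0.5 * i for i in range(1000)))

# zlib is only a placeholder for the codec under test.
stream = zlib.compress(payload)
print("compressed size:", len(stream))
print("trailing zero bytes:", count_trailing_zero_bytes(stream))
print("round trip ok:", roundtrip_matches(payload, zlib.compress, zlib.decompress))
```

In the script above, `count_trailing_zero_bytes(x)` could be applied directly to the bytes returned by `fpzip.compress(data)`. The round-trip check would need the decompressed result compared against `data` as an array (for example with `numpy.array_equal` after reshaping), since `fpzip.decompress` is expected to return a NumPy array rather than bytes; treat that detail as an assumption to verify against the installed version.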