bcdev / zappend

Robustly creating and updating Zarr data cubes from smaller subsets
https://bcdev.github.io/zappend/
MIT License

Tiles are missing when applying zappend to zarr files stored in s3 #104

Open konstntokas opened 1 week ago

Describe the bug

zappend was used to append multiple Zarr datasets along the time axis. The resulting cube contains seemingly random blocks filled with NaN values. When the run is repeated, different blocks are missing each time.

[Screenshot from 2024-10-28 09-12-43]
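One way to narrow this down is to check whether the NaN blocks correspond to chunk objects that are absent from the target store, since Zarr returns the fill value for chunks that were never written. The following is only a minimal sketch, assuming the Zarr v2 layout with the default `.` dimension separator; `expected_chunk_keys`, `missing_chunk_keys`, and the example listing are hypothetical and not part of zappend.

```python
import itertools
import math


def expected_chunk_keys(shape, chunks, separator="."):
    """Return the chunk keys a fully written Zarr v2 array would contain."""
    # Number of chunks along each dimension (the last chunk may be partial).
    counts = [math.ceil(size / chunk) for size, chunk in zip(shape, chunks)]
    return {
        separator.join(str(i) for i in index)
        for index in itertools.product(*(range(n) for n in counts))
    }


def missing_chunk_keys(shape, chunks, present_keys, separator="."):
    """Return the sorted chunk keys that are expected but not in the listing."""
    expected = expected_chunk_keys(shape, chunks, separator)
    return sorted(expected - set(present_keys))


# Toy example: a (time, lat, lon) array of shape (4, 2, 2) with chunks (2, 1, 2).
# `present` stands in for a listing of the array's directory in the bucket.
present = ["0.0.0", "0.1.0", "1.0.0"]
print(missing_chunk_keys((4, 2, 2), (2, 1, 2), present))
```

If the reported keys change from run to run, that would suggest chunk writes are being lost rather than NaN values being written, but this is an assumption to verify against the actual store.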

To Reproduce

A data ID looks like cubes/aux/era5_small/2016_11.zarr.

[Screenshot from 2024-10-28 09-15-25]

The following zappend config was used.

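import os

from zappend.api import zappend

# Assumption: open_zarr used below is xarray.open_zarr (or a wrapper around it);
# the report does not show its import. data_ids and data_id_era5 are defined elsewhere.

config = {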
    "target_dir": f"s3://{os.environ['S3_USER_STORAGE_BUCKET']}/{data_id_era5}",
    "target_storage_options": {
        "key": os.environ["S3_USER_STORAGE_KEY"],
        "secret": os.environ["S3_USER_STORAGE_SECRET"],
    },
    "slice_storage_options": {
        "key": os.environ["S3_USER_STORAGE_KEY"],
        "secret": os.environ["S3_USER_STORAGE_SECRET"],
    },
    "force_new": True,
    "disable_rollback": True,
    "logging": "DEBUG",
}

zappend(
    (
        open_zarr(f"s3://{os.environ['S3_USER_STORAGE_BUCKET']}/{data_id}")
        for data_id in data_ids[3]
    ),
    config=config,
)

Expected behavior

zappend should simply append the data cubes along the given dimension without errors.

Python Environment