Open peterdudfield opened 1 year ago
I would try the new ocf_blosc2 Blosc2 ZSTD one, as that gives better results than the original Blosc Zstd.
Just so it's super easy to do, do you have a link to some code where you save it using ocf_blosc2?
Yeah, it would be something like this, which I use for the new satellite zarrs:
from ocf_blosc2.ocf_blosc2 import Blosc2

def write_to_zarr(dataset, zarr_name, mode, chunks):
    mode_extra_kwargs = {
        "a": {"append_dim": "time"},
        "w": {
            "encoding": {
                "data": {
                    "compressor": Blosc2("zstd", clevel=5),
                },
                "time": {"units": "nanoseconds since 1970-01-01"},
            }
        },
    }
    extra_kwargs = mode_extra_kwargs[mode]
    dataset.chunk(chunks).to_zarr(
        zarr_name, compute=True, **extra_kwargs, consolidated=True, mode=mode
    )
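The mode handling in the snippet above could be pulled into a small helper, so that an unsupported mode fails with a clear message rather than a bare KeyError. This is only a sketch: `build_to_zarr_kwargs` is a hypothetical name (not part of ocf_blosc2 or xarray), and the compressor is passed in by the caller, for example `Blosc2("zstd", clevel=5)`.

```python
from typing import Any, Dict


def build_to_zarr_kwargs(mode: str, compressor: Any) -> Dict[str, Any]:
    """Return the extra keyword arguments for Dataset.to_zarr for a given mode.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if mode == "a":
        # Appending: extend the existing store along time; no encoding is passed.
        return {"append_dim": "time"}
    if mode == "w":
        # Writing from scratch: set the compressor and the time units once.
        return {
            "encoding": {
                "data": {"compressor": compressor},
                "time": {"units": "nanoseconds since 1970-01-01"},
            }
        }
    raise ValueError(f"Unsupported mode {mode!r}; expected 'a' or 'w'")
```

The result would be unpacked into the existing call, e.g. `dataset.chunk(chunks).to_zarr(zarr_name, compute=True, consolidated=True, mode=mode, **build_to_zarr_kwargs(mode, Blosc2("zstd", clevel=5)))`.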
Updated deployed NWP consumer (to version 1.2.2):
Updated forecaster to include OCF blosc 2 library (to version 1.3.11):
Great work @devsjc, what size did the NWP go down to using this method?
Not quite there yet, I'll let you know!
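For comparing sizes before and after the compressor change, something like the following could work. It is a rough sketch that assumes the Zarr store is a plain local directory (a store on S3 would need a different tool), and both function names are made up for illustration.

```python
import os


def store_size_bytes(path: str) -> int:
    """Sum the sizes of all files under a local Zarr directory store."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def compression_saving(old_bytes: int, new_bytes: int) -> float:
    """Fraction of space saved; 0.25 means the new store is 25% smaller."""
    if old_bytes <= 0:
        raise ValueError("old_bytes must be positive")
    return 1.0 - new_bytes / old_bytes
```

Running `store_size_bytes` on the old and new stores and passing both numbers to `compression_saving` gives a single figure to report back here.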
@peterdudfield which repository holds the national model? I will need to update that to be able to read the newly compressed NWP data as well.
Seeing the following error in the forecaster in cloudwatch:
expected shape=(7, 24, 24, 11) actual shape (4, 24, 24, 11)
Latest version does not seem to show the same error in cloudwatch
Wrong shape error has occurred again. @peterdudfield is this an expected error for the forecaster?
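To narrow down the shape error, a small check like this could report which axis disagrees. It is only an illustrative sketch: `mismatched_axes` is a hypothetical helper, not forecaster code, and the thread does not say what each axis represents.

```python
from typing import List, Sequence, Tuple


def mismatched_axes(
    expected: Sequence[int], actual: Sequence[int]
) -> List[Tuple[int, int, int]]:
    """Return (axis, expected_size, actual_size) for every axis that differs."""
    if len(expected) != len(actual):
        raise ValueError(
            f"Rank mismatch: expected {len(expected)} dims, got {len(actual)}"
        )
    return [
        (axis, e, a)
        for axis, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


# Shapes taken from the cloudwatch error above.
print(mismatched_axes((7, 24, 24, 11), (4, 24, 24, 11)))  # [(0, 7, 4)]
```

With the shapes from the error, only the first axis differs (7 expected, 4 received); the other three match.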
NWP task is exiting with an out-of-memory error. It must be the case that compression with Blosc2 takes more memory, as that's the only change that has been implemented in that container. Increasing the memory:
Dev:
pr: https://github.com/openclimatefix/ocf-infrastructure/pull/251
tf: https://app.terraform.io/app/openclimatefix/workspaces/nowcasting_infrastructure_development-eu-west-1/runs/run-gkqyNt5GZj8vQpKX
Prod:
I just rolled back nwp from 1.2.2 to 1.2.0 on development, as it was causing an issue there.
Detailed Description
Would be great to reduce the size of the NWP data. I think this could be done with better compression.
Context
Possible Implementation
Use Blosc Zstd compression.