openclimatefix / nwp

Tools for downloading and processing numerical weather predictions
MIT License

Sometimes the channel labelled "si10" (10 meter wind speed) has data from the `hcct` channel in it #6

Closed: JackKelly closed this issue 2 years ago

JackKelly commented 2 years ago

The problem is in the intermediate Zarr: sometimes the channel labelled "si10" (10 meter wind speed) contains data from the `hcct` channel!

JackKelly commented 2 years ago

My guess is that the variables are sometimes in the wrong order before being appended to the Zarr (maybe `dataset.to_array()` puts the channels into a non-deterministic order?)
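
A minimal sketch of how this hypothesis could play out, assuming xarray and NumPy are installed. `Dataset.to_array()` stacks variables in the order they appear in the Dataset, so two Datasets built with their variables in a different order yield a different channel order. The helper `make_dataset` and the fill values are invented for illustration, and the NumPy concatenation only stands in for a positional append; it is not the pipeline's actual Zarr-writing code.

```python
import numpy as np
import xarray as xr


def make_dataset(var_names):
    """Build a tiny Dataset whose variables are inserted in the given order."""
    # Distinct fill values so each channel is recognisable afterwards.
    fill = {"si10": 1.0, "hcct": 2.0}
    return xr.Dataset({name: ("x", np.full(3, fill[name])) for name in var_names})


# Same variables, different insertion order (hypothetical scenario).
first = make_dataset(["si10", "hcct"]).to_array(dim="variable")
second = make_dataset(["hcct", "si10"]).to_array(dim="variable")

first_order = first["variable"].values.tolist()
second_order = second["variable"].values.tolist()
print(first_order, second_order)

# Joining the raw values by position ignores the channel labels entirely,
# so row 0 (labelled "si10" by the first array) also receives hcct data.
stacked = np.concatenate([first.values, second.values], axis=1)
print(stacked[0])
```

If the channel labels are taken from the first array only, row 0 reads as "si10" even though its second half came from `hcct`, which matches the symptom described above.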

JackKelly commented 2 years ago

rough plan of action:

if appending to existing zarr then:

JackKelly commented 2 years ago

@jacobbieker this issue might also affect the EUMETSAT native-to-Zarr conversion.... but let me do a bit more digging into the NWP to Zarr conversion...

jacobbieker commented 2 years ago

Okay! I believe Satip reorders the dims before writing to disk, but I'd have to check tomorrow.

JackKelly commented 2 years ago

OK, I've forced the order of the channels in the NWP zarr, and am writing again to /mnt/storage_ssd_8tb/data/ocf/solar_pv_nowcasting/nowcasting_dataset_pipeline/NWP/UK_Met_Office/UKV/zarr/UKV_intermediate_version_3.zarr

PR to follow...
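
One way to force a channel order, sketched under assumptions: the fix itself is not shown in this thread, so `CHANNEL_ORDER`, `force_channel_order` and the extra channel name `t` are hypothetical. The idea is to select channels by label in a fixed order before every write, and to fail loudly if the set of channels differs from what is expected.

```python
import numpy as np
import xarray as xr

# Hypothetical canonical order; the real pipeline's channel list is not given here.
CHANNEL_ORDER = ["hcct", "si10", "t"]


def force_channel_order(da, order=CHANNEL_ORDER):
    """Return `da` with its "variable" dimension in exactly the given order."""
    present = set(da["variable"].values.tolist())
    if present != set(order):
        raise ValueError(f"Expected channels {sorted(order)}, got {sorted(present)}")
    # Label-based selection: the data rows move together with their labels.
    return da.sel({"variable": list(order)})


# Example input whose channels arrive in a shuffled order.
shuffled = xr.DataArray(
    np.array([[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]),
    dims=("variable", "x"),
    coords={"variable": ["si10", "t", "hcct"]},
)
ordered = force_channel_order(shuffled)
print(ordered["variable"].values.tolist())
```

Raising on a mismatch, rather than silently filling or dropping channels, keeps a mislabelled append from going unnoticed.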

jacobbieker commented 2 years ago

In Satip, I use `transpose` instead of `reindex` here: https://github.com/openclimatefix/Satip/blob/9091a92fc1ef05d84821d2e41dfd42734797c8fc/satip/utils.py#L214. Looking at the docs, `transpose` and `reindex` do similar things, but not the same thing, so that could be the issue with the satellite Zarrs.
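
To make the difference concrete, here is a small sketch using a toy array (not the Satip data, and not the code at the link above). In xarray, `transpose` reorders the dimensions (axes) and leaves the labels along each dimension in their existing order, while `reindex` conforms the labels along a dimension to a given list, moving the data with them.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]),
    dims=("variable", "x"),
    coords={"variable": ["si10", "hcct"]},
)

# transpose: the axes swap, but the channel labels keep their original order.
transposed = da.transpose("x", "variable")
print(transposed.dims, transposed["variable"].values.tolist())

# reindex: the axes stay put, and the channels are reordered by label.
reindexed = da.reindex({"variable": ["hcct", "si10"]})
print(reindexed.dims, reindexed["variable"].values.tolist())
```

So `transpose` alone would not correct a wrong channel order; it only fixes the order of the dimensions.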