Open peterdudfield opened 3 months ago
I looked at where channel selection happens and I think this can be achieved via a one-liner in ocf_datapipes/select/filter_channels.py
(just sort the channel list before performing selection, it should return it in the right order after that, including coord reordering)
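For illustration, a minimal sketch of what that one-liner could look like. This is not the actual code in filter_channels.py: `select_channels_sorted` and the toy array are hypothetical, and the only assumption is that the selection goes through xarray's `sel` on a "variable" dimension.

```python
import numpy as np
import xarray as xr


def select_channels_sorted(
    data: xr.DataArray, channels: list[str], dim: str = "variable"
) -> xr.DataArray:
    """Hypothetical helper: select channels in alphabetical order,
    regardless of the order they were listed in the config."""
    # sorted() returns a new list, so the caller's config list is not mutated
    return data.sel({dim: sorted(channels)})


# Toy array standing in for NWP data with channels stored as hcc, lcc, mcc
data = xr.DataArray(
    np.arange(6).reshape(3, 2),
    dims=("variable", "x"),
    coords={"variable": ["hcc", "lcc", "mcc"]},
)

# Two configs listing the same channels in a different order
a = select_channels_sorted(data, ["mcc", "lcc"])
b = select_channels_sorted(data, ["lcc", "mcc"])

print(a["variable"].values.tolist())
print(b["variable"].values.tolist())
```

With the sort in place, both calls return the channels in the same order, so the result no longer depends on how the config happened to list them.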
That's a good idea.
It would be interesting to know whether sel just selects the channels, or selects them and also orders them: https://github.com/openclimatefix/ocf_datapipes/blob/main/ocf_datapipes/select/filter_channels.py#L49
Trying d.sel({"variable": ["lcc", "mcc"]})
and d.sel({"variable": ["mcc", "lcc"]})
we do seem to get different results.
Yes, that's what I was basing the one-liner suggestion on: sel seems to reorder coordinates, so the order of channels depends on the order in which somebody adds them to the config, and hence is very prone to inconsistencies.
I am trying to see if I can find a solid description of the reordering somewhere in the docs
I'm not sure I follow what the issue is here.
If we use the filter channels function, which we use for example here, then we will have the same channel ordering in training and production, even if the dataset we are selecting from has them in a different order on disk, so long as the input data config remains the same at training and production. I think this is the desired behaviour?
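A small sketch of the condition described here, i.e. that the channel list must be identical, order included, at training and production time. The function name and the example lists are hypothetical, and this is not how ocf_datapipes reads its config.

```python
def channels_match(training_channels, production_channels):
    """Return True only if both configs list the same channels in the same order."""
    return list(training_channels) == list(production_channels)


# Hypothetical channel lists taken from two input data configs
training = ["lcc", "mcc", "hcc"]
production_same = ["lcc", "mcc", "hcc"]
production_reordered = ["mcc", "lcc", "hcc"]

print(channels_match(training, production_same))
print(channels_match(training, production_reordered))
```

A check like this would catch the case where someone edits the channel order in one config but not the other.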
> It would be interesting to know whether sel just selects the channels, or selects them and also orders them: https://github.com/openclimatefix/ocf_datapipes/blob/main/ocf_datapipes/select/filter_channels.py#L49
> Trying d.sel({"variable": ["lcc", "mcc"]})
> and d.sel({"variable": ["mcc", "lcc"]})
> we do seem to get different results.
Yes, this is how the selection works. It reorders the channels to match the list passed in. The first call returns the channels in the order lcc then mcc; the second returns them in the order mcc then lcc.
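The behaviour can be reproduced on a toy array. This is a minimal sketch with made-up data, assuming only that `d` is an xarray DataArray with a "variable" coordinate as in the calls above.

```python
import numpy as np
import xarray as xr

# Toy array: the lcc row is all 1.0 and the mcc row is all 2.0
d = xr.DataArray(
    np.array([[1.0, 1.0], [2.0, 2.0]]),
    dims=("variable", "x"),
    coords={"variable": ["lcc", "mcc"]},
)

first = d.sel({"variable": ["lcc", "mcc"]})
second = d.sel({"variable": ["mcc", "lcc"]})

# Both the coordinate and the underlying data follow the requested order
first_order = first["variable"].values.tolist()
second_order = second["variable"].values.tolist()
first_rows = first.values.tolist()
second_rows = second.values.tolist()

print(first_order, first_rows)
print(second_order, second_rows)
```

The data rows move together with the coordinate labels, so the channel axis of the output follows whatever order the list is written in.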
Yeah, as long as the same config file is used it should be completely fine. I wasn't sure if that's what's happening.
Detailed Description
Should we add a pipeline that orders the NWP channels alphabetically?
Context
Possible Implementation