There was a bug in `shard_dataset.py` where `self.climatology` is set to `mean_out_data` (see this).
`mean_out_data` here is the mean over the entire dataset, across all longitudes and latitudes, whereas climatology is defined as the mean over all data points at a given longitude and latitude.
I can confirm that this bug is not present in `map_dataset.py` (see this).
I have now implemented the climatology calculation for `ShardDataset` and verified locally that, for the same configuration, `MapDataset` and `ShardDataset` produce the same climatology baseline numbers.
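
To illustrate the difference between the two quantities, here is a minimal NumPy sketch. It is not the code in this PR: the function names `global_mean` and `climatology_from_shards`, and the assumed array layout `(time, lat, lon)`, are hypothetical and chosen only for illustration. It shows that a global mean collapses to a single scalar, while a climatology keeps one value per grid point and can be accumulated shard by shard.

```python
import numpy as np


def global_mean(data):
    """Mean over every axis: a single scalar (the buggy baseline)."""
    return data.mean()


def climatology_from_shards(shards):
    """Per-grid-point mean over time, accumulated across shards.

    Each shard is assumed to have shape (time, lat, lon); this layout
    is an assumption of the sketch, not taken from the repository.
    """
    total = None
    count = 0
    for shard in shards:
        shard = np.asarray(shard, dtype=np.float64)
        shard_sum = shard.sum(axis=0)  # sum over time only, shape (lat, lon)
        total = shard_sum if total is None else total + shard_sum
        count += shard.shape[0]
    if count == 0:
        raise ValueError("no samples to compute a climatology from")
    return total / count


# Synthetic data: 10 time steps on a 4 x 8 latitude/longitude grid.
rng = np.random.default_rng(0)
full = rng.normal(size=(10, 4, 8))
shards = np.array_split(full, 3, axis=0)  # split along the time axis

clim = climatology_from_shards(shards)
```

Under these assumptions, `clim` has shape `(4, 8)` and matches `full.mean(axis=0)`, which is the kind of check that can be used to compare a sharded computation against an in-memory one.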