The approach to missing data has evolved rather organically, and it is time to make it explicit. Initially, data sets were on-boarded largely as-is, including their missing-data values, e.g. the -9999 used in WRI data sets. We now prefer:
- Use the original co-ordinate reference system of the on-boarded data
- But use NaN, which is supported by Zarr, to identify missing values
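As a rough illustration of the second point, the sketch below replaces a sentinel such as -9999 with NaN before the array is written out. It assumes the data is already in memory as a NumPy array; the function name `sentinel_to_nan` and the sample values are hypothetical and not part of any existing on-boarding code.

```python
import numpy as np


def sentinel_to_nan(values, sentinel=-9999.0):
    """Return a float64 copy of `values` with `sentinel` replaced by NaN.

    Hypothetical helper for illustration only; the input is left unchanged.
    """
    data = np.array(values, dtype="float64")  # np.array copies by default
    data[data == sentinel] = np.nan
    return data


# Made-up sample: one cell carries the -9999 missing-data marker.
raw = np.array([[0.5, -9999.0], [1.25, 2.0]])
cleaned = sentinel_to_nan(raw)
```

Converting to a floating-point type is needed because NaN cannot be represented in integer arrays, so integer-valued source data would change dtype on on-boarding.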
There are some data sets where missing-data behaviour requires a bit more thought. The IRIS Tropical Cyclone (TC) hazard data is one such example: data outside the areas affected by TCs is NaN, but should this instead be interpreted as zero probability of a TC? And if so, do we want such a mapping in physrisk itself?
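One possible shape for such a mapping, if it were to live on the reading side, is an opt-in flag that leaves NaN untouched by default and maps it to zero only for data sets where that interpretation has been agreed. This is a sketch to aid the discussion, not existing physrisk behaviour; `interpret_missing` and `nan_as_zero` are hypothetical names and the values are made up.

```python
import numpy as np


def interpret_missing(values, nan_as_zero=False):
    """Optionally map NaN to 0.0; otherwise return the values unchanged.

    Hypothetical helper: `nan_as_zero` would be set per data set, e.g. for
    hazard data where "no data" is agreed to mean "zero probability".
    """
    data = np.array(values, dtype="float64")
    if nan_as_zero:
        # Only NaN is replaced; finite values are left alone.
        data = np.where(np.isnan(data), 0.0, data)
    return data


# Made-up probabilities; NaN marks a location outside the affected area.
probabilities = np.array([0.02, np.nan, 0.1])
as_stored = interpret_missing(probabilities)
as_zero = interpret_missing(probabilities, nan_as_zero=True)
```

Keeping the stored data as NaN and applying the mapping at read time would preserve the distinction between "missing" and "zero" in the Zarr arrays themselves, which may matter if the interpretation is later revisited.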
The purpose of this issue is to decide on the approach and document it.
@EglantineGiraud FYI