ODINN-SciML / MassBalanceMachine

Global machine learning glacier mass balance model, capable of assimilating all sources of glaciological and remote sensing data
MIT License

Decide what date columns are deemed necessary for the data processing pipeline #34

Open JulianBiesheuvel opened 4 months ago

JulianBiesheuvel commented 4 months ago

For now there is a hard constraint on FROM_DATE and TO_DATE in the dataset. But what if only one of the two is available? We have to think about this. For now, if either of the two dates is invalid or not provided, the record is deleted from the dataset.

khsjursen commented 2 months ago

Right now MBM removes records containing 99 in TO_DATE or FROM_DATE. In the WGMS dataset, 99 is used not only for a missing day of month (YYYYMM99) but also for a missing month (YYYY9999). For some glaciers this removes a lot of data.
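To make the distinction concrete, here is a minimal sketch of how the two 99 cases could be told apart before deciding what to drop. The function name `classify_wgms_date` and the four labels are hypothetical, not part of MBM, and the sketch assumes dates arrive as 8-character YYYYMMDD strings or integers.

```python
from datetime import datetime


def classify_wgms_date(value):
    """Classify a WGMS-style YYYYMMDD value by which parts are missing.

    Hypothetical helper, not MBM's API. Returns one of:
    "complete", "missing_day", "missing_month", "invalid".
    """
    text = str(value).strip()
    # Anything that is not exactly eight ASCII digits cannot be YYYYMMDD.
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return "invalid"

    month, day = text[4:6], text[6:8]
    if month == "99":
        # YYYY9999: the month (and therefore the day) is unknown.
        return "missing_month"
    if not 1 <= int(month) <= 12:
        return "invalid"
    if day == "99":
        # YYYYMM99: only the day of month is unknown.
        return "missing_day"

    # Let the standard library validate the calendar date (e.g. 31 February).
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        return "invalid"
    return "complete"
```

With a classification like this, records with only a missing day or month could be routed to a filling step instead of being removed together with the truly invalid ones.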

If YEAR is provided, one possibility is to give the user the option of filling in the start and end dates of the hydrological year (or of the winter/summer period). But this is quite a strong assumption, so the filled dates should be flagged somehow.
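A rough sketch of what such an opt-in fill with flags might look like is below. Everything here is an assumption for illustration: the function name `fill_hydro_year_dates`, the `*_ASSUMED` flag columns, and the convention that the hydrological year for YEAR runs from 1 October of the previous year to 30 September of YEAR (a common Northern Hemisphere choice; the start is left configurable). It covers only the annual period, not winter/summer.

```python
from datetime import date, timedelta


def fill_hydro_year_dates(year, from_date=None, to_date=None,
                          start_month=10, start_day=1):
    """Fill missing period dates from YEAR and flag what was assumed.

    Hypothetical helper, not MBM's API. Assumes the hydrological year
    for `year` starts on (start_month, start_day) of the previous year
    and ends the day before the same date in `year`.
    """
    assumed_start = date(year - 1, start_month, start_day)
    assumed_end = date(year, start_month, start_day) - timedelta(days=1)

    # A date counts as missing when the caller passes None.
    from_assumed = from_date is None
    to_assumed = to_date is None

    return {
        "FROM_DATE": assumed_start if from_assumed else from_date,
        "TO_DATE": assumed_end if to_assumed else to_date,
        # Flags let downstream steps filter or down-weight filled records.
        "FROM_DATE_ASSUMED": from_assumed,
        "TO_DATE_ASSUMED": to_assumed,
    }
```

For example, `fill_hydro_year_dates(2020, from_date=date(2019, 9, 25))` keeps the measured start date and fills only TO_DATE, with `TO_DATE_ASSUMED` set to `True`. This would also handle the case where only one of the two dates is available.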