informatics-lab / precip_rediagnosis

Project to use ML to re-diagnose precipitation fields from ensemble model fields

NaT values for time and forecast_reference_time when loading datasets in AzureML #84

Open hannahbrown7 opened 1 year ago

hannahbrown7 commented 1 year ago

When loading merged radar and MOGREPS-G data from AzureML datasets, datetime values appear to be mishandled (possibly during the load itself): a significant number of rows in the datetime columns are set to NaT. Sometimes only the time column or only the forecast_reference_time column is affected, sometimes both, and sometimes neither. There is no obvious pattern to when it occurs.

In total, 1211952 of 2776212 rows (about 44%) contain valid values for both time and forecast_reference_time.

This issue does not occur when loading the same data on a local machine or on SPICE.
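One way to narrow this down is to load the same event in both environments and compare the two datetime columns side by side. The following is only a minimal sketch, assuming both loads end up as pandas DataFrames; `compare_datetime_columns`, `local_df` and `remote_df` are hypothetical names, and the demo frames are made-up stand-ins for the real data.

```python
import pandas as pd


def compare_datetime_columns(local_df, remote_df,
                             columns=("time", "forecast_reference_time")):
    """Report dtype and NaT count per column for two loads of the same data."""
    rows = []
    for col in columns:
        rows.append({
            "column": col,
            "local_dtype": str(local_df[col].dtype),
            "remote_dtype": str(remote_df[col].dtype),
            "local_nat": int(local_df[col].isna().sum()),
            "remote_nat": int(remote_df[col].isna().sum()),
        })
    return pd.DataFrame(rows).set_index("column")


# Made-up stand-ins: the "local" load is intact, the "remote" load has lost one time value.
local_df = pd.DataFrame({
    "time": pd.to_datetime(["2020-02-15 00:00", "2020-02-15 06:00"]),
    "forecast_reference_time": pd.to_datetime(["2020-02-14 18:00", "2020-02-14 18:00"]),
})
remote_df = local_df.copy()
remote_df.loc[1, "time"] = pd.NaT

report = compare_datetime_columns(local_df, remote_df)
print(report)
```

A dtype mismatch between the two loads (for example, an object column on one side and a datetime column on the other) would point towards the conversion step rather than the stored data.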

hannahbrown7 commented 1 year ago

For each event, the counts below give the number of rows that are unaffected (i.e. have datetime values for both time and forecast_reference_time), followed by the number of rows where time, forecast_reference_time, or both are set to NaT.

prd_merged_202002_storm_dennis
Both time values unaffected: 189636
Time value is NaT: 78588
Forecast reference time value is NaT: 79032
Both time values are NaT: 0

prd_merged_202002_storm_ciara
Both time values unaffected: 189636
Time value is NaT: 78588
Forecast reference time value is NaT: 79032
Both time values are NaT: 0

prd_merged_202010_nswws_amber_oct
Both time values unaffected: 79032
Time value is NaT: 162924
Forecast reference time value is NaT: 0
Both time values are NaT: 78588

prd_merged_202012_nswws_amber_dec
Both time values unaffected: 81252
Time value is NaT: 52824
Forecast reference time value is NaT: 26196
Both time values are NaT: 0

prd_merged_202102_nswws_amber_feb
Both time values unaffected: 52836
Time value is NaT: 0
Forecast reference time value is NaT: 107952
Both time values are NaT: 26196

prd_merged_202110_nswws_amber_oct
Both time values unaffected: 81696
Time value is NaT: 52392
Forecast reference time value is NaT: 52896
Both time values are NaT: 0

prd_merged_202112_storm_barra
Both time values unaffected: 134088
Time value is NaT: 78588
Forecast reference time value is NaT: 79320
Both time values are NaT: 0

prd_merged_2022_storm_eunice_franklin
Both time values unaffected: 189636
Time value is NaT: 79032
Forecast reference time value is NaT: 78588
Both time values are NaT: 0

prd_merged_202008_storm_francis
Both time values unaffected: 108468
Time value is NaT: 52392
Forecast reference time value is NaT: 52836
Both time values are NaT: 0

prd_merged_202008_storm_ellen
Both time values unaffected: 105672
Time value is NaT: 0
Forecast reference time value is NaT: 189708
Both time values are NaT: 78588
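A breakdown like the one above can be computed with a few boolean masks. This is a minimal sketch rather than the code actually used for the counts: it assumes each event is available as a pandas DataFrame with `time` and `forecast_reference_time` columns, and `nat_breakdown` is a hypothetical helper name. The four categories are treated as mutually exclusive, which is consistent with the per-event counts above summing to the 2776212 total.

```python
import pandas as pd


def nat_breakdown(df, time_col="time", frt_col="forecast_reference_time"):
    """Count rows in four mutually exclusive categories of NaT occurrence."""
    time_nat = df[time_col].isna()
    frt_nat = df[frt_col].isna()
    return {
        "both_unaffected": int((~time_nat & ~frt_nat).sum()),
        "time_only_nat": int((time_nat & ~frt_nat).sum()),
        "frt_only_nat": int((~time_nat & frt_nat).sum()),
        "both_nat": int((time_nat & frt_nat).sum()),
    }


# Made-up demo frame with one row in each category (not real data).
demo = pd.DataFrame({
    "time": pd.to_datetime(
        ["2020-02-15 00:00", None, "2020-02-15 06:00", None]),
    "forecast_reference_time": pd.to_datetime(
        ["2020-02-14 18:00", "2020-02-14 18:00", None, None]),
})

counts = nat_breakdown(demo)
print(counts)
```

Because the categories do not overlap, the four counts always add up to the number of rows in the frame, which gives a quick sanity check when running this per event.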