USGS-R / river-dl

Deep learning model for predicting environmental variables on river systems
Creative Commons Zero v1.0 Universal

Pad training/validation/testing #217

Closed by jds485 1 year ago

jds485 commented 1 year ago

Currently, data may be trimmed if the length of the training/validation/testing period is not an exact multiple of the sequence length. We should pad these datasets as a function of the sequence length so that all data between the specified start and end dates are used.

I think padding the target variables with NaN values at the end of each dataset makes the most sense, since that avoids the issues with cell and hidden states that adding synthetic data could cause.
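
A rough sketch of what I mean, using plain NumPy. The function name `pad_to_seq_multiple` and the toy array are hypothetical and only for illustration; this is not existing river-dl code, and it assumes time is the first axis of the array.

```python
import numpy as np


def pad_to_seq_multiple(arr, seq_len, fill_value=np.nan):
    """Pad the end of the time axis (axis 0) so its length is a multiple of seq_len.

    Hypothetical helper for illustration only; not part of river-dl.
    """
    n_time = arr.shape[0]
    # Number of extra time steps needed to complete the last sequence
    n_pad = (-n_time) % seq_len
    # Pad only at the end of axis 0; leave every other axis untouched
    pad_width = [(0, n_pad)] + [(0, 0)] * (arr.ndim - 1)
    padded = np.pad(
        arr.astype(float), pad_width, mode="constant", constant_values=fill_value
    )
    return padded, n_pad


# Toy target array: 10 time steps, 1 variable, sequence length of 4
y = np.arange(10, dtype=float).reshape(10, 1)
y_pad, n_pad = pad_to_seq_multiple(y, seq_len=4)

# With padding, the data split cleanly into sequences and nothing is trimmed
batches = y_pad.reshape(-1, 4, 1)
```

In this toy case, 2 NaN time steps are appended so the 10 original steps fit into 3 sequences of length 4. Without padding, the last 2 steps would have been dropped.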

We will need to check that the padded data are removed from the full time series.
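
One possible way to do that check, again as a sketch with hypothetical names (`pad_mask`, `preds_batched`) rather than river-dl code: keep a boolean mask marking which time steps are padding, then use it to drop those steps when the sequences are flattened back into one time series. I am assuming a separate mask here instead of testing for NaN, because real observations can also be NaN and should not be dropped.

```python
import numpy as np

seq_len = 4
n_time = 10

# Toy targets with a real missing observation at index 3
y = np.arange(n_time, dtype=float)
y[3] = np.nan

# Pad the end with NaN and record which steps are padding
n_pad = (-n_time) % seq_len
y_pad = np.concatenate([y, np.full(n_pad, np.nan)])
pad_mask = np.concatenate([np.zeros(n_time, dtype=bool), np.ones(n_pad, dtype=bool)])

# Stand-in for model output: one prediction per padded time step,
# shaped (number of sequences, sequence length)
preds_batched = np.ones((y_pad.shape[0] // seq_len, seq_len))

# Flatten back to one time series and drop only the padded steps
preds_full = preds_batched.reshape(-1)[~pad_mask]
y_full = y_pad[~pad_mask]

# The check: the full time series has exactly the original length
assert preds_full.shape[0] == n_time
assert y_full.shape[0] == n_time
```

Note that the real NaN at index 3 is kept in `y_full`; only the padded steps at the end are removed.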