USGS-R / river-dl

Deep learning model for predicting environmental variables on river systems
Creative Commons Zero v1.0 Universal

Normalize outputs segment by segment? #77

Open jsadler2 opened 3 years ago

jsadler2 commented 3 years ago

Currently we are standardizing outputs (and inputs) by subtracting the mean and dividing by the standard deviation of each variable. The result is that the distribution has, roughly, a mean of zero and a standard deviation of 1.

For the standardization, we use a single mean and SD computed across all segments and all days (for the record, the mean and SD values come from the sntemp outputs). We then apply those same values in reverse (multiply by the SD, add the mean) to convert our predictions back to the original units.
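To make the current approach concrete, here is a minimal sketch of global standardization and its inverse. The function names and the `(n_segments, n_days)` array layout are assumptions for illustration, not river-dl's actual code.

```python
import numpy as np

def standardize_global(y):
    """Scale with one mean and SD taken over all segments and all days.

    y: array shaped (n_segments, n_days); NaNs are ignored in the statistics.
    """
    mean = np.nanmean(y)
    std = np.nanstd(y)
    return (y - mean) / std, mean, std

def unstandardize_global(y_scaled, mean, std):
    """Apply the same values in reverse to return to original units."""
    return y_scaled * std + mean

# Hypothetical data: one small stream and one large stream.
y = np.array([[1.0, 2.0, 3.0],
              [100.0, 200.0, 300.0]])
y_scaled, mean, std = standardize_global(y)
y_back = unstandardize_global(y_scaled, mean, std)
```
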

I'm wondering if we should instead standardize each segment's outputs by that segment's mean and SD.
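A rough sketch of what the per-segment version could look like, again assuming a `(n_segments, n_days)` array. The function names and the small `eps` guard against zero-variance segments are my own additions, not something already in the code base.

```python
import numpy as np

def standardize_by_segment(y, eps=1e-8):
    """Scale each segment (row) by its own mean and SD.

    y: array shaped (n_segments, n_days); NaNs are ignored in the statistics.
    eps: assumed guard so a constant segment does not divide by zero.
    """
    seg_mean = np.nanmean(y, axis=1, keepdims=True)
    seg_std = np.nanstd(y, axis=1, keepdims=True)
    return (y - seg_mean) / (seg_std + eps), seg_mean, seg_std

def unstandardize_by_segment(y_scaled, seg_mean, seg_std, eps=1e-8):
    """Invert the per-segment scaling; the per-segment stats must be stored."""
    return y_scaled * (seg_std + eps) + seg_mean

# Hypothetical data: one small stream and one large stream.
y = np.array([[1.0, 2.0, 3.0],
              [100.0, 200.0, 300.0]])
y_scaled, seg_mean, seg_std = standardize_by_segment(y)
y_back = unstandardize_by_segment(y_scaled, seg_mean, seg_std)
```

One practical consequence: the mean and SD have to be kept per segment (shape `(n_segments, 1)` here) so that predictions can be converted back with the right values for each segment.
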

jsadler2 commented 3 years ago

I think this might have a similar effect to training on log-transformed flow: the smaller-discharge streams would be modeled better.
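A toy illustration of why that might be, using made-up numbers: give both a small and a large stream the same 10% relative error and compare how much each contributes to the squared error under the two scalings.

```python
import numpy as np

# Hypothetical observations: a small stream and one 100x larger.
obs = np.array([[1.0, 2.0, 3.0],
                [100.0, 200.0, 300.0]])
pred = obs * 1.10  # same 10% relative error on both segments

def mse_per_segment(obs_scaled, pred_scaled):
    """Mean squared error for each segment (row) in scaled space."""
    return np.mean((obs_scaled - pred_scaled) ** 2, axis=1)

# Global scaling: one mean/SD for everything.
g_mean, g_std = obs.mean(), obs.std()
mse_global = mse_per_segment((obs - g_mean) / g_std, (pred - g_mean) / g_std)

# Per-segment scaling: each row uses its own mean/SD.
s_mean = obs.mean(axis=1, keepdims=True)
s_std = obs.std(axis=1, keepdims=True)
mse_segment = mse_per_segment((obs - s_mean) / s_std, (pred - s_mean) / s_std)

ratio_global = mse_global[1] / mse_global[0]
```

With global scaling the large stream's error term is 10^4 times the small stream's, so training is dominated by the large stream. With per-segment scaling the two terms are equal.
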

jsadler2 commented 3 years ago

This idea is similar to what Kratzert et al. do, except they are doing it in the loss function (they don't standardize the outputs): image (https://onlinelibrary.wiley.com/doi/full/10.1002/hyp.1280)
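As a sketch of the loss-function route, one option is to divide each squared error by the square of that segment's SD. This is my own simplified version and may not match the exact formulation in the paper; the function name and the `eps` term are assumptions.

```python
import numpy as np

def segment_scaled_mse(obs, pred, seg_std, eps=0.1):
    """Squared error weighted by 1 / (segment SD + eps)^2.

    obs, pred: arrays shaped (n_segments, n_days), in original units.
    seg_std: per-segment SD shaped (n_segments, 1).
    eps: assumed constant that keeps near-constant segments from blowing up.
    """
    sq_err = (pred - obs) ** 2 / (seg_std + eps) ** 2
    return np.nanmean(sq_err)

# Hypothetical data, same as a small and a large stream.
obs = np.array([[1.0, 2.0, 3.0],
                [100.0, 200.0, 300.0]])
pred = obs * 1.10
seg_mean = obs.mean(axis=1, keepdims=True)
seg_std = obs.std(axis=1, keepdims=True)

# With eps=0 this equals plain MSE on per-segment standardized outputs,
# because the segment mean cancels in the difference.
loss_weighted = segment_scaled_mse(obs, pred, seg_std, eps=0.0)
loss_standardized = np.mean(
    ((obs - seg_mean) / seg_std - (pred - seg_mean) / seg_std) ** 2
)
```

So with `eps=0` the two approaches give the same training signal; the difference is mainly whether the model outputs live in scaled or original units.
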