USGS-R / river-dl

Deep learning model for predicting environmental variables on river systems
Creative Commons Zero v1.0 Universal
21 stars 15 forks source link

consider taking out sample weights #98

Open jsadler2 opened 3 years ago

jsadler2 commented 3 years ago

i'm considering taking out the sample weights functionality. I don't think anyone is using it and it does make things more complicated in calculating the metrics.

jsadler2 commented 3 years ago

@janetrbarclay , @SimonTopp , @jzwart - Are any of you currently using this functionality or do you foresee yourself using it? The idea is that you can assign weights to individual losses. For example, in this way you could, in theory, emphasize or de-emphasize certain time periods or locations by increasing or decreasing their weights.

I built it in, but I don't actually use it and I feel like it kinda just is complicating things at this point.

janetrbarclay commented 3 years ago

I am not currently using it and haven't thought of doing that in the future. I imagine that since it's been written, it wouldn't be too hard to add it back in the future if someone wanted to use it, yes?

SimonTopp commented 3 years ago

I like the idea of being able to focus training on difficult time/space predictions, but in practice I'm not sure where it would fit into my planned workflow. I'd echo Janet's statement that you should feel free to remove it, and if for some reason I do want it down the road, I can go back and look at the code to re-incorporate it. Thanks for asking @jsadler2!

jzwart commented 3 years ago

I don't think it should be the default for calculating losses, but I have used it and will probably use it in the future. I use it for iterative training with updating paramters in DL and updating states with DA - there is uncertainty associated with state estimates for DA and I use that uncertainty to weight certain locations or time periods when re-training the DL model. I don't think I've used it for evaluation metrics during a test period though, only for training

jsadler2 commented 3 years ago

Thanks for the input. Removing the weights from the normal loss functions has been removed in #104.

@jzwart - if you are still using this, we will need to revisit how to do this. When looking at the code, I actually realized that the weights were only minimally implemented in the loss functions anyway. They were only used here to change the y_true to np.nan anywhere the weight was 0. Then those values were not used in calculating the loss.

jzwart commented 3 years ago

Gotcha. Here's where I've used weights in the past - calculating weights and weighted rmse function

SimonTopp commented 2 years ago

Re-opening this issue because there's still significant code that implies the weights still work. Specifically the entire exclude segs pipeline is dependent on the weights. We should probably remove what is no longer functional now that the option for the weighted loss function is gone.

SimonTopp commented 1 year ago

@janetrbarclay, @jzwart , @aappling-usgs. Was just chatting with Janet and this came up again. All the exclude functionality doesn't propagate through to the loss function so right now it just creates the illusion that you're excluding things. Janet and I both agree that we should just remove it so it's not misleading. It's also somewhat redundant with specifying train/val/tst segs (which is incorporated and does work). Does anyone else have hot takes on this?