Closed jds485 closed 1 year ago
@jds485 This isn't ready for a PR yet, but I have a couple of questions. So far, I have only compared the gagesii features with the NHD CONUS features. I haven't looked into the area of applicability (aoa) yet; I thought it would be good to look at the raw CONUS data first for a sense of how different the distributions are. I can look into the aoa and try to add those as a 3rd violin plot on these. These plots are for the high aggregated quantiles. I'll add another target to do the mid quantiles.
Right now I just eyeballed the plots and made a list of the ones that should be log transformed, things like drainage area. A lot of the plots need to be transformed but have zeros in them. So - I could either just add a constant and transform just for visualization purposes. Or the scales package has a 'pseudo log transform', which I have never used but which might be ok for our purposes?
Here are some examples:
Definitely a difference in drainage area distributions
Should be transformed but has zeros
Also, the function makes these in a loop and it's pretty slow. It might go faster if it were parallelized but I'm not sure I know how to do that. I'll try to look at some of the other functions that are parallelized and see if I can figure it out.
I can look into the aoa and try to add those as a 3rd violin plot on these
Just so I'm understanding what would be in that 3rd violin plot: are you thinking that you would apply the aoa method, select the regions within the aoa, and then the 3rd violin plot would be for only those retained reaches? Also, the aoa method would be applied to only the attributes that are used in the models, so some of these attributes would not have a 3rd violin.
I'll add another target to do the mid quantiles.
Sounds good. Those targets can be added after the high quantile results are working.
So - I could either just add a constant and transform just for visualization purposes.
That's fine by me. The idea of a pseudo log transform makes sense, though. I've not applied it, but for visualization purposes it might be better than adding an arbitrary constant.
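For reference, here is a minimal sketch of the two options side by side. It is written in plain Python only to show the arithmetic, not as the pipeline code. It assumes the pseudo log transform takes the form asinh(x / (2 * sigma)) / ln(base), which is my understanding of what `pseudo_log_trans` in scales computes; the function names, the `sigma` and `constant` defaults, and the sample values are all made up for illustration.

```python
import math


def pseudo_log(x, sigma=1.0, base=10.0):
    """Pseudo log transform: asinh(x / (2 * sigma)) / ln(base).

    Defined at zero and for negative values, roughly linear near zero,
    and close to log_base(x) once x is large relative to sigma.
    """
    return math.asinh(x / (2.0 * sigma)) / math.log(base)


def log_plus_constant(x, constant=1.0):
    """Alternative: shift by an arbitrary constant, then take log10."""
    return math.log10(x + constant)


if __name__ == "__main__":
    # Hypothetical attribute values, including the zeros that break a plain log.
    values = [0.0, 0.5, 10.0, 1000.0]
    for v in values:
        print(v, round(pseudo_log(v), 3), round(log_plus_constant(v), 3))
```

Both options handle zeros. The difference is that the pseudo log version converges to the true log10 for large values without the result depending on which constant was picked, while `sigma` only controls how wide the near-linear region around zero is.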
It might go faster if it were parallelized
We can discuss in the meeting later if you can't figure it out.
@jds485 ok, I have updated these so that they are log transformed, and I only selected the ~46 features from the model. I forget what we decided to do about the aoa. Should I look into that? Were you going to look into it? Should I just PR this as is and we can add the aoa later if we want to?
Thanks! PR this as-is and we can add aoa as a different PR. I think that method will help for our flow metrics paper as well.
Depends on #142.
Create target(s) that create maps and violin plots of the reach attributes across CONUS.