USGS-R / regional-hydrologic-forcings-ml

Repo for machine learning models for regional prediction of hydrologic forcing functions. Includes probabilistic seasonal high flow regions for CONUS, and prediction of high flow metrics for selected regions.
Creative Commons Zero v1.0 Universal

Automatic correlation + misc. fixes #156

Closed jds485 closed 1 year ago

jds485 commented 1 year ago

Adds automatic screening of highly correlated features based on a supplied correlation threshold. I used rank correlation with a threshold of 0.9. This is the major edit in this PR, and it is in select_features.R. After this screening, only CAT and ACC features remain, and the total number of features available for prediction is 169.

Question: I implemented preferential dropping of a TOT attribute whenever an ACC attribute is highly correlated with it. Do we have a similar preference for retaining ACC vs. CAT attributes? If so, that preference can also be added to the automatic screening. Closes #116

Edits one file target to return filepaths instead of the output directory. I checked all file targets for this issue and only one needed to be edited. Closes #146

Uses the absolute value of the correlation in the screening, so that strongly negative correlations are screened as well. Closes #132
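
For reviewers who want the screening logic in one place, here is a minimal sketch of the idea: rank correlation, absolute value, a 0.9 threshold, and dropping TOT in preference to ACC. It is written in plain Python for illustration and is not the code in select_features.R. The function names (`screen_features`, `spearman`), the `drop_first` argument, the strict `>` comparison, the "drop the second feature otherwise" tie-break, and the example feature names and values are all assumptions made for this sketch.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def screen_features(features, threshold=0.9, drop_first=("TOT",)):
    """Drop one feature from each pair with |rank correlation| > threshold.

    features: dict mapping feature name -> list of values.
    drop_first: name prefixes that are dropped in preference to others
    (hypothetical argument; the PR only states that TOT is dropped
    when it is highly correlated with an ACC attribute).
    """
    dropped = set()
    for a, b in combinations(features, 2):
        if a in dropped or b in dropped:
            continue
        if abs(spearman(features[a], features[b])) > threshold:
            a_low = a.startswith(drop_first)
            b_low = b.startswith(drop_first)
            # Drop the low-priority feature; otherwise drop the second one.
            dropped.add(a if (a_low and not b_low) else b)
    return [name for name in features if name not in dropped]


# Hypothetical feature names and values, for illustration only.
features = {
    "TOT_precip": [1.1, 2.3, 2.9, 4.2, 5.5, 6.1],
    "ACC_precip": [1, 2, 3, 4, 5, 6],
    "CAT_slope": [3, 1, 4, 1, 5, 9],
    "CAT_inverse": [6, 5, 4, 3, 2, 1],
}
kept = screen_features(features)
print(kept)
```

In this toy data, `TOT_precip` is dropped in favor of `ACC_precip`, and `CAT_inverse` is dropped only because the absolute value is used (its rank correlation with `ACC_precip` is -1). A preference between ACC and CAT, as raised in the question above, could be expressed in a sketch like this by extending `drop_first`.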

I updated all targets except the p6 (predict) targets, because those take a while to run and I expect upcoming edits to affect them as well.