HARPgroup / model_meteorology


GEO_MET_MODEL option 3: Storm Hydrograph Analysis and Precip/Hydrograph Volume Analysis #60

Open COBrogan opened 2 months ago

COBrogan commented 2 months ago

If we look at precipitation and stormflow data on an individual scale, it appears that the total volume of precipitation may be related to the volume of the storm hydrograph. This takes a slightly different approach than correlating weekly precip and flow, because we are now homing in on individual storm events and looking more specifically at how precipitation increases flow: stormPlot14

Important R Scripts: stormSeparate_USGS.R, plotStorm.R, plotStormSep_cmd.R, stormAnalysis_cmd.R, stormEventsLM_cmd.R, stormSep_cmd.R

In the scenario .con file, e.g. p6/cbp_wsm/config/control/met/stormVol_prism.con, GEO_MET_MODEL must be set to storm_volume

Several important variables defined in geo.config:

# Storm separation outputs:
STORM_EVENT_FLOW_FILE="$MET_EXPORT_DIR/$scenario/flow/${coverage}-stormevent-flow.csv"
STORM_EVENT_STATS_FILE="$MET_EXPORT_DIR/$scenario/flow/${coverage}-stormevent-stats.csv"
STORM_EVENT_PRECIP_PLOT_DIR="$MET_EXPORT_DIR/$scenario/plots/stormPlots/"

Several important variables defined in scenario config e.g. stormVol_prism.con:

#Set additional variables for Storm Separation Routine:
SEARCH_ALL_STORM="TRUE"
BASELINE_FLOW_METHOD="Month"
STORM_INCLUDE_DURATION=1
STORMSEP_REGRESSION_METHOD="LINEAR"
STORMSEP_PLOT="TRUE"
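For reference, these config variables compose into concrete file paths by ordinary shell expansion. A minimal sketch with throwaway values (the MET_EXPORT_DIR, scenario, and coverage values here are assumptions for illustration, not the server's real settings):

```shell
# Sketch: how the storm-separation output paths expand.
# The three values below are throwaway examples.
MET_EXPORT_DIR=/tmp/met_export
scenario=stormVol_prism
coverage=usgs_ws_02021500

STORM_EVENT_FLOW_FILE="$MET_EXPORT_DIR/$scenario/flow/${coverage}-stormevent-flow.csv"
STORM_EVENT_STATS_FILE="$MET_EXPORT_DIR/$scenario/flow/${coverage}-stormevent-stats.csv"
STORM_EVENT_PRECIP_PLOT_DIR="$MET_EXPORT_DIR/$scenario/plots/stormPlots/"

echo "$STORM_EVENT_FLOW_FILE"
echo "$STORM_EVENT_STATS_FILE"
```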

Example Model Run No Plots

MODEL_ROOT=/backup/meteorology/
MODEL_BIN=$MODEL_ROOT
SCRIPT_DIR=/opt/model/model_meteorology/sh
export MODEL_ROOT MODEL_BIN SCRIPT_DIR

cd /backup/meteorology

/opt/model/meta_model/run_model raster_met stormVol_prism usgs_ws_02021500 auto geo download
/opt/model/meta_model/run_model raster_met stormVol_prism usgs_ws_02021500 auto geo process
/opt/model/meta_model/run_model raster_met stormVol_prism usgs_ws_02021500 auto geo analyze

Example Model Run With Plots

MODEL_ROOT=/backup/meteorology/
MODEL_BIN=$MODEL_ROOT
SCRIPT_DIR=/opt/model/model_meteorology/sh
export MODEL_ROOT MODEL_BIN SCRIPT_DIR

cd /backup/meteorology

/opt/model/meta_model/run_model raster_met stormVolPlot_prism usgs_ws_02075500 auto geo download
/opt/model/meta_model/run_model raster_met stormVolPlot_prism usgs_ws_02075500 auto geo process
/opt/model/meta_model/run_model raster_met stormVolPlot_prism usgs_ws_02075500 auto geo analyze
COBrogan commented 1 month ago

Current workflow:

  1. Get USGS data:
     Rscript functions/usgsdata.R '02037500' "$HOME/usgsGageCMD.csv"
  2. Run stormSep_cmd.R:
     Rscript ../../meta_model/scripts/met/stormSep_cmd.R "${HOME}/usgsGageCMD.csv" 'TRUE' 'Month' "${HOME}/Gage02037500_StormflowData.csv"
  3. Run the analysis on the storms and output stats:
     Rscript ../../meta_model/scripts/met/stormAnalysis_cmd.R "${HOME}/Gage02037500_StormflowData.csv" "${HOME}/Gage2037500_StormStats.csv"
  4. Get linear regressions:
     Rscript ../../meta_model/scripts/met/stormEventsLM_cmd.R "${HOME}/usgs_ws_02037500-PRISM-all.csv" "${HOME}/Gage2037500_StormStats.csv" "${HOME}/Gage2037500_StormflowData.csv" 1 "${HOME}/modelJSON.json" "${HOME}/stormsepLMResults.csv"
  5. Run plots:
     Rscript ../../meta_model/scripts/met/plotStormEvent_cmd.R "${HOME}/usgs_ws_02037500-PRISM-all.csv" "${HOME}/Gage2037500_StormflowData.csv" "${HOME}/Gage2037500_StormStats.csv" "${HOME}/stormPlotsNew" "02037500" "PRISM"

COBrogan commented 4 weeks ago

FYI @ilonah22 I've merged this branch (geo-stormSep) into main. I will be working on getting the meta model running tomorrow hopefully.

COBrogan commented 3 weeks ago

@rburghol How do I specify which GEO_MET_MODEL I'm running? I changed the following file to set GEO_MET_MODEL to storm_volume:

nano /opt/model/p6/cbp_wsm/config/control/met/PRISM.con

However, when I run the following process command, an echo check confirms that GEO_MET_MODEL is still set to simple_lm. Am I missing a config file somewhere?

/opt/model/meta_model/run_model raster_met PRISM usgs_ws_02021500 auto geo process 05
rburghol commented 3 weeks ago

@COBrogan looks like you set it up correctly to me -- you may want to verify the cbp get_config... command that is run in geo.config at the command line -- also make sure you run from the p6/vadeq directory.

Feel free to edit the geo.config file to debug and maybe also grep the files to see if I hard-coded that param by accident somewhere, like in raster.config
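Grepping for an accidental hard-coded GEO_MET_MODEL could look something like the sketch below. It is demonstrated against a mock config tree so it is self-contained; on the server the targets would be the real config directories (all paths and file names here are assumptions):

```shell
# Sketch: search a config tree for GEO_MET_MODEL assignments to spot a
# hard-coded override. The /tmp/mock_config tree is a stand-in for the
# real config directories.
mkdir -p /tmp/mock_config/control/met
echo 'GEO_MET_MODEL="simple_lm"' > /tmp/mock_config/raster.config
echo 'GEO_MET_MODEL="storm_volume"' > /tmp/mock_config/control/met/stormVol_prism.con

# -r recurses, -n prints line numbers so the offending file is obvious
grep -rn 'GEO_MET_MODEL=' /tmp/mock_config
```

If two files both assign the variable, whichever one is sourced last in the workflow wins, which is why a stale assignment in a raster-level config can mask the scenario setting.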

rburghol commented 3 weeks ago

Oh!! @COBrogan I just saw you set the config in cbp_wsm/config. That setting should be applied in vadeq/config/... -- it should all work then. If I indicated cbp_wsm in some of the instructions, that was an error on my part -- apologies.

COBrogan commented 3 weeks ago

@rburghol The storm_volume workflow seems to be working! I ran it with gage 02021500 and successfully generated the monthly rating file for PRISM. I haven't tested the plotting yet, but I want to work on that a little more locally first. However, I did encounter a few permissions errors that I wanted to note here:

  1. I had to manually change the model steps (e.g. analyze/02_model_storm) to ensure they had execute privilege for all users
  2. The install commands at the end of analyze/02_model_lm_simple and analyze/02_model_storm don't work for me unless I sudo as you. For instance:
    install -D /tmp/usgs_ws_02021500_1722430958_12797/usgs_ws_02021500-PRISM-storm_volume-model.json /media/model/met/PRISM/stats/
    install: cannot create regular file '/media/model/met/PRISM/stats/usgs_ws_02021500-PRISM-storm_volume-model.json': Permission denied

    This command is a product of the last line of the analyze/02 steps, which are:

    install -D $json_file "$MODEL_JSON" 
    install -D $rate_file "$RATING_FILE"
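One way to make this failure mode explicit is to test target-directory writability before calling install -D. A sketch reusing the step's variable names but with throwaway paths and values (the real steps use the scenario's actual file locations):

```shell
# Sketch: check write permission on the destination directory before
# install -D, so the step fails with a clear message instead of a raw
# "Permission denied". All paths here are throwaway examples.
json_file=/tmp/example-model.json
MODEL_JSON=/tmp/example_stats/example-model.json
echo '{}' > "$json_file"

target_dir=$(dirname "$MODEL_JSON")
if mkdir -p "$target_dir" 2>/dev/null && [ -w "$target_dir" ]; then
  # -D creates any missing leading directories of the destination
  install -D "$json_file" "$MODEL_JSON"
  echo "installed $MODEL_JSON"
else
  echo "cannot write to $target_dir; needs group write or sudo" >&2
fi
```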
ilonah22 commented 3 weeks ago

I've been running into a few issues while trying to run the storm workflow on a different gage, but this is the only one I have gotten really stuck on so far while running stormEventsLM_cmd.R:

Error in (dateIndex - rollingDur - stormDuration + 2):dateIndex :
  argument of length 0
Calls: mapply -> <Anonymous>

I have been making a few minor changes, but nothing seems to be working, so I was wondering if you had a similar issue or if this error means I messed something up in one of the earlier steps.

COBrogan commented 3 weeks ago

Hi @ilonah22 . If you pull the main branch of the meta model repository, I suspect your error will be resolved. Just like we had trouble with the dates yesterday, I had some trouble today. This time I had to remove storms that occur after the precip record has ended. So, I had to add the following:

#Remove any storms that occur after precip record
stormStats <- stormStats[stormStats$endDate <= max(comp_data$obs_date),]
stormSepDF <- stormSepDF[stormSepDF$timestamp <= max(comp_data$obs_date),]

I also have gotten everything up and running in the meta model. We should now be able to use the model on the server to run these, which will make it much easier to run multiple scenarios (no need to enter file paths, copy and paste files, etc.). Do you have some availability this afternoon to go over that? Maybe around 1:30? We could get you set up in the model so you can run gages easily.

COBrogan commented 3 weeks ago

@rburghol After some testing, it looks like @ilonah22 also doesn't have write permissions to /media/model/met/. She too was getting "Permission denied" errors when trying to run the following (which now works fine for me). If you have some availability, I'd appreciate help getting her set up. And I'm happy to hop on a call to learn more about the group permissions:

/opt/model/meta_model/run_model raster_met stormVol_prism usgs_ws_01634500 auto geo download

I tried sudo chmod -R 775 /media/model/met/stormVol_prism but was told "Operation not permitted". I think I'm missing something. Ilonah is already part of the harpmodelers group, so she would have access to this folder as soon as the right (write and execute) permissions are set:

ls -l /media/model/met/
total 4
lrwxrwxrwx 1 rob      harpmodelers   25 Jul 26 13:35 PRISM -> /backup/meteorology/PRISM
lrwxrwxrwx 1 rob      harpmodelers   26 Jul 26 13:36 daymet -> /backup/meteorology/daymet
lrwxrwxrwx 1 rob      harpmodelers   26 Jul 26 13:35 nldas2 -> /backup/meteorology/nldas2
drwxr-sr-x 6 cobrogan harpmodelers 4096 Jul 31 16:57 stormVol_prism
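The usual pattern for this kind of shared directory is group write plus the setgid bit, so files created later inherit the owning group. A sketch on a throwaway directory (on the server this would need to be run by the directory owner or root, and would also include chgrp harpmodelers; that step is omitted here because it needs the real group):

```shell
# Sketch: make a shared directory group-writable with setgid so new files
# inherit the group. Demonstrated on a throwaway path.
demo=/tmp/stormVol_perm_demo
mkdir -p "$demo"

# 2775 = rwx for owner, rws for group (setgid), r-x for others.
# On the server: chgrp harpmodelers "$demo" would precede this.
chmod 2775 "$demo"

ls -ld "$demo" | cut -c1-10   # drwxrwsr-x
```

The listing above already shows the setgid bit on stormVol_prism (the "s" in drwxr-sr-x); what is missing is the group-write bit.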
ilonah22 commented 2 weeks ago

Here are some of my visual analyses of the ratings from 10 gages produced by the stormVol_prism workflow this week:

MonthlyRSquared ^ Each of the different colored lines represents one of the 10 gages I used; the red boxes show places where I thought I might see a trend.

MonthlyBoxplot Seasonal Rsquared

After seeing the results of these analyses, my next planned steps are:

rburghol commented 2 weeks ago

@ilonah22 I really like what I'm seeing. These are effective ways to compare multiple time series, and also to highlight the areas that you think we should pay more attention to. I would be interested in seeing a similar thing using different rainfall data sets. With that in mind, what rainfall data sets were you using for these?

COBrogan commented 2 weeks ago

FYI working on multi gage model run but issues with sftp call in sbatch:

gage_coverage_file="usgsgageList.csv"
gage_coverage_SQLfile=usgsgageList.sql
gageSQL="
\\set fname '/tmp/${gage_coverage_file}' \n

copy ( select hydrocode
FROM dh_feature
WHERE bundle = 'watershed' AND ftype = 'usgs_full_drainage'
) to :'fname';"
# turn off the expansion of the asterisk
set -f
echo -e $gageSQL > $gage_coverage_SQLfile 
cat $gage_coverage_SQLfile | psql -h dbase2 -d drupal.dh03
sftp dbase2:"/tmp/${gage_coverage_file}" "/home/cobrogan/${gage_coverage_file}"

filename="/home/cobrogan/${gage_coverage_file}"
for i in `cat $filename`; do
    echo $i
done

filename="/home/cobrogan/${gage_coverage_file}"
for i in `cat $filename`; do
  echo "Running: sbatch /opt/model/meta_model/run_model raster_met stormVol_prism \"$i\" auto geo"
  sbatch /opt/model/meta_model/run_model raster_met stormVol_prism "$i" auto geo 
done
rburghol commented 2 weeks ago

@COBrogan One thing to check is that you have run ssh-copy-id to dbase2 (like we did yesterday for deq1); otherwise I don't know if it will allow you to do that file retrieval automatically.

If that doesn't fix it for you, paste your error message up here and I'll take a look.

ilonah22 commented 2 weeks ago

@rburghol All of this is using PRISM, how would I use a different source? Just changing stormVol_prism to stormVol_daymet?

COBrogan commented 2 weeks ago

@ilonah22 Yep. Theoretically you can call stormVol_daymet and stormVol_nldas2 as the scenario. I haven't tested these yet so it would be interesting to see if they work.

COBrogan commented 2 weeks ago

@COBrogan One thing to check is that you have run ssh-copy-id to dbase2 (like we did yesterday for deq1); otherwise I don't know if it will allow you to do that file retrieval automatically.

If that doesn't fix it for you, paste your error message up here and I'll take a look.

That makes sense, but I don't have the permissions to create a home directory for myself on dbase2 (nor could I sudo to do it). Would you mind setting that up, Rob? Brendan said he could set one up on dbase1, but I don't know if that's ideal since our configs all point to dbase2.

COBrogan commented 2 weeks ago

I've kicked off a run of all our gages for this method. Got 156 gages so far. Some months have negative adjusted R^2, which is interesting. I think the adjustment has to do with sample size, so it could make sense for a really poor fit with small n (e.g. perhaps a gage that only has a few years of data during our precip data source's period of record, 1980 - present). Anyway, here's a sneak peek: image

Same image, but now the R^2 axis is limited to the range 0 to 1: image

ilonah22 commented 1 week ago

After Tuesday 8/13, it looks like 188 gages ran successfully for the prism stormsep workflow. Here's a plot of the density of the mean r-squared values for each gage and of the density of all of the r-squared values.

Rplot

R-squaredDensity_total

COBrogan commented 1 week ago

Thanks @ilonah22 . I'm working on debugging the remaining gages. Seems like minor stuff here and there. At least one gage was failing because there was an entire month of 0 cfs flow and hreg was failing on it. Fixed that, but still more bugs to track down. Thanks for the plots!

ilonah22 commented 1 week ago

@COBrogan, that sounds good. I quickly found the gages with the highest and lowest mean r-squared to plot.

Here is the highest, with no issues I can see just from looking at the plot:

MonthlyRHighest

Here is the lowest (non-zero), where you can see that there are only 4 monthly r-squared values above 0 and multiple NA values:

MonthlyRLowest LowestDataTable

If this is happening to one gage without breaking, it is probably happening to others.

durelles commented 1 week ago

Morning, great work. I'm wondering what the R2 vs. drainage area relationship looks like.

COBrogan commented 6 days ago

Thanks @ilonah22. What were the gage IDs for these two examples? And were you able to look at any of the residuals for the gage that had those high R^2 values? It might give us some good insight into how the regression is performing with large + small events across the months. It is interesting that there aren't many seasonal impacts at this particular gage.

As for the missing R^2 values, I believe this is intentional. If there is insufficient data to conduct the regression, the algorithm will just return NA for that month. This could happen if the Storm Separation routine found 1 - 3 storms for that month (most likely due to a short gage record). We can look into it further, but the negative R^2 values also indicate that sample size is likely low. We can probably confirm this by looking at the residual plots for February (best performing month) and July (worst performing month).

rburghol commented 6 days ago

@COBrogan this looks really cool to me. I think that it is probably time for us to put together a good data model for storing this data on VAHydro, so that we can easily do some analysis of summary statistics over the whole data set.

COBrogan commented 6 days ago

I think at this point all gages have been run with a few exceptions:

  1. usgs_ws_03208800. Recorded no data during the precip record of 1980 - Present and thus could not be run
  2. usgs_ws_02079500. Recorded no data during the precip record of 1980 - Present and thus could not be run
  3. usgs_ws_01636210. Recorded no data during the precip record of 1980 - Present and thus could not be run
  4. usgs_ws_01645784. Too small to run with PRISM under current methodology due to its 0.79 square mile DA
  5. usgs_ws_02050000 did not download, likely due to the high amounts of provisional data at this tidal gage
  6. usgs_ws_01658500 and 02037000 (Richmond Canal): something went wrong in stormAnalysis, resulting in 18 warnings, but the process finished
rburghol commented 5 days ago

@COBrogan - great summary. FYI I am currently doing a variation on calc_raster_ts.sh to handle resampling, and then a further variation to handle resampling plus temporal disaggregation. I want to have only a single routine, but I feel like the complexity of the differences between the 3 queries warrants separate ones. Currently tracking this in #72, though the complexity of the op probably warrants its own issue.

ilonah22 commented 20 hours ago

@COBrogan , The highest R-squared gage was 01652500 and the lowest non-zero was 01671025.

COBrogan commented 14 hours ago

@ilonah22 I've kicked off a daymet batch run. We'll see tomorrow how many run successfully, but I definitely see the ratings files being generated so it's working for at least most of the gages!