COBrogan opened 2 months ago
Current workflow:

1. Get USGS data:

   ```sh
   Rscript functions/usgsdata.R '02037500' "$HOME/usgsGageCMD.csv"
   ```

2. Run stormSep_cmd.R:

   ```sh
   Rscript ../../meta_model/scripts/met/stormSep_cmd.R "${HOME}/usgsGageCMD.csv" 'TRUE' 'Month' "${HOME}/Gage02037500_StormflowData.csv"
   ```

3. Run the analysis on the storms and output stats:

   ```sh
   Rscript ../../meta_model/scripts/met/stormAnalysis_cmd.R "${HOME}/Gage02037500_StormflowData.csv" "${HOME}/Gage02037500_StormStats.csv"
   ```

4. Get linear regressions:

   ```sh
   Rscript ../../meta_model/scripts/met/stormEventsLM_cmd.R "${HOME}/usgs_ws_02037500-PRISM-all.csv" "${HOME}/Gage02037500_StormStats.csv" "${HOME}/Gage02037500_StormflowData.csv" 1 "${HOME}/modelJSON.json" "${HOME}/stormsepLMResults.csv"
   ```

5. Run plots:

   ```sh
   Rscript ../../meta_model/scripts/met/plotStormEvent_cmd.R "${HOME}/usgs_ws_02037500-PRISM-all.csv" "${HOME}/Gage02037500_StormflowData.csv" "${HOME}/Gage02037500_StormStats.csv" "${HOME}/stormPlotsNew" "02037500" "PRISM"
   ```
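Since every file name above embeds the gage ID, the sequence could be wrapped so the ID is typed exactly once. The sketch below is hypothetical (the helper name is made up; script paths are copied from the commands above): it prints the command sequence for a given gage, which you could pipe to `sh` to execute.

```shell
#!/bin/sh
# Hypothetical helper: print the stormSep command sequence for one gage,
# deriving every filename from a single ID so it cannot drift between steps
# (avoiding typos like Gage02037500 vs Gage2037500 when paths are typed by hand).
storm_workflow_cmds() {
  gage="$1"
  flow="$HOME/Gage${gage}_StormflowData.csv"
  stats="$HOME/Gage${gage}_StormStats.csv"
  precip="$HOME/usgs_ws_${gage}-PRISM-all.csv"
  echo "Rscript functions/usgsdata.R '$gage' '$HOME/usgsGageCMD.csv'"
  echo "Rscript ../../meta_model/scripts/met/stormSep_cmd.R '$HOME/usgsGageCMD.csv' 'TRUE' 'Month' '$flow'"
  echo "Rscript ../../meta_model/scripts/met/stormAnalysis_cmd.R '$flow' '$stats'"
  echo "Rscript ../../meta_model/scripts/met/stormEventsLM_cmd.R '$precip' '$stats' '$flow' 1 '$HOME/modelJSON.json' '$HOME/stormsepLMResults.csv'"
  echo "Rscript ../../meta_model/scripts/met/plotStormEvent_cmd.R '$precip' '$flow' '$stats' '$HOME/stormPlotsNew' '$gage' 'PRISM'"
}

# Preview the five commands for the gage used above
storm_workflow_cmds 02037500
```

Running `storm_workflow_cmds 02037500 | sh` would then execute the steps in order.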
FYI @ilonah22 I've merged this branch (`geo-stormSep`) into `main`. I will be working on getting the meta model running tomorrow, hopefully.
@rburghol How do I specify which `GEO_MET_MODEL` I'm running? I changed the following to set `GEO_MET_MODEL` to `storm_volume`:

```sh
nano /opt/model/p6/cbp_wsm/config/control/met/PRISM.con
```

However, when I run the following process command, an `echo` check confirms that `GEO_MET_MODEL` is still set to `simple_lm`. Am I missing a config file somewhere?

```sh
/opt/model/meta_model/run_model raster_met PRISM usgs_ws_02021500 auto geo process 05
```
@COBrogan looks like you set it up correctly to me -- you may want to verify the `cbp get_config...` command that is run in `geo.config` at the command line -- also make sure you run from the `p6/vadeq` directory. Feel free to edit the `geo.config` file to debug, and maybe also grep the files to see if I hard-coded that param by accident somewhere, like in `raster.config`.
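One hedged way to do that grep sweep: a recursive `grep -rn` over the config tree surfaces every assignment site at once. The snippet below demonstrates the idea on a throwaway directory (file names and contents here are illustrative, not the real configs); in practice you would point the same grep at `geo.config`, `raster.config`, and the `.con` files.

```shell
#!/bin/sh
# Demo of grepping for a hard-coded GEO_MET_MODEL, using a scratch tree;
# the file names and contents are illustrative stand-ins for the real configs.
dir=$(mktemp -d)
printf 'GEO_MET_MODEL="simple_lm"   # suspect hard-coded default\n' > "$dir/raster.config"
printf 'GEO_MET_MODEL="storm_volume"\n' > "$dir/stormVol_prism.con"

# -r recurse, -n line numbers: every place the variable is set becomes visible
grep -rn 'GEO_MET_MODEL' "$dir"
hits=$(grep -rl 'GEO_MET_MODEL' "$dir" | wc -l)
rm -rf "$dir"
```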
Oh!! @COBrogan I just saw you set the config in `cbp_wsm/config`. That setting should be applied in `vadeq/config/...` -- it should all work then. If I indicated `cbp_wsm` in some of the instructions, that was an error on my part -- apologies.
@rburghol The `storm_volume` workflow seems to be working! I ran it with gage 02021500 and successfully generated the monthly rating file for PRISM. I haven't tested the plotting yet, but I want to work on that a little more locally first. However, I did encounter a few permissions errors that I wanted to note here:
- I updated permissions on a few scripts (e.g. `analyze/02_model_storm`) to ensure they had execute privilege for all users
- `analyze/02_model_lm_simple` and `analyze/02_model_storm` don't work for me unless I `sudo` as you. For instance:

```
install -D /tmp/usgs_ws_02021500_1722430958_12797/usgs_ws_02021500-PRISM-storm_volume-model.json /media/model/met/PRISM/stats/
install: cannot create regular file '/media/model/met/PRISM/stats/usgs_ws_02021500-PRISM-storm_volume-model.json': Permission denied
```
This command is a product of the last line of the `analyze/02` steps, which are:

```sh
install -D $json_file "$MODEL_JSON"
install -D $rate_file "$RATING_FILE"
```
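For context, `install -D` (GNU coreutils) copies the file and creates any missing parent directories, but it still requires write permission on the destination directory itself, which is why those lines only succeed under `sudo` here. A minimal sketch of the behavior in a scratch directory:

```shell
#!/bin/sh
# `install -D` creates missing parent dirs, then copies; it fails with
# "Permission denied" exactly when the destination directory is not writable.
work=$(mktemp -d)
echo '{"model":"storm_volume"}' > "$work/model.json"

# Succeeds here because $work is writable; /media/model/met/PRISM/stats was not
install -D "$work/model.json" "$work/stats/deep/model.json" && ok=1
ls -l "$work/stats/deep/model.json"
rm -rf "$work"
```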
I've been running into a few issues while trying to run the storm workflow on a different gage, but this is the only one I have gotten really stuck on so far, while running stormEventsLM_cmd.R:

```
Error in (dateIndex - rollingDur - stormDuration + 2):dateIndex :
  argument of length 0
Calls: mapply -> <Anonymous>
```

I have been making a few minor changes, but nothing seems to be working, so I was wondering if you had a similar issue, or if this error means I messed something up in one of the earlier steps.
Hi @ilonah22. If you pull the `main` branch of the meta model repository, I suspect your error will be resolved. Just like we had trouble with the dates yesterday, I had some trouble today. This time I had to remove storms that occur after the precip record has ended. So, I had to add the following:

```r
# Remove any storms that occur after the precip record ends
stormStats <- stormStats[stormStats$endDate <= max(comp_data$obs_date), ]
stormSepDF <- stormSepDF[stormSepDF$timestamp <= max(comp_data$obs_date), ]
```

I also have gotten everything up and running in the meta model. We should now be able to use the model on the server to run these, which will make it much easier to run multiple scenarios (no need to enter file paths, copy and paste files, etc.). Do you have some availability this afternoon to go over that? Maybe around 1:30? We could get you set up in the model so you can run gages easily.
@rburghol After some testing, it looks like @ilonah22 also doesn't have write permissions to `/media/model/met/`. She too was getting "Permission denied" errors when trying to run the following (which worked fine for me now). If you have some availability, I'd appreciate the help in getting her set up. And I'm happy to hop on a call to learn more about the group permissions:

```sh
/opt/model/meta_model/run_model raster_met stormVol_prism usgs_ws_01634500 auto geo download
```

I tried `sudo chmod -R 775 /media/model/met/stormVol_prism` but was told that this "Operation not permitted". I think I'm missing something. Ilonah is already a part of the harp group, so she would have access to this folder as soon as the right (write and execute) permissions are set:

```
ls -l /media/model/met/
total 4
lrwxrwxrwx 1 rob      harpmodelers   25 Jul 26 13:35 PRISM -> /backup/meteorology/PRISM
lrwxrwxrwx 1 rob      harpmodelers   26 Jul 26 13:36 daymet -> /backup/meteorology/daymet
lrwxrwxrwx 1 rob      harpmodelers   26 Jul 26 13:35 nldas2 -> /backup/meteorology/nldas2
drwxr-sr-x 6 cobrogan harpmodelers 4096 Jul 31 16:57 stormVol_prism
```
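One note on the "Operation not permitted": `chmod` only succeeds for a file's owner (or root), so a recursive `-R` fails as soon as it hits anything owned by someone else. Assuming the goal is group write for `harpmodelers` while keeping the setgid bit already shown in the listing (`drwxr-sr-x`), the owner or root could use mode `2775`. A sketch on a scratch directory, not the real path:

```shell
#!/bin/sh
# Sketch: 2775 = rwxrwsr-x -- group write plus the setgid bit, so files
# created inside keep inheriting the directory's group (e.g. harpmodelers).
dir="$(mktemp -d)/stormVol_prism"
mkdir -p "$dir"
chmod 2775 "$dir"
mode=$(stat -c '%a' "$dir")   # GNU stat: octal mode including special bits
echo "$mode"
```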
Here are some of my visual analyses of the ratings from 10 gages produced by the `stormVol_prism` workflow this week:
^ Each of the different colored lines represents one of the 10 gages I used; the red boxes show places where I thought I might see a trend.
After seeing the results of these analyses, my next planned steps are:
@ilonah22 I really like what I'm seeing. These are effective ways to compare multiple time series, and also to highlight the areas that you think we should pay more attention to. I would be interested in a similar thing using different rainfall data sets. With that in mind, what rainfall data sets were you using for these?
FYI, working on a multi-gage model run, but having issues with the `sftp` call in `sbatch`:

```sh
gage_coverage_file="usgsgageList.csv"
gage_coverage_SQLfile=usgsgageList.sql
gageSQL="
\\set fname '/tmp/${gage_coverage_file}' \n
copy ( select hydrocode
  FROM dh_feature
  WHERE bundle = 'watershed' AND ftype = 'usgs_full_drainage'
) to :'fname';"
# turn off the expansion of the asterisk
set -f
echo -e $gageSQL > $gage_coverage_SQLfile
cat $gage_coverage_SQLfile | psql -h dbase2 -d drupal.dh03
sftp dbase2:"/tmp/${gage_coverage_file}" "/home/cobrogan/${gage_coverage_file}"

filename="/home/cobrogan/${gage_coverage_file}"
for i in `cat $filename`; do
  echo $i
done

filename="/home/cobrogan/${gage_coverage_file}"
for i in `cat $filename`; do
  echo "Running: sbatch /opt/model/meta_model/run_model raster_met stormVol_prism \"$i\" auto geo"
  sbatch /opt/model/meta_model/run_model raster_met stormVol_prism "$i" auto geo
done
```
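A side note on the `set -f` in that script: it matters because `$gageSQL` is unquoted in the `echo -e $gageSQL` line, so any glob character in the SQL (e.g. a `select *`) would be expanded against the files in the current directory. A minimal demonstration of the pitfall, in a scratch directory:

```shell
#!/bin/sh
# Why `set -f` (or quoting) is needed before echoing unquoted SQL:
# an unquoted variable is word-split and then glob-expanded by the shell.
cd "$(mktemp -d)"
touch a.csv b.csv
sql='select * from dh_feature'

set +f
unquoted=$(echo $sql)    # the * matches a.csv and b.csv
set -f
literal=$(echo $sql)     # globbing disabled; the * survives

echo "$unquoted"
echo "$literal"
```

Quoting the variable (`echo -e "$gageSQL"`) avoids the problem entirely, without toggling `set -f`.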
@COBrogan One thing to check is that you have done `ssh-copy-id` to dbase2 (like we did yesterday for deq1), otherwise I don't know if it will allow you to do that file retrieval automatically. If that doesn't fix it for you, paste your error message up here and I'll take a look.
@rburghol All of this is using PRISM, how would I use a different source? Just changing stormVol_prism to stormVol_daymet?
@ilonah22 Yep. Theoretically you can call stormVol_daymet and stormVol_nldas2 as the scenario. I haven't tested these yet so it would be interesting to see if they work.
> @COBrogan One thing to check is that you have done `ssh-copy-id` to dbase2 (like we did yesterday for deq1), otherwise I don't know if it will allow you to do that file retrieval automatically. If that doesn't fix it for you, paste your error message up here and I'll take a look.
That makes sense, but I don't have the permissions to create a home directory for myself on dbase2 (nor could I `sudo` to do it). Would you mind setting that up, Rob? Brendan said he could set one up on dbase1, but I don't know if that's ideal since our configs all point to dbase2.
I've kicked off a run of all our gages for this method. Got 156 gages so far. Some months have negative adjusted R^2, which is interesting. I think the adjustment has to do with sample size, so it could make sense for a really poor fit with small n (e.g. perhaps a gage that only has a few years of data during our precip data source time period, 1980 - present). Anyways, here's a sneak peek:
Same image, but now the R^2 axis is limited to the range 0 to 1:
After Tuesday 8/13, it looks like 188 gages ran successfully for the prism stormsep workflow. Here's a plot of the density of the mean r-squared values for each gage and of the density of all of the r-squared values.
Thanks @ilonah22. I'm working on debugging the remaining gages. Seems like minor stuff here and there. At least one gage was failing because there was an entire month of 0 cfs flow and `hreg` was failing on it. Fixed that, but still more bugs to track down. Thanks for the plots!
@COBrogan, that sounds good. I quickly found the gages with the highest and lowest mean r-squared to plot.
Here is the highest, with no issues I can see just from looking at the plot:
Here is the lowest (non-zero), where you can see that there are only 4 r-squared monthly values above 0 and multiple NA values:
If this is happening to one gage without breaking, that probably means it is happening to others.
Morning, great work. I'm wondering what the R^2 metric vs. drainage area relationship looks like.
Thanks @ilonah22. What were the gage IDs for these two examples? And were you able to look at any of the residuals for the gage that had those high R^2 values? It might give us some good insight into how the regression is performing with large + small events across the months. It is interesting that there aren't many seasonal impacts at this particular gage.
As for the missing R^2 values, I believe this is intentional. If there is insufficient data to conduct the regression, the algorithm will just return NA for that month. This could happen if the storm separation routine found 1 - 3 storms for that month (most likely due to a short gage record). We can look into it further, but the negative R^2 values also indicate that sample size is likely low. We can probably confirm this by looking at the residual plots for February (best performing month) and July (worst performing month).
@COBrogan this looks really cool to me. I think it is probably time for us to put together a good data model for storing this data on VAHydro, so that we can easily do some analysis of summary statistics over the whole data set.
I think at this point all gages have been run with a few exceptions:
@COBrogan - great summary. FYI I am currently doing a variation on calc_raster_ts.sh to handle resampling, and then a further variation to handle resampling plus temporal disaggregation. I want to have only a single routine, but I feel like the complexity of the differences between the 3 queries warrants it. Currently tracking this in #72, though the complexity of the op probably warrants its own issue.
@COBrogan , The highest R-squared gage was 01652500 and the lowest non-zero was 01671025.
@ilonah22 I've kicked off a daymet batch run. We'll see tomorrow how many run successfully, but I definitely see the ratings files being generated so it's working for at least most of the gages!
If we look at precipitation and stormflow data on an individual scale, it appears that the total volume of precipitation may be related to the volume in the storm hydrograph. This takes a slightly different approach than correlating weekly precip and flow, because now we are homing in on individual storm events and looking more specifically at how precipitation increases flow:
[X] Develop a methodology to separate out storm hydrographs to isolate individual storm events. See stormSep_USGS
[X] Calculate storm hydrograph volume for each event. This is accomplished in stormSep_USGS.R
[ ] Analyze relationship between hydrograph volume and precipitation volume. See below for Culpeper gage 01665500. Looking at monthly precip and monthly storm hydrographs (lumping storms into months based solely on their start date; storms that extend beyond the month are still included in their start month), we get pretty strong correlations in some of our winter months. October in particular has a very good correlation, close to 0.75. Running a power regression slightly worsens the R^2 values, but improves the residual error analysis. Current thoughts are that we may see good relationships with this, but we may run into some travel time issues as we move downstream. This should only cause issues for systems that have more than 1 day of travel time, which are likely to be relatively large systems (greater than ~200 square miles, based on super rough math of a circular watershed with a radius of 16 miles, assuming a travel time of 1 day at 1 ft/s). Visualizations will help us get a feel for how affected our results are by travel time as we expand this routine to more gages.
[X] Determine means to analyze which dataset performs the best - Following #61, we can select the "best" dataset as the one that performs the best for that month. So, we can group our events up by month across the record period, run the regression for each month, and determine performance.
[ ] Find means to determine missing precipitation events (e.g. low precip volume) for corresponding hydrographs
[x] Plot each storm hydrograph with precipitation and separate into its own steps
[x] Analysis steps cannot depend on a list of storm events. Because these steps have been divided out, they must instead rely on a data.frame that has storm IDs built into it and the stormStats must have the corresponding ID
Important R Scripts: stormSeparate_USGS.R, plotStorm.R, plotStormSep_cmd.R, stormAnalysis_cmd.R, stormEventsLM_cmd.R, stormSep_cmd.R
In the scenario con file, e.g. `p6/cbp_wsm/config/control/met/stormVol_prism.con`, `GEO_MET_MODEL` must be `storm_volume`.
Several important variables are defined in `geo.config`:

Several important variables are defined in the scenario config, e.g. `stormVol_prism.con`:
:Example Model Run No Plots
Example Model Run With Plots
- `05_stormSeparate` = stormSep_cmd.R
  - Inputs: `$COVERAGE_FLOW_FILE`, `$SEARCH_ALL_STORM`, `$BASELINE_FLOW_METHOD`
  - Outputs: `$STORM_EVENT_FLOW_FILE`
- `06_stormStatistics` = stormAnalysis_cmd.R
  - Inputs: `$STORM_EVENT_FLOW_FILE`
  - Outputs: `$STORM_EVENT_STATS_FILE`
- `02_model_storm` (was `02_stormVolumeRegression`) calls stormEventsLM_cmd.R, which produces the stats (`$MODEL_STATS`) and json output (`$MODEL_JSON`) from the mon_lm function
  - Inputs: `$DAILY_PRECIP_FILE`, `$STORM_EVENT_STATS_FILE`, `$STORM_EVENT_FLOW_FILE`, `$STORM_INCLUDE_DURATION`, `$STORMSEP_REGRESSION_METHOD`
- `05_plot_storm`: runs if `$STORMSEP_PLOT` in config is set to TRUE
  - Inputs: `$STORM_EVENT_FLOW_FILE`, `$STORM_EVENT_STATS_FILE`, `$STORM_EVENT_PLOT_DIR`, `$USGS_GAGE`
- `06_plot_stormVolume`: runs if `$STORMSEP_PLOT` in config is set to TRUE
  - Inputs: `$DAILY_PRECIP_FILE`, `$STORM_EVENT_FLOW_FILE`, `$STORM_EVENT_STATS_FILE`, `$STORM_EVENT_PRECIP_PLOT_DIR`, `$USGS_GAGE`