Landuse + CBP flow widget

rburghol commented 5 years ago

Overall Workflow for Openmi-OM component

Generate shell script to loop through all landsegment outflow WDMs for all land uses in a segment, see WDM Formatting below
take input from CBP runoff/iflow/gwflow WDM exports (see below WDM Data Formatting)
multiply by landuse grid (rows match csv columns)
- Obtain from VAHydro (TBD - for now, play with
sum all runoff
produce text file with date and runoff total and gw iflow totals, and total Qout, and Runit
make sure file format is compatible with OM runtime timeseries file used in cache mode
save file name with OM runfile convention with elementid and runid
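A minimal shell/awk sketch of the multiply-and-sum step in the workflow above. It assumes a hypothetical two-column areas file (`luname,area`, no header) and a unit-runoff CSV whose first column is the time and whose remaining columns are named `<luname>_<component>`. The function name, file layout, and output column name are illustrative only, and no unit conversion to cfs is attempted.

```shell
# Hypothetical helper: weight each unit-runoff column by its land use area
# and sum the products for every timestep.
# $1 = areas CSV ("luname,area", no header); $2 = unit-runoff CSV (time column first).
weighted_runoff() {
  awk -F',' -v OFS=',' '
    NR == FNR { area[$1] = $2; next }   # first file: remember the area of each land use
    FNR == 1 {                          # header of second file: map each column to a land use
      for (i = 2; i <= NF; i++) { split($i, parts, "_"); lu[i] = parts[1] }
      print $1, "area_weighted_runoff"
      next
    }
    {
      total = 0
      for (i = 2; i <= NF; i++) total += $i * area[lu[i]]   # unknown land uses count as area 0
      print $1, total
    }
  ' "$1" "$2"
}

# Small demo with made-up numbers (not model output).
demo_dir="$(mktemp -d)"
printf 'afo,2\nfor,4\n' > "$demo_dir/areas.csv"
printf 'timestamp,afo_suro,for_suro,for_agwo\n1/1/1984 1:00,0.5,0.25,0.75\n' > "$demo_dir/unit.csv"
weighted_runoff "$demo_dir/areas.csv" "$demo_dir/unit.csv"
rm -rf "$demo_dir"
```

Keeping the areas in a separate file means the same unit-runoff export could be reused with land use tables from different years.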

WDM Data Formatting

This will be a preliminary step in this evaluation: exporting the data to use in your widget. For the short term, we will just export the data for one land segment, for a river segment of interest.

Options:
- Write in shell script?
- Write in R, calling "system" command to execute quick_wdm - https://stat.ethz.ch/R-manual/R-devel/library/base/html/system.html
- How do we know what land uses to parse?
- Maybe obtain a directory listing? (ex: http://deq2.bse.vt.edu/p6/p6_gb604/tmp/wdm/land/ )
- Parse a CBP config file?
- Use a pre-configured list (my least favorite alternative)
Locate CBP runoff/iflow/gwflow WDMs - these WDM files contain "unit area" runoff values
- Example:
- river segment: OR2_8130_7900
- land segment: A51121
- wdm: p532c-sova/wdm/land/for/p532cal_062211/forA51121.wdm
- uci: p532c-sova/uci/land/for/p532cal_062211/forA51121.uci
- DSN: 111 (SURO, aka surface runoff)
Export Data from WDM (will produce 3 files for each land use: IFWO, SURO, AGWO). Use quick_wdm_2_txt_hour_2_hour to convert WDMs to CSVs. Steps are as follows:
- ssh into deq2.bse.vt.edu
- cd /opt/model/p53/p532c-sova/tmp/wdm/land/(3 LETTER LAND USE)/p532cal_062211
- /opt/model/p53/p532c-sova/code/bin/quick_wdm_2_txt_hour_2_hour
- input wdm name, start year, end year, dsn
- ex: afoA51121.wdm, 1984, 2005, 111
- navigate to location on http://deq2.bse.vt.edu/p532c-sova/ in web browser, download -- EDIT: this step is obsolete -- the script can pull the wdms straight from the URL, no need to download files.
Reformat WDM files into best format (we may need to experiment, the first option is easy and simple):
- 1 File per Land Segment, 1 column for each land use with summed IFWO, SURO and AGWO
- timestamp: the timestamp of the model output
- afo
- for
- ...
- Try 1 file per landuse and land segment, with 4 columns
- timestamp: the timestamp of the model output
- suro: surface runoff -- DSN 111
- ifwo: interflow -- DSN 211
- agwo: "active" groundwater -- DSN 411
Model Output file: a csv output
- Name: runlog[runid].[vahydro om elementid].log -- example: runlog2.256687.log
- Columns (at minimum, you may include more than this, and we may later REQUIRE more than this)
- timestamp
- Qout
- area_sqmi
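The export step described above needs one quick_wdm_2_txt_hour_2_hour run per land use and DSN. As a rough sketch, the helper below only prints the input line for each run (a dry run); the function name is hypothetical, and the comma-separated input order (wdm name, start year, end year, dsn) is taken from the example in the steps above.

```shell
# Hypothetical dry-run helper: print one quick_wdm input line per land use and DSN.
# Usage: wdm_export_inputs <landseg> <start_year> <end_year> <landuse>...
wdm_export_inputs() {
  local landseg="$1" start_year="$2" end_year="$3"
  shift 3
  local lu dsn
  for lu in "$@"; do
    for dsn in 111 211 411; do   # SURO, IFWO, AGWO
      printf '%s%s.wdm,%s,%s,%s\n' "$lu" "$landseg" "$start_year" "$end_year" "$dsn"
    done
  done
}

wdm_export_inputs A51121 1984 2005 afo for
```

Each printed line corresponds to what would be typed at the quick_wdm prompt from inside the matching land use directory.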

Example Land Use Table

This is a Phase 5.3 table example, so some names may have changed in 5.3.2.

luname	1984	1987	1992	1997	2002	2005
afo	0.91	0.86	0.86	0.79	0.66	0.66
alf	9.83	10.27	11.29	9.13	11.91	7.83
bar	0.80	0.80	0.81	0.84	0.87	0.87
css	0.00	0.00	0.00	0.00	0.00	0.00
ext	25.40	25.40	25.40	25.40	25.40	25.40
for	1187.98	1190.02	1184.11	1197.78	1203.24	1216.05
hom	1.26	1.17	1.72	1.56	2.06	1.90
hvf	12.00	12.02	11.96	12.10	12.15	12.28
hwm	8.79	6.85	5.94	2.28	3.84	3.86
hyo	8.61	11.09	7.19	9.54	7.62	6.91
hyw	31.24	33.18	33.08	36.18	38.81	39.90
imh	12.30	12.30	12.30	12.80	13.20	13.30
iml	28.80	28.80	28.80	28.80	28.80	28.80
lwm	18.29	16.39	17.83	14.26	11.32	11.39
nal	0.00	0.00	0.00	0.00	0.00	0.00
nhi	0.00	0.00	0.00	0.00	0.00	0.00
nho	0.00	0.00	0.00	0.00	0.00	0.00
nhy	0.00	0.00	0.00	0.00	0.00	0.00
nlo	0.00	0.00	0.00	0.00	0.00	0.00
npa	0.00	0.00	0.00	0.00	0.00	0.00
pas	130.87	128.08	135.54	125.50	117.21	108.50
puh	11.40	11.40	11.40	11.80	12.10	12.00
pul	144.00	144.00	144.00	144.00	144.00	144.00
trp	6.89	6.74	7.13	6.61	6.17	5.71
urs	0.00	0.00	0.00	0.01	0.00	0.00
wat	4.00	4.00	4.00	4.00	4.00	4.00

Table 2: Sample export from http://deq2.bse.vt.edu/p532c-sova/wdm/land/p532cal_062211_A51121_eos_all.csv

ix	Year	Month	Day	Hour	afo_0111	alf_0411	ccn_0411	cex_0411	cfo_0111
1	1984	1	1	1	0.000171	1.98E-05	1.98E-05	1.98E-05	0.000171
2	1984	1	1	2	0.0001	1.98E-05	1.98E-05	1.98E-05	0.0001
3	1984	1	1	3	6.61E-05	1.98E-05	1.98E-05	1.98E-05	6.61E-05
4	1984	1	1	4	4.67E-05	1.98E-05	1.98E-05	1.98E-05	4.67E-05
5	1984	1	1	5	3.47E-05	1.98E-05	1.98E-05	1.98E-05	3.47E-05
6	1984	1	1	6	2.68E-05	1.98E-05	1.98E-05	1.98E-05	2.68E-05
7	1984	1	1	7	0.000179	1.98E-05	1.98E-05	1.98E-05	0.000179
8	1984	1	1	8	0	1.98E-05	1.98E-05	1.98E-05	0
9	1984	1	1	9	0	1.98E-05	1.98E-05	1.98E-05	0
10	1984	1	1	10	0	1.98E-05	1.98E-05	1.98E-05	0
11	1984	1	1	11	0	1.98E-05	1.98E-05	1.98E-05	0
12	1984	1	1	12	0	1.98E-05	1.98E-05	1.98E-05	0
13	1984	1	1	13	0	1.98E-05	1.98E-05	1.98E-05	0

rburghol commented 5 years ago

The code for this seems to work nicely @hdaniel7 . The basic script, "CBP_flow_widget.R", does the following (this is what I think, from testing it):

Downloads all the wdm files that have already been parsed as csv.
Merges these into a single large data set stored in memory, under a variable name in the form '[landsegment][luname]_ALL'.
Multiplies those by a landuse array that you imported from a file.
Writes a file with the resulting edge of stream input flows.

That seems to work well, and I'm impressed by how rapidly it works given how damn much data is in memory! The next steps are needed to make this more modular and to save time (no multiple downloads):

Create a single merged CSV, named "[cbp_scenario][landsegment]_eos_all" with [luname_111], [luname_211], [luname_411] for all (90 columns!). We need to test and see if this will be reasonable to work with; if so, it is the easiest for us to keep track of, i.e., 1 single file for each land segment and scenario. An example would be "p532cal_062211_A51121_eos_all.csv".
We will use the landuse * Runit code later in a separate script -- so don't toss it! But we will use it with the new eos_all file.
Put all future code in a directory called "R/utils" to keep things clean -- I created R/utils and moved CBP_flow_widget.R there already.
Add a script to export the single land use bash that you ran to the new directory "sh" (I just added this to the master and pushed it).

hdaniel7 commented 5 years ago

Created and pushed land_use_eol_all.R -- this script generates a single .csv file named "[cbp_scenario][landsegment]_eos_all" with [luname_111], [luname_211], [luname_411] for all. The generated .csv seems to be reasonably easy to work with -- however, it is approximately 150 MB and is too large to upload to GitHub.

As follows is visual documentation of the script's functions: land_use_eos_all

It does not currently export this generated .csv to a new directory called "sh", it simply exports it to "utils" -- I did a fresh pull from the master branch but could not find the "sh" directory. @rburghol should this "sh" directory be located at "utils/sh"? And any ideas on next steps?

rburghol commented 5 years ago

Any ideas on next steps?

Per the meeting today - stash that sh code in the cbp6 project.

hdaniel7 commented 5 years ago

On previous read-throughs I thought you wanted me to export the full land use .csv file to cbp6/utils/sh -- upon re-examination, I'm pretty confident that you're asking me to write a shell script to generate the individual land use .csvs on the deq2 machine, and to then stash that script in cbp6/utils/sh -- correct? I can try to do this but I have never written a shell script before -- I can do some research into it and try to knock it out, if this script is what you want.

rburghol commented 5 years ago

Sorry - what I meant by:

Add a script to export the single land use bash that you ran to the new directory "sh" (I just added this to the master and pushed it).

Was to "export the runoff data for a single land use...", using quick_wdm_2_txt_hour_2_hour . I was assuming that you didn't manually export each Land Segment runoff WDM, rather, you wrote a shell script to do so (albeit a simplistic one). So, that makes me ask:

How did you generate the land segment runoff files that you parsed in "CBP_flow_widget.R"??

hdaniel7 commented 5 years ago

I actually did just run quick_wdm_2_txt_hour_2_hour enough times to generate each land use .csv for one land segment, and have been using this land segment in all testing of the code so far -- none of the other land segments have any .csv files generated at all! I spent the last half hour or so looking into shell scripts a bit -- definitely would have been a more effective method, it seems.

rburghol commented 5 years ago

Just catching up on this -- yeah, shell would be great! I AM concerned about the file sizes that you indicate though... gonna do some thinking there. It might be prudent for us to add a zip and unzip to the steps. Can you give me the full file name and directory path of the one CSV that you generated as a test?

hdaniel7 commented 5 years ago

fn_land_use_eos_all.R has been updated in the openmi branch (not yet in cbp6, as it lacks roxygen headers and other aspects required for incorporation into the cbp6 package, at the moment). The unlabeled "observation number" row at the start of the output file has been removed, the year/month/day/hour columns have been combined to form a "timestamp" column, and the individual land use columns have been changed from the format "afo_0111" to "afo_suro". An example capture of the table is as follows:

timestamp	afo_suro	alf_agwo	ccn_agwo
1/1/1984 1:00	0.000171	1.98E-05	1.98E-05
1/1/1984 2:00	0.0001	1.98E-05	1.98E-05
1/1/1984 3:00	6.61E-05	1.98E-05	1.98E-05
1/1/1984 4:00	4.67E-05	1.98E-05	1.98E-05
1/1/1984 5:00	3.47E-05	1.98E-05	1.98E-05
1/1/1984 6:00	2.68E-05	1.98E-05	1.98E-05
1/1/1984 7:00	0.000179	1.98E-05	1.98E-05
1/1/1984 8:00	0	1.98E-05	1.98E-05
1/1/1984 9:00	0	1.98E-05	1.98E-05

The full table can be found at http://deq2.bse.vt.edu/p532c-sova/wdm/land/, until a better location to store it is determined.
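The column changes described in this comment can be sketched with awk. This is not the R function itself, just an illustration under stated assumptions: the input is comma-separated with the columns ix, Year, Month, Day, Hour followed by `<luname>_<dsn>` columns, and the hypothetical function `reformat_eos` reads it on standard input. The suffix mapping (111 to suro, 211 to ifwo, 411 to agwo) follows the DSN list earlier in the thread.

```shell
# Hypothetical helper: drop the observation number, merge Year/Month/Day/Hour
# into one timestamp column, and rename DSN suffixes to component names.
reformat_eos() {
  awk -F',' -v OFS=',' '
    NR == 1 {
      printf "timestamp"
      for (i = 6; i <= NF; i++) {
        name = $i
        sub(/_0?111$/, "_suro", name)
        sub(/_0?211$/, "_ifwo", name)
        sub(/_0?411$/, "_agwo", name)
        printf "%s%s", OFS, name
      }
      printf "\n"
      next
    }
    {
      # Month/Day/Year Hour:00, matching the example table
      printf "%d/%d/%d %d:00", $3, $4, $2, $5
      for (i = 6; i <= NF; i++) printf "%s%s", OFS, $i
      printf "\n"
    }
  '
}

printf 'ix,Year,Month,Day,Hour,afo_0111,alf_0411\n1,1984,1,1,1,0.000171,1.98E-05\n' | reformat_eos
```

Values are passed through as text, so scientific notation such as 1.98E-05 is left untouched.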

That being said, the function still receives the list of land uses hardcoded through the line " land.use.list <- c('afo','alf','ccn','cex','cfo','cid','cpd','for','hom','hvf','hwm','hyo','hyw','lwm','nal','nex','nhi','nho','nhy','nid','nlo','npa','npd','pas','rcn','rex','rid','rpd','trp','urs')" -- these land uses are only accurate for phase 5 of the model. I will begin looking into dynamically reading in the land use input list by looking at which directories are within the "/wdm/land" directory of the inputted scenario.

hdaniel7 commented 5 years ago

The land use list is now generated by reading the list of directories within the "/tmp/wdm/land" directory of the inputted model. The list.dirs() function I used only lists directories, and not files, so the output files from this script can still be stored in the "/tmp/wdm/land" directory, if desired.
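For comparison, the same directories-only listing can be sketched in shell. The function name is hypothetical and the demo directory is made up; on deq2 the argument would be the scenario's land WDM directory.

```shell
# Hypothetical helper: list only the sub-directories of a land WDM directory,
# so files stored alongside them are ignored.
list_land_uses() {
  local land_dir="$1" d
  for d in "$land_dir"/*/; do
    [ -d "$d" ] || continue   # skip the unexpanded pattern when there are no sub-directories
    basename "$d"
  done
}

# Demo on a throwaway directory: two land use directories plus one output file.
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/afo" "$demo_dir/for"
touch "$demo_dir/A51121_eos_all.csv"
list_land_uses "$demo_dir"
rm -rf "$demo_dir"
```

Because the glob ends in a slash, only directories match, which mirrors the behavior described for list.dirs().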

rburghol commented 5 years ago

Good stuff!!!

rburghol commented 5 years ago

@hdaniel7 -- this file looks good. I had to change the header for the time column to "thisdate" instead of "timestamp". There was an undocumented "feature" that if you use "timestamp" it assumes it is already in proper unix integer format, but if you use "thisdate" it will translate it from a string. I think the string is just fine, and perhaps preferable for our purposes. Sorry I gave the wrong column name.

I still have to work some bugs out with using this in the model, but we are looking good!

rburghol commented 5 years ago

Model object loads the file, adds that file to a database table, and caches that table permanently (or until we update the file or ask it to be removed). The routine that finds the runoff values and multiplies it against the land use area is not yet functioning. Not sure if the cached table part is a problem or something else. Will sort it out tomorrow.

rburghol commented 5 years ago

Need to test:

review parse routine of file with shortened version -- why are timestamps being truncated? Are we interpreting the "thisdate" field as a date only? Maybe pass to dh handletimestamp routine?
Load object
init()
add a timer
call getCurrentDataSlice()
debug?
document all methods for timeseries object base class and timeseriesfile, and cbprunofffile class

hdaniel7 commented 5 years ago

Sounds good! I can change the timestamp column name to "thisdate" and push the changes. That being said, should I begin work on the routine that finds runoff values and multiplies them against the land use area? Or is there any of the testing line items that I should focus on?

rburghol commented 5 years ago

Let's make the timestamp to thisdate change, and then focus on:

figuring out a good directory structure to store this data, for example maybe p6/p6_gb604/out/p532cal_062211/A51001_eos_all.csv
Developing a script to iterate through all the landsegs in the P6 model and doing this export.

I will keep you posted as I work out the rest of this modeling widget.

hdaniel7 commented 5 years ago

Alright -- the timestamp to thisdate change has been made.

All 3 of us analysts will be working part time from home next week -- I'll spend my time today helping them finish up their scenario comparison analysis dashboard, and then I'll start work on the script on Monday.

By "develop a script to iterate through all the landsegs in the P6 model and do this export", are you referring to the generation of the 90 land use unit flow files for each land segment? I began work on an R script to do this when I had some free time on Wednesday, but ran into issues with passing arguments (wdm name, start date, end date, DSN) to the quick_wdm_2_txt function -- or perhaps more likely, the issue may be that I cannot pass the "enter" keystroke required after inputting these arguments which would allow the quick_wdm_2_txt function to execute. I looked around for a solution but believe that I'll have to perform this process with a shell script -- unless you have another idea?

rburghol commented 5 years ago

Your work plans for today and next week sound good to me! As for the script -- yes -- that's what I was thinking! In order for me to give you a hand, post up the code that you attempted. I presume that you were using the "system" command. If so, it might be as simple as using a "pipe", that is, the "|" symbol to send input to another script, which results in more or less pressing the enter key. I do this from a shell script in my old 5.32 data harvesting scripts:

echo "river.wdm 1984 2005 $3" | /opt/model/p53/p532c-sova/code/bin/quick_wdm_2_txt_hour_2_hour

Where "river.wdm" is whatever the name of the WDM you are exporting (should work with a landseg WDM as well). So, it might be as simple as this using the R system command:

system("echo 'river.wdm 1984 2005 $3' | /opt/model/p53/p532c-sova/code/bin/quick_wdm_2_txt_hour_2_hour")

But of course you need to use the p6 version of quick_wdm_2_txt_hour_2_hour

hdaniel7 commented 5 years ago

Alright -- my first attempts at the script are in ssh.run.quick.wdm within the R/utils directory. I've been using the ssh_exec command instead of system -- from what I understand, exec() is a new-and-improved version of the system2() and system() commands, and I've been running the ssh versions so that I can test it from my desktop without having to upload the R script to deq2. However, your piping solution looks like it might be what I was searching for -- I'll give that a go and get back to you.

hdaniel7 commented 5 years ago

Yup! The piping was the exact syntax that I needed -- I'll upload the working version in a minute (although it is currently just hard-coded to generate afoA10001.csv) and start working on generalizing it and putting it in a format runnable from deq2 either later today or on Monday.

rburghol commented 5 years ago

Tested the following sequence (combo shell + R):

cd /opt/model/p6/p6_gb604/tmp/wdm/land/aop/CFBASE30Y20180615
echo "aopN51121.wdm,1984,2014,0111" |  /opt/model/p6-devel/p6-utils/code/bin/quick_wdm_2_txt_hour_2_hour
>  program to write hourly ascii output from a wdm
> wdm name, start year, end year, dsn
> hourly average =    2.0688814570717515E-004
> annual average =    1.8136215281360086

# Enter R
R
source("/home/rob/openmi-om/R/utils/fn_land_use_eos_all.R")
land.use.eos.all('N51121','/opt/model/p6/p6_gb604','CFBASE30Y20180615','/opt/model/p6/p6_gb604/out/CFBASE30Y20180615')
> [1] "Downloading 1 of 129"
> [1] "Downloading 2 of 129"
> Error in file(file, "rt") : cannot open the connection
> In addition: Warning message:
> In file(file, "rt") :
>   cannot open file '/opt/model/p6/p6_gb604/tmp/wdm/land/cch/CFBASE30Y20180615/cchN51121_0111.csv': No such file or directory
> Error in `colnames<-`(`*tmp*`, value = c("Year", "Month", "Day", "Hour",  :
>   attempt to set 'colnames' on an object with less than two dimensions

As you can see -- it worked well for the aop (which I had exported). It failed catastrophically when it did not find the next file, since no export had yet been made. I am on the fence about whether we should make it proceed by being OK with missing wdm files, or make it fail with a somewhat more descriptive message. In reality, if we don't have all the wdm's, there's a problem, and we shouldn't just keep on exporting as if everything is OK. So, for now, let's get the mass quick_wdm export routine going, and we'll shelve this decision point until later under the assumption that catastrophic failure is the right solution :).

Nice work!!

rburghol commented 5 years ago

Yup! The piping was the exact syntax that I needed

Fantastic! Looking forward to more. I have, for what it's worth, tested my new model widget with your export script for A51121 in the phase 5.3.2 that you all ran last year against a 5.3 version from 2014 or so. I configured each version with a 640 acre forest-only land use (640 acres is 1 square mile). I ran them side by side for all of 1984-2005, and there was a substantial difference in mean flow: 1.05 cfs for the new p5.3.2 version and 0.90 cfs for the old 5.3 version. Gonna have to go over this with a fine-toothed comb once we get started in earnest. I do think that there was a change in the precipitation inputs between 5.3 and 5.3.2, but I am not 100% certain. For now, we will keep this in mind -- another GREAT opportunity for us to apply the comparative analysis tools that you guys are developing.

hdaniel7 commented 5 years ago

Updates have been made to the function which exports all land uses, now uploaded as "openmi-om/R/utils/fn.land.use.wdm.export.all.R". Testing of the function is as follows: fn1 fn2 The function works for both phase 5 and phase 6 land use generation. Also working (but not shown) is the generation of land use .wdms for scenario CBASE1808L55CY55R45P50R45P50Y, between the years of 1990 and 2000.

Next step -- I'll get the land use eos all script to output more descriptive errors when catastrophically failing when the csvs have not yet been exported.

hdaniel7 commented 5 years ago

The function now intentionally stops and prints a more descriptive error message if a land use file is missing -- attached is an image of what is printed if the files are missing. land2 Let me know if you have any suggestions of anything else to specifically be stated in the error message.

If land use files are present, the function runs normally -- attached is an image of how that should look. land1
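The fail-early check discussed here could look roughly like the shell sketch below. It is not the R implementation; the function name is hypothetical. The path layout follows the error message earlier in the thread (`<land_dir>/<lu>/<scenario>/<lu><landseg>_0111.csv`), and the `_0211` and `_0411` file suffixes are an assumption based on the DSN list.

```shell
# Hypothetical helper: verify that every expected land use export exists
# before merging, and report each missing file by name.
# Usage: check_exports <land_dir> <scenario> <landseg> <landuse>...
check_exports() {
  local land_dir="$1" scenario="$2" landseg="$3"
  shift 3
  local lu dsn f missing=0
  for lu in "$@"; do
    for dsn in 0111 0211 0411; do   # assumed suffixes for SURO, IFWO, AGWO
      f="$land_dir/$lu/$scenario/${lu}${landseg}_${dsn}.csv"
      if [ ! -f "$f" ]; then
        echo "ERROR: missing land use export: $f" >&2
        missing=$((missing + 1))
      fi
    done
  done
  [ "$missing" -eq 0 ]   # non-zero exit status if anything is missing
}

# Example call (paths from the thread):
# check_exports /opt/model/p6/p6_gb604/tmp/wdm/land CFBASE30Y20180615 N51121 aop cch
```

Reporting every missing file before failing, rather than stopping at the first one, makes it easier to see how much of the export still has to be run.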

rburghol commented 5 years ago

This looks great. Let's test this out by exporting all the runoff files for the Craig Creek segments for both current and climate change scenarios on deq2.

hdaniel7 commented 5 years ago

Sounds good, I'll get on that now -- for reference, Craig Creek river segment JU3_7490_7400 envelops/overlaps with phase 5 land segments ( A51045 A51071 F51045 F51071 A51121 A51161 F51121 F51161 A51023 F51023 ) -- I'm quite confident phase 5 A segments are the equivalent of phase 6 N segments and phase 5 F segments are the equivalent of phase 6 H segments, so I'll export runoff files for land segments ( N51045 N51071 H51045 H51071 N51121 N51161 H51121 H51161 N51023 H51023 ) for the current and climate change scenario.
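The prefix swap described above can be written as a small shell helper. The A-to-N and F-to-H mapping is the assumption stated in this comment, not a verified rule, and the function name is hypothetical.

```shell
# Hypothetical helper: convert phase 5 land segment names to phase 6 names,
# assuming A -> N and F -> H as stated in the comment above.
p5_to_p6_landseg() {
  local seg
  for seg in "$@"; do
    case "$seg" in
      A*) echo "N${seg#A}" ;;
      F*) echo "H${seg#F}" ;;
      *)  echo "unrecognized land segment prefix: $seg" >&2; return 1 ;;
    esac
  done
}

p5_to_p6_landseg A51045 F51045 A51121
```

Failing on an unknown prefix keeps a mistyped segment name from silently passing through to the export.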

rburghol commented 5 years ago

Sounds like a plan. For future reference on matching the land segments with their appropriate river segments, we need to use GIS to find all land segments whose centroid is contained by a given river segment, or alternatively we use the mapping of land to river segments that the CBP model contains (it's in there somewhere). This will ensure that we get the right corresponding pieces each and every time. And, it's something that you all will need to do when pushing some of the REST stuff per land use.

hdaniel7 commented 5 years ago

For both scenarios, CFBASE30Y20180615 and CBASE1808L55CY55R45P50R45P50Y -- successfully generated land use files for segments (N51045, N51071, H51045, N51121, N51161, N51023, H51023). It failed to generate files for (H51071, H51121, H51161) -- I know this because there are no hourly or annual averages printed to the command line. After a bit of investigation, this is because the land use wdms for H51071, H51121, and H51161 don't exist for either scenario -- is this an issue? In the meantime I'll update the script to break and print something like "ERROR: land use wdm does not exist" for when this issue is encountered in the future.

example code:

source('~/openmi-om/R/utils/fn.land.use.wdm.export.all.R')
land.use.wdm.export.all('N51045','/opt/model/p6/p6_gb604','CFBASE30Y20180615', '1984', '2014')
source('~/openmi-om/R/utils/fn.land.use.eos.all.R')
land.use.eos.all('N51045', '/opt/model/p6/p6_gb604', 'CFBASE30Y20180615', '/opt/model/p6/p6_gb604/out/land/CFBASE30Y20180615/eos')

hdaniel7 commented 5 years ago

About how I linked land segments to river segments -- I used the command cat /opt/model/p53/p532c-sova/config/seglists/JU3_7490_7400.land to determine which land segments this river segment was linked to in the phase 5 model and then converted them to phase 6 syntax. I tried the command cat /opt/model/p6/p6_gb604/config/seglists/JU3_7490_7400.land to determine which land segments the river segment was linked to in the phase 6 model directory -- however, there is no "config" folder within p6_gb604, so I cannot find this linkage.

hdaniel7 commented 4 years ago

I created a quick R script this morning stored in openmi-om/R/utils/batch.land.use.R to simplify the generation of land use files -- a quick example of how to use it from the deq2 server is as follows:

source('~/openmi-om/R/utils/batch.land.use.R')
batch.land.use(c('L51157', 'L51163'), 'CFBASE30Y20180615', '1984', '2014')

You can pass a vector of land segment names of any length as the first argument -- the script will create .csv files for each land use for the land segment, combine them into a table, move the table into the /out/land/eos directory, and delete the generated .csv files for each land use. It takes roughly 20 minutes per land segment -- I'll be using this script in the coming days to generate the land segment files.

rburghol commented 4 years ago

This is cool -- checking it out right now!

On Thu, Sep 26, 2019 at 2:47 PM hdaniel7 notifications@github.com wrote:

I created a quick R script this morning stored in openmi-om/R/utils/batch.land.use.R to simplify the generation of land use files -- a quick example of how to use it from the deq2 server is as follows:

source('~/openmi-om/R/utils/batch.land.use.R') batch.land.use(c('L51157', 'L51163'), 'CFBASE30Y20180615', '1984', '2014')

you can pass a vector of land segment names of any length as the first argument -- the script will create .csv files for each land use for the land segment, combine them into a table, move the table into the /out/land/eos directory, and delete the generated .csv files for each land use. It takes roughly 20 minutes per land segment -- I'll be using this script in the coming days to generate the land segment files.


hdaniel7 commented 4 years ago

Hey all -- I just finished generating the edge-of-stream unit-flow-by-land-use files for each land segment for both the base and climate change scenarios.

Base scenarios are located at http://deq2.bse.vt.edu/p6/p6_gb604/out/land/CFBASE30Y20180615/eos/

Climate change scenarios are located at http://deq2.bse.vt.edu/p6/p6_gb604/out/land/CBASE1808L55CY55R45P50R45P50Y/eos/

I checked to ensure that there are 170 land segment files for each scenario -- so I am quite positive that all the data is generated and ready to go.

HARPgroup / openmi-om

Landuse + CBP flow widget #10
