HARPgroup / HARParchive

This repo houses HARP code development items, resources, and intermediate work products.
1 stars 0 forks source link

create `summarize_river.R` based on old river summary script to take a CSV as argument #556

Open rburghol opened 2 years ago

rburghol commented 2 years ago

Desired actions for summarize_river.R

Script to use: https://github.com/HARPgroup/om/blob/master/R/summarize/waterSupplyModelNode.R

summarize_values.R Use: summarize_values.R [riverseg] [scenario] [hydr_path] [model_version] [json_dir(?)] Example: Rscript HARP-2022-Summer/AutomatedScripts/summarize_values.R JU4_7260_7380 hsp2_2022 /media/model/p6/out/river/hsp2_2022/hydr/JU4_7260_7380_hydrd_wy.csv cbp-6.0 /media/model/p6/out/river/hsp2_2022/json/

summarize_figures.R Use: summarize_figures.R [riverseg] [scenario] [hydr_path] [model_version] [image_dir] [json path] Example: Rscript HARP-2022-Summer/AutomatedScripts/summarize_figures.R JU4_7260_7380 hsp2_2022 /media/model/p6/out/river/hsp2_2022/hydr/JU4_7260_7380_hydrd_wy.csv cbp-6.0 /media/model/p6/out/river/hsp2_2022/images/ /media/model/p6/out/river/hsp2_2022/json/JU4_7260_7380_summ.json

To finalize this script:

juliabruneau commented 2 years ago

@rburghol

I guess I'm just confused on what the goal for this script is even after looking at the old script linked to this issue.

rburghol commented 2 years ago

@juliabruneau The goal for the script is to create a version of this old script that behaves just like the new scripts you all have already made.

Essentially you'll be Making it behave exactly like this this: HARP-2022-Summer/AutomatedScripts/hsp_hydr_analysis.R

Now, one might argue that you could just move all of the things from the old summarize script to hsp_hydr_analysis.R -- and honestly I would not object to that, but we need to change the name of the script.

And I also believe that the new script that you all did may use some of the more sophisticated data source functions (adding properties with ds ) that I would like to preserve.

glenncampagna commented 2 years ago

Laying out Variables from watersupplyModelNode.R

juliabruneau commented 2 years ago

Update

We decided to base summarize_river.R heavily on how waterSupplyModelNode.R is set up, but we made some tweaks to make it clearer for us. The general steps are as follows:

  1. Call in data with arguments
  2. Set up data source and entity type in order to push to VAHydro
  3. Perform analysis on Qout, wd, and ps 3.1 Push to VAHydro
  4. Calculate metrics that require 'Zoo' (IHA) 4.1 Push to VAHydro
  5. Calculate metrics with a trimmed climate change timescale 5.1 Push to VAHydro

All the changes so far are pushed to GitHUB, with information commented at each step.

@rburghol Could you explain why the timeseries is trimmed (step 5) to the "climate change scenario timescale"? Is there specific analysis that is done later on climate change effects on the river segments?

glenncampagna commented 2 years ago

1st version is ready for testing, I was stalled by a permission error. Someone else can try

Here is the example command: Rscript ~/HARParchive/HARP-2022-Summer/AutomatedScripts/summarize_river.R JL1_6770_6850 hsp2_2022 /media/model/p532/out/river/hsp2_2022/hydr/JL1_6770_685_hydr.csv cbp-5.3.2 General Use: summarize_river.R [riverseg] [scenario] [hydr_path] [model_version] This script does not incorporate any figures or any 'trimmed' variables since we will have a separate figure analysis script and we are unsure about the trimmed values/metrics (for period 1990 - 2000). Are they still desired? This script can be used/tested on any hydr file that has been converted and has Qout, date cols, and recently implemented wd_mgd

juliabruneau commented 2 years ago

Successful Trial

I ran the river segment: PL3_5250_0001

Use: Rscript ~/HARParchive/HARP-2022-Summer/AutomatedScripts/summarize_river.R PL3_5250_0001 hsp2_2022 /media/model/p532/out/river/hsp2_2022/hydr/PL3_5250_0001_hydr.csv cbp-5.3.2

Results: http://deq1.bse.vt.edu/d.dh/om-model-info/6859600

image image

NEXT STEPS

I tried including the trimmed low flow values, but when I test the if loop in R, I don't get any values (but no errors?). This is the error I got in the terminal:

Error in vahydro_post_metric_to_scenprop(model_scenario$pid, "om_class_Constant",  :
  object 'l90_Qout_trim' not found
Execution halted

This makes me believe that something within the if-loop is not working correctly.

Everything ran fine up until this point.

juliabruneau commented 2 years ago

Next steps

To do:

juliabruneau commented 2 years ago

Start on summarize_river_figures.R

The main loop of the figure generation script is set by imp_off:

After this point, these figures are generated:

image

Reasons why some plots will not give meaningful results:

We set the filenames and url's, so that our next step is to test if these plots will be exported in the right place.

glenncampagna commented 2 years ago
juliabruneau commented 2 years ago

Update (new script: summarize_river_values.R)

image image

In order to start testing pushing the figures into VAhydro:

[...calculations]

values <- list(hydr$index, hydr$date, hydr$hour, hydr$day, hydr$month, hydr$year, hydr$Qout, hydr$Qbaseline, hydr$wd_mgd, hydr$wd_cumulative_mgd, hydr$ps_mgd, hydr$ps_cumulative_mgd, hydr$ps_nextdown_mgd, hydr$net_consumption_mgd, hydr$unmet_demand_mgd) names(values) <- c("index", "date", "hour", "day", "month", 'year', "Qout", "Qbaseline", "wd_mgd", "wd_cumulative_mgd", "ps_mgd", "ps_cumulative_mgd", "ps_nextdown_mgd", "net_consumption_mgd", "unmet_demand_mgd")

values2 <- list(l90_Qout, l90_year, l30_Qout, l30_year, imp_off) names(values2) <- c("l90_Qout", "l90_year", "l30_Qout", "l30_year","imp_off")

values3 <- list(values, values2)

return(values3) }


- This will output all the required calculations from the summarize_river script to be used to generate figures
- Output (as of now):

![image](https://user-images.githubusercontent.com/104520318/199523215-e81b26b4-29dc-44c0-9317-e41e2222ef4d.png)

 - Struggles so far was realizing that functions only return one "value", which is why the different length values are outputted as a list for now
  - Still have to figure out how to call the values from the list - should I convert them a data frame (and individual values) **OR** edit the script when calling for different values (hesitant on the latter since we don't know if this is a permanent solution)

NEXT STEPS:

- [x] Plot figures from the values that were generated with the function
- [x] Push figures to VAHydro
glenncampagna commented 2 years ago

Adding unmet_demand_mgd

hydr$unmet_demand_mgd = as.numeric(hydr$demand_mgd) - as.numeric(hydr$wd_mgd) We didn't previously have unmet demand as a column in our hydr table so I created this line to generate it.. is it correct?

juliabruneau commented 2 years ago

Update

Notable Changes:

Next steps:

glenncampagna commented 2 years ago

Testing merged script (metrics and figures) on the deq server


This does confirm that the error occurs when the original zoo for the whole hydr table is created:
`hydr <- zoo(hydr, order.by = hydr$date)`
Asking myself: why is the terminal R seeing duplicates while R studio does not?
Manual testing in terminal R results in successfully creating the zoo with the same hydr csv file ..

**Solution:** `hydr <- zoo(hydr, order.by = as.Date(hydr$Group.1))` , where `Group.1` is a date added by the aggregation
- Exports to VAhydro from testing seg OR1_7700_7980 located @ http://deq1.bse.vt.edu/d.dh/om-model-info/6853355 
  Note: 'empty' plots don't show up in VAhydro (that compared wd and ps which don't exist for this seg)
juliabruneau commented 1 year ago

Separating the script using JSON

The two scripts are now summarize_values.R and summarize_figures.R

glenncampagna commented 1 year ago
juliabruneau commented 1 year ago

Work Session after 12/9 Meeting

Here is the conclusion on the workflow we agreed on:

  1. hsp_hydr_conversion.R
    • Generates all columns based on wd and ps (some will be moved from summarize_values.R)
    • Saves a normal hydr.csv with all additional columns
    • Trims to daily data
    • Trims to a water year with a buffer
    • Saves this into a csv: hydrd_wy.csv (?)
  2. summarize_values.R
    • Performs summaries: l90, l30 …
    • Pushes values to VAhydro
    • Creates a json file with values needed for figure generation
    • Unique file name - [riverseg]_summ.json
  3. summarize_figures.R
    • Calls in daily hydr file in the water year, and the json file
    • Generates and pushes figures to VAhydro

Resulting file from hsp_hydr_conversion: [riverseg]_hydrd_wy.csv

> head(hydrd)
           index               DEP       IVOL      O1 O2 O3        OVOL1 OVOL2 OVOL3     PRSUPY    RO        ROVOL     SAREA     TAU        USTAR     VOL       VOLEV      
1984-10-01 1984-10-01 11:30:00 1.2943616 5.8630603 0  0  53.200393 0     0     4.3632592 0.3510285 53.200393 4.3632592 175.80083 0.11918396 0.2479337 213.23919 0.033429171
1984-10-02 1984-10-02 11:30:00 1.3006436 4.0104952 0  0  53.633711 0     0     4.4425998 0         53.633711 4.4425998 175.90797 0.11972911 0.2485568 214.31405 0.030532922
1984-10-03 1984-10-03 11:30:00 1.2381018 3.5885918 0  0  47.862436 0     0     3.9644563 0         47.862436 3.9644563 174.84134 0.11446638 0.2430337 203.34863 0.032575537
1984-10-04 1984-10-04 11:30:00 1.1915302 3.3681695 0  0  43.587347 0     0     3.6083385 0         43.587347 3.6083385 174.04708 0.11052086 0.2388105 195.22596 0.039316895
1984-10-05 1984-10-05 11:30:00 1.1577942 3.2100393 0  0  41.062119 0     0     3.3966015 0         41.062119 3.3966015 173.47173 0.10764766 0.2356864 189.36554 0.044858515
1984-10-06 1984-10-06 11:30:00 1.1257915 3.0716329 0  0  39.394389 0     0     3.2585724 0         39.394389 3.2585724 172.92593 0.10491024 0.2326703 183.82426 0.041597132
           date       hour day month year divr_cfs diva_cfs ps_afd Qout      wd_mgd ps_mgd demand_mgd Qbaseline wd_imp_child_mgd wd_cumulative_mgd ps_cumulative_mgd ps_nextdown_mgd
1984-10-01 1984-10-01 11.5 1   10    1984 0        0        0      52.795436 0      0      0          52.795436 0                0                 0                 0              
1984-10-02 1984-10-02 11.5 2   10    1984 0        0        0      53.755458 0      0      0          53.755458 0                0                 0                 0              
1984-10-03 1984-10-03 11.5 3   10    1984 0        0        0      47.969921 0      0      0          47.969921 0                0                 0                 0              
1984-10-04 1984-10-04 11.5 4   10    1984 0        0        0      43.660896 0      0      0          43.660896 0                0                 0                 0              
1984-10-05 1984-10-05 11.5 5   10    1984 0        0        0      41.098878 0      0      0          41.098878 0                0                 0                 0              
1984-10-06 1984-10-06 11.5 6   10    1984 0        0        0      39.428726 0      0      0          39.428726 0                0                 0                 0              
> 
  1. Use: hsp_hydr_conversion.R [filename] Example: Rscript hsp_hydr_conversion.R JL1_6770_6850_hydr.csv

    • Output: JL1_6770_6850_hydr.csv and JL1_6770_6850_hydrd_wy.csv
  2. Use: summarize_values.R [riverseg] [scenario] [hydrd_wy.csv path] [model_version] [json_file] Example: Rscript summarize_values.R JL1_6770_6850 hsp2_2022 /media/model/p532/out/river/hsp2_2022/hydr/JL1_6770_6850_hydrr_wy.csv cbp-6.0 /media/model/p532/out/river/hsp2_2022/json/

    • Output: JL1_6770_6850_summ.json and summary values pushed to VAHydro
  3. Use: summarize_values.R [riverseg] [scenario] [hydrd_wy path] [model_version] [image_dir] [json_file] Example: Rscript summarize_figures.R JL1_6770_6850 hsp2_2022 /media/model/p532/out/river/hsp2_2022/hydr/JL1_6770_6850_hydrd_wy.csv cbp-6.0 /media/model/p532/out/river/hsp2_2022/images/ /media/model/p532/out/river/hsp2_2022/json/JL1_6770_6850_summ.json

    • Output: figures pushed to VAHydro

      Next Steps

    • [x] Making sure the conversion script works in the terminal (couldn't test due to permission issue - Julia)
    • [x] Testing all 3 scripts back to back in the terminal; see if everything gets pushed to VAHydro
    • [ ] Testing all 3 scripts in the meta model framework
juliabruneau commented 1 year ago

Update - debugging

These are the working arguments for the 2 summary scripts:

summarize_values: Rscript ~/HARParchive/HARP-2022-Summer/AutomatedScripts/summarize_values.R JL1_6770_6850 hsp2_2022 /media/model/p532/out/river/hsp2_2022/hydr/JL1_6770_6850_hydrd.csv cbp-5.3.2 /media/model/p532/out/river/hsp2_2022/hydr/

summarize_figures: Rscript ~/HARParchive/HARP-2022-Summer/AutomatedScripts/summarize_figures.R JL1_6770_6850 hsp2_2022 /media/model/p532/out/river/hsp2_2022/hydr/JL1_6770_6850_hydrd.csv cbp-5.3.2 /media/model/p532/out/river/hsp2_2022/images/ /media/model/p532/out/river/hsp2_2022/hydr/JL1_6770_6850_summ.json

In order to read the json file, the file name had to be extracted. It is in a format so that it will select the last element of the list - this way it will work universally.

json_split <- strsplit(json_dir, split = '/')
last_element <- as.numeric(length(json_split[[1]]))
json_file <- json_split[[1]][[last_element]]  # selecting just the json file name
        Selects the last element of the list
values <- unserializeJSON(readLines(json_file))

New Error with figure script:

juliabruneau commented 1 year ago

Debugging

Plotting critical flow periods Error in attr(x, "tsp") <- c(1, NROW(x), 1) : invalid time series parameters specified Calls: window -> window.default -> hasTsp Execution halted

The error occurs when trying to start the if-loop that will generate the figures.

message("Plotting critical flow periods")
# does this have an active impoundment sub-comp
if (imp_off == 0) {

imp_off is read from the new json file, which is a list:

imp_off <- as.numeric(values[[1]])
l90_year <- as.numeric(values[[2]])

When looking for the attributes of imp_off:

> attributes(imp_off)
NULL

$l90_year [1] 2019

imp_off <- as.numeric(values[[1]]) l90_year <- as.numeric(values[[2]]) imp_off [1] 1


### Update

- I don't think the error is in recognizing the imp_off as "1", since my test in the terminal R worked:

if(imp_off == 0){

  • print("zero")
  • } else {
  • print("one")
  • } [1] "one"
### Found the location of the error:

hydrpd <- window(hydr, start = pdstart, end = pdend)


- This is causing the error
- Fixed by setting the data as 'zoo' again. For some reason, the data read from the directory is not automatically kept as a zoo.
juliabruneau commented 1 year ago

Error with ylim and cbind

When testing in local and terminal R, there are no errors. But when running the R script with arguments, there is an error with setting the ylim with cbind:

Error in plot.window(...) : need finite 'ylim' values
Calls: plot -> plot -> plot.default -> localWindow -> plot.window
In addition: Warning message:
In .Primitive("max")(numeric(0), na.rm = TRUE) :
  no non-missing arguments to max; returning -Inf
Execution halted

The line of script where the error occurs:

ymx <- as.numeric(max(cbind(hydrpd$Qbaseline, hydrpd$Qout)), na.rm = TRUE)
  plot(
    as.numeric(hydrpd$Qbaseline), ylim = c(0,ymx), xlim=c(xmn,xmx),  #Placeholders for xlim, come back to this and create xlim based on hydrpd
    ylab="Flow/WD/PS (cfs)",
    xlab=paste("Lowest 90 Day Flow Period",pdstart,"to",pdend)
  )
juliabruneau commented 1 year ago

Progress Update

glenncampagna commented 1 year ago

Test

Rscript HARP-2022-Summer/AutomatedScripts/summarize_values.R JU3_6650_7300 hsp2_2022 /media/model/p6/out/river/hsp2_2022/hydr/JU3_6650_7300_hydrd_wy.csv cbp-6.0 /media/model/p6/out/river/hsp2_2022/hydr/

Rscript HARP-2022-Summer/AutomatedScripts/summarize_figures.R JU3_6650_7300 hsp2_2022 /media/model/p6/out/river/hsp2_2022/hydr/JU3_6650_7300_hydrd_wy.csv cbp-6.0 /media/model/p6/out/river/hsp2_2022/images/ /media/model/p6/out/river/hsp2_2022/hydr/JU3_6650_7300_summ.json

Output:

Error in file(con, "r") : cannot open the connection
Calls: unserializeJSON -> unpack -> parseJSON -> readLines -> file
In addition: Warning message:
In file(con, "r") :
  cannot open file 'JU3_6650_7300_summ.json': No such file or directory
Execution halted

Testing in terminal R (summarize_values.R):

river_seg <- 'JL1_6770_6850'
scenario_name <- 'hsp2_2022'
hydr_file_path <- '/media/model/p6/out/river/hsp2_2022/hydr/JL1_6770_6850_hydrd_wy.csv'
model_version <- 'cbp-6.0'
json_dir <- '/media/model/p6/out/river/hsp2_2022/hydr/'

Turns out to be a permission issue = we aren't allowed to write the json file into the folder:

> write(values_json, file= paste0(json_dir, river_seg, "_summ.json"), sep = ",")
Error in file(file, ifelse(append, "a", "w")) :
  cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file '/media/model/p6/out/river/hsp2_2022/hydr/JL1_6770_6850_summ.json': Permission denied

If @glenncampagna wants to test this to see if it is a permission error for him as well, it should verify the source of this error.

jdkleiner commented 1 year ago

Spent time looking into the elfgen_huc() item.

###############################################

RSEG ELFGEN

###############################################

GET RSEG HYDROID FROM RSEG MODEL PID

rseg <- RomProperty$new(ds, list(pid=model$pid), TRUE) rseg_hydroid<-rseg$featureid

huc_level <- 'huc8' dataset <- 'VAHydro-EDAS'

elfgen_huc(scenario_name, rseg_hydroid, huc_level, dataset, model_scenario, ds, image_dir, save_url, site)


- result will look something like: 
   - http://deq1.bse.vt.edu:81/d.dh/om-model-info/6972498
![image](https://user-images.githubusercontent.com/29379385/218560257-e3aad9d8-df0a-446e-ad6a-62dc064972aa.png)
glenncampagna commented 1 year ago

Thanks for getting back to us @jdkleiner. Errors related to plotting for us were fixed when the NAs for Qbaseline were fixed and now I can run the script till the end. I was able to get the elfgen container to show up in vahydro but did get an error at the very end of the script and the elfgen container was empty The error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 1, 0
Calls: elfgen_huc ... as.data.frame -> as.data.frame.list -> do.call -> <Anonymous>

image image

jdkleiner commented 1 year ago

@glenncampagna Nice progress with those fixes! What segment id are you testing with that results in the error? I'll try to replicate on my end

glenncampagna commented 1 year ago

@jdkleiner I used JU4_7260_7380

jdkleiner commented 1 year ago

@glenncampagna Thanks!

Error in file(con, "r") : cannot open the connection Calls: unserializeJSON -> unpack -> parseJSON -> readLines -> file In addition: Warning message: In file(con, "r") : cannot open file 'JU4_7260_7380_summ.json': No such file or directory Execution halted

glenncampagna commented 1 year ago

@jdkleiner that is a path-related error that comes from a bug that I haven't quite had the chance to fix today, and it can be overcome by running the summarize_figures.R script from the directory where the json file is located (should be [scenario]/json/)

jdkleiner commented 1 year ago

@glenncampagna Thanks, that's just the workaround I needed to get it running.

jdkleiner commented 1 year ago

Success using jdk account:

jdk@deq2:/opt/model/HARParchive$ Rscript HARP-2022-Summer/AutomatedScripts/summarize_figures.R JU4_7260_7380 hsp2_2022 /media/model/p6/out/river/hsp2_2022/hydr/JU4_7260_7380_hydrd_wy.csv cbp-6.0 /media/model/p6/out/river/hsp2_2022/images/ /media/model/p6/out/river/hsp2_2022/json/JU4_7260_7380_summ.json
glenncampagna commented 1 year ago

Moving/renaming final version(s)

summarize_values.R moved into hsp_hydr_analysis.R summarize_figures.R renamed as hsp_hydr_plots.R