Open mwdunlap2004 opened 4 months ago
I pushed my function that takes in a dataset and a gageid variable and outputs a weekly csv of that data. It is on the weeklyprecip branch, and it is called "attemptatweekdata".
Right now the function uses basic variable names like precip_in. Do any of you know of a way to change the name of a variable using the dataset, so it would look like PRISM_p_cfs for example?
Sure, there are several ways to create a dynamically named field in a dataframe or as an entry to a list. The simplest is likely the following, which will create a column of NAs named MyCol2 in a data.frame:
inVar <- 2
myDF[,paste0("My","Col",inVar)] <- NA
You could alternatively simply add a placeholder column and rename it by locating that placeholder in names(myDF):
myDF$dummyColumn <- NA
names(myDF)[ names(myDF) == "dummyColumn" ] <- paste0("MyCol", inVar)
#OR
myDF$dummyColumn <- NA
names(myDF) [ grepl("dummyColumn", names(myDF)) ] <- paste0("MyCol", inVar)
However, I'm not sure we'd want more specifically named columns unless these are all being joined together. If the function is only handling one dataset at a time, it might be helpful to keep the structure of the output file generic so that we always get a data frame with the same column names. This makes it easier to handle the data frame in future processing steps, regardless of the data source. In other words, it might be helpful to get a field labeled precip_in as long as it represents only one data source. Then we know we can use this function and always simply use precip_in to get the precip data from the data source that we specify earlier in the workflow.
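For the joined case mentioned above, here is a minimal sketch of one way to keep precip_in generic inside each per-source data frame and only add the source prefix at join time. The two small data frames and the helper prefix_cols are hypothetical, made up for illustration; they are not part of the repo.

```r
# Hypothetical daily tables from two sources, both using the generic column name.
prism  <- data.frame(obs_date = as.Date("2020-01-01") + 0:2,
                     precip_in = c(0.1, 0.0, 0.4))
daymet <- data.frame(obs_date = as.Date("2020-01-01") + 0:2,
                     precip_in = c(0.2, 0.0, 0.3))

# Hypothetical helper: prefix every column except the join key with the source name.
prefix_cols <- function(df, data_source, key = "obs_date") {
  is_value <- names(df) != key
  names(df)[is_value] <- paste0(data_source, "_", names(df)[is_value])
  df
}

# Rename only when the sources are combined, so each source stays generic on its own.
joined <- merge(prefix_cols(prism, "PRISM"),
                prefix_cols(daymet, "daymet"),
                by = "obs_date")
names(joined)
```

With this approach, single-source steps can keep referring to precip_in, and the source-specific names only appear in the combined table.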
I created a new version of access-file.R, called lmsingledata, which only requires one dataset, so all analysis can be run on one data source at a time. The major change was how data is pulled in.
library(lubridate) # year(), month(), day(), week()
library(sqldf)     # sqldf()
hydrocode = paste0('usgs_ws_', gageid)
data_source = "prism"
hydro_data <- read.csv(paste0("http://deq1.bse.vt.edu:81/files/met/",
hydrocode,"-",data_source, "-all.csv"))
hydro_data[,c('yr', 'mo', 'da', 'wk')] <- cbind(year(as.Date(hydro_data$obs_date)),
month(as.Date(hydro_data$obs_date)),
day(as.Date(hydro_data$obs_date)),
week(as.Date(hydro_data$obs_date)))
if (data_source=="nldas2"){
hydro_data <- sqldf(
"select featureid, min(obs_date) as obs_date, yr, mo, da,
sum(precip_mm) as precip_mm, sum(precip_in) as precip_in
from hydro_data
group by yr, mo, da
order by yr, mo, da
"
)}
@ilonah22 the code that you pasted above looks excellent -- what it does is to create a daily summary dataframe from the raw data file. The next step is to create a second script that does almost the same thing but takes the daily CSV as input and generates a weekly CSV (which we use for some of our methods).
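As a rough sketch of that daily-to-weekly step, something like the following might work. It assumes the daily CSV has obs_date and precip_in columns (as in the code above); the function name daily_to_weekly and the temp files are hypothetical. The week number is computed in base R as (day of year - 1) %/% 7 + 1, which is intended to match what lubridate's week() returns.

```r
# Hypothetical helper: sum daily precip into one row per (year, week).
daily_to_weekly <- function(daily) {
  d <- as.Date(daily$obs_date)
  daily$yr <- as.integer(format(d, "%Y"))
  # Week of year: complete 7-day periods since Jan 1, plus one.
  daily$wk <- (as.integer(format(d, "%j")) - 1L) %/% 7L + 1L
  # Note: the formula interface drops rows with NA precip_in by default.
  weekly <- aggregate(precip_in ~ yr + wk, data = daily, FUN = sum)
  weekly <- weekly[order(weekly$yr, weekly$wk), ]
  rownames(weekly) <- NULL
  weekly
}

# Made-up daily data written to a temp file to stand in for the real daily CSV.
daily_csv  <- tempfile(fileext = ".csv")
weekly_csv <- tempfile(fileext = ".csv")
write.csv(data.frame(obs_date = as.Date("2020-01-01") + 0:13, precip_in = 0.1),
          daily_csv, row.names = FALSE)

weekly <- daily_to_weekly(read.csv(daily_csv))
write.csv(weekly, weekly_csv, row.names = FALSE)
```

The output keeps the generic precip_in name, so downstream steps do not need to know which data source produced the daily file.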
The only mod I would make to the script is that, rather than guessing the hydrocode, input file name, and output file name, these should be inputs to the script. The details of this script are in the issue I tagged you in over here: https://github.com/HARPgroup/model_meteorology/issues/61. If you can start to develop and track your progress on this over there, that would be awesome. Keep me posted, thanks!
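One way to take the file names as script inputs is commandArgs(). The sketch below is only an assumption about how such a script could be laid out; the script name make_weekly.R, the argument order, and the helper parse_args are hypothetical, and the actual requirements are in the linked issue.

```r
# Hypothetical usage: Rscript make_weekly.R <daily_csv> <weekly_csv>

# Hypothetical helper: validate the argument count and name the two paths.
parse_args <- function(args) {
  if (length(args) != 2) {
    stop("Usage: Rscript make_weekly.R <daily_csv> <weekly_csv>")
  }
  list(daily_csv = args[1], weekly_csv = args[2])
}

# trailingOnly = TRUE returns only the arguments given after the script name.
args <- commandArgs(trailingOnly = TRUE)

# Only run when arguments were supplied, so the file can also be source()d.
if (length(args) > 0) {
  paths <- parse_args(args)
  daily <- read.csv(paths$daily_csv)
  message("Read ", nrow(daily), " daily rows from ", paths$daily_csv)
  # Weekly aggregation and write.csv(..., paths$weekly_csv) would follow here.
}
```

Passing the paths in this way means the script never has to build a hydrocode or guess a file name itself.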
@ilonah22 I think Rob's comments are spot-on. Taking this framework you have and creating a weekly version is a great next step and will help to reinforce our workflow development. I'd be happy to help out with this as needed. I have some availability in the afternoon and can help parse through Rob's suggestion or go over some next steps. I found this workflow process to be a bit tricky at first and am happy to discuss! Just let me know and I can set up a Teams meeting.
http://deq1.bse.vt.edu:81/met/[data source]/out/
http://deq1.bse.vt.edu:81/met/daymet/out/
http://deq1.bse.vt.edu:81/met/PRISM/out/
http://deq1.bse.vt.edu:81/met/nldas2/out/
mon_lm function is in repo now at: https://raw.githubusercontent.com/HARPgroup/HARParchive/master/HARP-2024-2025/functions/lm_analysis_plots.R
library("R6")
source("https://raw.githubusercontent.com/HARPgroup/HARParchive/master/HARP-2024-2025/functions/lm_analysis_plots.R")
mon_lm