This repository is intended to hold code, documentation, and tutorials for the USGS Consumptive Use Project run through Dr. Julie Shortridge and Dr. Durelle Scott of Virginia Tech as well as Robert Burgholzer and Joseph Kleiner of the Virginia Department of Environmental Quality
3
stars
0
forks
source link
Draft 12 month View performance when fetching all facilities for a given year #68
We seem to get a system freeze when retrieving all data for a given year at one single time, using the prototype 12 month grid with contextual filters guessing the dates for each month column, joining the timeseries table essentially 13 times (annual + 12 months). I want to test the all 12 at once with hand-written SQL to see if there is some efficiency that is lost in the drupal views process, but in the meantime, I think an alternative methodology needs to be developed in order to insure that we can deliver this data in an effective and timely manner.
Thus, I want to:
Use a version of the view that exports annual totals only for all locations.
Then create a version of the monthly that allows a single month to be retrieved for all records.
In R, loop through calling the single month 12 times, using sqldf to join the new month back into a foundation data frame using only the unique system hydroid to make the join (since we insure that we only have a single record for each anyhow).
Pseudo-code below:
monames <- list(
1 = 'jan',
2 = 'feb',
...
)
yr <- 2019
yrdata <- as.data.frame(fetch_annual(yr, 'wd_mgy'))
for (mo in c(1:12)) {
moname <- monames[mo]
modata <- as.data.frame(fetch_monthly(yr,mo))
sqlstr <- paste0(
"select a.*, b.tsvalue as ", moname,
"from yrdata as a
left outer join modata as b
on a.hydroid = b.hydroid "
)
yrdata <- sqldf(sqlstr)
}
We seem to get a system freeze when retrieving all data for a given year at one single time, using the prototype 12 month grid with contextual filters guessing the dates for each month column, joining the timeseries table essentially 13 times (annual + 12 months). I want to test the all 12 at once with hand-written SQL to see if there is some efficiency that is lost in the drupal views process, but in the meantime, I think an alternative methodology needs to be developed in order to insure that we can deliver this data in an effective and timely manner.
Thus, I want to: