Open cpenn-usgs opened 8 years ago
Any chance you could paste that code you used here? I can play with it using fake data.
dateExtended_tick.span2.pdf dateOrig_tick.span2.pdf default_tick.span.pdf
Here's a sample of the graphs for comparison, using the default (tick.span not specified), setting it to 2 with the original WY15 end date, and setting it to 2 with the extended end date. I'll paste the code next.
### R Code - Ft. Carson time-series plotting
library(smwrGraphs)
library(dataRetrieval)
library(reshape2)
library(dplyr)
library(xlsx)
#Get the data
##Get site numbers and parameters
#siteParms <- read.csv("SiteList_short_TESTING.csv",header=TRUE,colClasses="character",na.strings="")
#siteParms <- melt(siteParms,id.vars="site_no") %>%
# rename(parm_cd = value)
siteParms <- read.xlsx("Site_parameters_stds_TESTING.xlsx",sheetIndex = 1,startRow = 1,endRow = 35,header = TRUE) %>%
rename(parm_cd = ParameterCode) %>%
rename(site_no = SiteNumber)
siteParms$site_no <- as.character(siteParms$site_no)
siteParms$parm_cd <- as.character(siteParms$parm_cd)
###Get NWIS data
qwData <- readNWISqw(siteNumbers = unique(siteParms$site_no),
parameterCd = "All",
expanded=TRUE)
paramINFO <- readNWISpCode(na.omit(unique(qwData$parm_cd))) %>%
rename(parm_cd = parameter_cd)
siteINFO <- readNWISsite(unique(qwData$site_no))
qwData <- left_join(qwData,paramINFO,by="parm_cd")
qwData <- left_join(qwData,siteINFO,by="site_no")
###Subset to only wanted data - and only WS medium_cd
qwData <- left_join(siteParms,qwData,by=c("site_no","parm_cd"))
qwData <- subset.data.frame(qwData,qwData$medium_cd=="WS")
sites <- unique(qwData$site_no)
for(i in 1:length(sites))
{
subData <- qwData[qwData$site_no == sites[i],c("parm_cd","parameter_nm","Standard1","Standard2","site_no","station_nm","sample_dt","result_va","remark_cd")]
subData <- subData[!is.na(subData$sample_dt),]
subData <- subData[!is.na(subData$result_va),]
analytes <- unique(subData$parm_cd)
###Min/Max years for plot consistancy between analytes, set time plot/tick span based on years
minyr <- min(year(subData$sample_dt))
maxyr <- max(year(subData$sample_dt))
if(minyr == maxyr)
{
xlab <- c("months")
tspan <- 1
} else
{
xlab <- c("water year")
#maxyr = maxyr+1
if ((maxyr-minyr)>10)
{
tspan <- 2
}
else
{
tspan <- 1
}
}
minyr = minyr-1
mindt <-paste(minyr,"10","01",sep = "-")
maxdt <-paste(maxyr,"10","01",sep = "-")
for(j in 1:length(analytes))
{
plotData <- subData[subData$parm_cd %in% analytes[j],]
std1 <- unique(plotData$Standard1)
std2 <- unique(plotData$Standard2)
if(nrow(plotData) > 0)
{
setSweave(paste0(unique(plotData$site_no),"_",unique(plotData$parm_cd)), 6 ,3)
p1<-with(plotData, timePlot(sample_dt, result_va, Plot=list(what="points",symbol="circle",filled=TRUE,size=0.03,color="black"),
xaxis.range = as.Date(c(mindt,maxdt)),
xlabels = list(major=xlab,tick.span=tspan,style=c("at")),
ytitle = unique(parameter_nm),
caption=paste0(unique(site_no),": ",unique(station_nm))
))
addMinorTicks("y",p1)
addMinorTicks("x",p1,3)
#Add MCL/numeric standard lines
if(is.na(std1)==FALSE)
{
refLine(horizontal = std1,Plot = list(type = "solid",size = 0.03,color = "black"),current = p1)
addAnnotation(as.Date(maxdt),std1,annotation = "Numeric standard",justification = "right",size = 5,current = p1)
} else{}
if(is.na(std2)==FALSE)
{
refLine(horizontal = std2,Plot = list(type = "solid",size = 0.03,color = "black"),current = p1)
addAnnotation(as.Date(maxdt),std2,annotation = "Numeric standard 2",justification = "right",size = 5,current = p1)
} else{}
# Required call to close PDF output graphics
graphics.off()
} else {}
}
}
#move all plots to folder
system("mv *.pdf ./plots")
(side note, I put 3 backticks ` and the letter r before your code, and three backticks after to get it formatted)
Here's a little dataRetrieval trick that could save you a little time if you are making tons of queries:
###Get NWIS data
qwData <- readNWISqw(siteNumbers = unique(siteParms$site_no),
parameterCd = "All",
expanded=TRUE)
paramINFO <- attr(qwData, "variableInfo")
siteINFO <- attr(qwData, "siteInfo")
qwData <- left_join(qwData,paramINFO,by=c("parm_cd"="parameter_cd"))
qwData2 <- left_join(qwData,siteINFO,by=c("site_no","agency_cd"))
Just so I'm clear (and I probably won't get a chance to look at this today)...are you pointing out that the default is bad (which it is....) and suggesting your fix (which looks pretty nice)?
Or asking a question about the end date?
I mainly wanted to bring attention to the issue that even if you hardcode an end date, your data may be cut off if you also input a tick span. I'm running this script on 22 sites, ~2-10 parameters per site, and each site varying from 1-30 years of data. It wasn't immediately obvious to me that data was being cut off for some sites so I figured noting the issue would be good for anyone also running a script that does large data pulls producing lots of plots and may not be intimately familiar with the sites or their data. I was just talking with Joe Mills and he mentioned USGS publishing standards state that the beginning and ending of an axis must have a label, so I guess it does make sense to expand the end date, based on a hard coded end date, to the next possible major tick position, rather than cut it off at the previous tick position. I'm not much of a code developer though, so there's probably a more efficient way to do this in R than my clunky "if" statements!
I plotted a time series plot with chemistry data spanning WY1979-2015, the xaxis.range hard coded as 1978-10-01 to 2015-10-01, major label was coded as "water year". The default tick span was plotting ever year and the tick labels were overlapping. I manually set tick span to 2, which caused the date axis (x-axis) to span from the start of WY1979 to the beginning of WY2015 (2014-10-10), which cut off all 2015 data points. I hardcoded the xaxis.range to then end at 2016-10-01. This caused the date axis to span from the start of WY1979 to the beginning of WY2017 (2016-10-01), which includes the 2015 data but extends the graphic longer then needed.