USGS-R / protoloads

Prototyping and exploring options for broad-scale load forecasting

QC the nitrate data #37

Closed aappling-usgs closed 6 years ago

aappling-usgs commented 6 years ago

As a first cut, I'm noticing that the grab and sensor samples are off by possibly an order of magnitude:

image

image

image

aappling-usgs commented 6 years ago

Also getting this warning message, I think just with the 2nd and 3rd sites:

Warning message:
In EGRET::compressData(.) :
  Deleted 1642 rows of data because concentration was reported as 0.0, the program is unable to interpret that result and is therefore deleting it.

aappling-usgs commented 6 years ago

oh, nope, that warning was just because i had the value and comment columns switched.

jzwart commented 6 years ago

Is the switching of value and comment columns a reason why the grab vs. sensor are so far off or is that real?

aappling-usgs commented 6 years ago

i'm pretty sure that's real. I looked at the duplicated days, too, for the first site - grabs were < 0.04, sensor was 0.42ish
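A same-day comparison like the one described above could be sketched roughly as follows. This is only an illustration in base R: the data frames, column names (`grab_value`, `sensor_value`), and numbers are made up for the example and are not taken from the protoloads data, and the factor-of-10 threshold is an assumed cutoff.

```r
# Hypothetical toy data; values are illustrative only, not from the project.
grab <- data.frame(
  site_no    = c("A", "A", "B"),
  date       = as.Date(c("2017-01-01", "2017-01-02", "2017-01-01")),
  grab_value = c(0.04, 0.05, 1.6)
)
sensor <- data.frame(
  site_no      = c("A", "A", "B"),
  date         = as.Date(c("2017-01-01", "2017-01-03", "2017-01-01")),
  sensor_value = c(0.42, 0.40, 1.7)
)

# Keep only the site-days that appear in both tables
paired <- merge(grab, sensor, by = c("site_no", "date"))

# log10 of the sensor/grab ratio: 0 means agreement, +/-1 means a factor of 10
paired$log10_ratio <- log10(paired$sensor_value / paired$grab_value)

# Flag site-days where the two methods differ by a factor of 10 or more (assumed cutoff)
paired$suspect <- abs(paired$log10_ratio) >= 1

print(paired)
```

Working on the log scale treats "sensor 10x too high" and "sensor 10x too low" the same way, which is convenient when it is not yet clear which of the two series is wrong.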

jzwart commented 6 years ago

hmm.. parm_cd help says the units are the same, mg/L as N. When I checked the data from the nwis.rds file, I get the same order of magnitude for both:

summary(d$nitrate_grab$result_va)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.010   1.020   1.600   2.704   3.808  15.000

summary(d$nitrate_sensor$result_va)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  0.000   1.080   2.870   3.749   6.190  30.700       3
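As a quick arithmetic check on the "same order of magnitude" reading, the medians and means reported in the two summaries can be compared on a log10 scale. This is just a sketch using the four numbers quoted above; the variable names are made up for the example.

```r
# Medians and means copied from the two summary() outputs in this comment
grab_median   <- 1.600
sensor_median <- 2.870
grab_mean     <- 2.704
sensor_mean   <- 3.749

# log10 of the sensor/grab ratio; an absolute value near 1 would mean a 10x gap
median_gap <- log10(sensor_median / grab_median)
mean_gap   <- log10(sensor_mean / grab_mean)

print(round(c(median = median_gap, mean = mean_gap), 2))
```

Both gaps come out well below 1 (roughly 0.25 and 0.14), so the raw values in nwis.rds differ by less than a factor of two, not by a factor of ten.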

aappling-usgs commented 6 years ago

oh, cool, that's good news! means it's probably something fixable

aappling-usgs commented 6 years ago

Confirming (again, after repulling) that nwis.rds looks fine with respect to sample-sensor agreement:

library(ggplot2)

d <- readRDS('1_data/out/nwis.rds')
# sensor record as a blue line, grab samples as red points, one panel per site
ggplot(d$nitrate_sensor, aes(x=dateTime, y=result_va)) +
  geom_line(color='blue') +
  geom_point(data=d$nitrate_grab, color='red', aes(x=startDateTime), size=0.3) +
  facet_grid(site_no ~ .) +
  theme_classic()

image

aappling-usgs commented 6 years ago

And the aggregated data now looks just fine to me, too. I think maybe I was somehow only seeing or plotting the censored dates (light blue) at the top of this issue.

agg_nwis <- readRDS('2_munge/out/agg_nwis.rds')
ggplot(agg_nwis$nitrate_sensor, aes(x=date, y=daily_mean, shape=daily_cd)) +
  geom_line(color='blue') +
  geom_point(data=agg_nwis$nitrate_grab, size=1, aes(color=daily_cd)) +
  facet_grid(site_no ~ ., scales='free_y') +
  theme_classic()

image

library(dplyr)  # for inner_join()

# after the join, daily_mean.x is the sensor value and daily_mean.y is the grab sample value
ggplot(inner_join(agg_nwis$nitrate_sensor, agg_nwis$nitrate_grab, by=c('site_no','date')),
       aes(x=daily_mean.x, y=daily_mean.y)) +
  geom_abline() +
  geom_point() +
  facet_wrap(~site_no, scales='free') +
  theme_classic() +
  xlab('Sensor NO3-') +
  ylab('Grab Sample NO3-')

image

aappling-usgs commented 6 years ago

I've limited this title to nitrate data only because I think we can close that piece.