Visualization and normalization questions

bpbond commented 5 years ago

I'm playing around with the best way to QC the computed fluxes, and wanted to get your feedback @kaizadp because it depends on what you're interested in.

[Image: fluxes_co2]

[Image: fluxes_ch4]

Note these are raw fluxes, not corrected for core dry weight. Is this what you'd like them normalized by?

bpbond commented 5 years ago

So two questions in this issue.

1. Visualization: color by assignment as above? Something else? Facet by core (messy)?
2. Normalization: by what?

kaizadp commented 5 years ago

Color by assignment is good, perhaps faceted by soil type (soil vs. soil_sand). Normalization by dry weight for now. Because we added sand (inert, no carbon) to half the cores, I might also do a normalization by C content and compare the two. But I can do that later.
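
A minimal sketch of the two normalizations mentioned here, using base R only. The data frame, its column names (flux, dry_weight_g, carbon_fraction), and every number are made up for illustration; none of them come from the repository.

```r
# Hypothetical per-core data; column names and values are illustrative only
cores <- data.frame(
  core            = c("A", "B"),
  soil_type       = c("soil", "soil_sand"),
  flux            = c(10, 10),     # raw flux per core, arbitrary units
  dry_weight_g    = c(100, 200),   # total dry mass, sand included
  carbon_fraction = c(0.02, 0.01)  # g C per g dry material
)

# Normalization 1: flux per gram of dry material
cores$flux_per_g <- cores$flux / cores$dry_weight_g

# Normalization 2: flux per gram of carbon (dry weight x C fraction)
cores$carbon_g    <- cores$dry_weight_g * cores$carbon_fraction
cores$flux_per_gC <- cores$flux / cores$carbon_g

print(cores[, c("core", "soil_type", "flux_per_g", "flux_per_gC")])
```

In this made-up case the sand-amended core has half the flux per gram of dry material but the same flux per gram of carbon, which is the kind of difference the comparison would reveal.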

bpbond commented 5 years ago

Normalized and grouped as you suggest:

[Image: fluxes_co2]

[Image: fluxes_ch4]

kaizadp commented 5 years ago

Nice! I like this. The D (drying) and W (wetting) cores look different enough for the CO2.

bpbond commented 5 years ago

[Image: fluxes_co2]

kaizadp commented 5 years ago

Even better. I just pushed the latest batch of data but somehow ended up merging branches? Not sure what happened, it wouldn't let me push my data until I fetched the remote repo.

It hasn't changed any of the scripts, only the input data for core_weights and subsampling_weights.

20c1d741caccd2894d80abf564e5f9427c09dcd0

bpbond commented 5 years ago

😕 Okay. Things basically look the same.

I feel like things are at a good point, in that we're close to producing what you'll need for AGU?

bpbond commented 5 years ago

> make(plan)
target core_dry_weights
target picarro_raw                                                                                                  
Target picarro_raw messages:
  Found 1687 files
target core_masses
target picarro_clean                                                                                                
Target picarro_clean messages:
  Welcome to clean_picarro_data
target valve_key
target pcm
Target pcm messages:
  Welcome to match_picarro_data
target picarro_match_count
target valve_key_match_count
target picarro_clean_matched
target ghg_fluxes
Target ghg_fluxes messages:
  Welcome to compute_fluxes
target qc1
Warning: target qc1 warnings:
  Some valve key entries were not matched
Target qc1 messages:
  95 of 110 valve key entries were matched
  7025 of 124653 Picarro data entries were matched
  Saving 7 x 7 in image
target qc2
Target qc2 messages:
  Saving 7 x 7 in image
target qc3

kaizadp commented 5 years ago

Almost, I think.

Target ghg_fluxes messages:
  Welcome to compute_fluxes
fail ghg_fluxes
Error: Target `ghg_fluxes` failed. Call `diagnose(ghg_fluxes)` for details. Error message:
  length(na.omit(time)) > 2 is not TRUE

bpbond commented 5 years ago

Re-install picarro.data from GitHub.

kaizadp commented 5 years ago

And then if I do a = readd(ghg_fluxes), a is the final file I need for analyzing the data, correct?

bpbond commented 5 years ago

a <- readd("ghg_fluxes"), right.

kaizadp commented 5 years ago

The script doesn't see the latest bit of data I uploaded an hour ago (DATETIME Nov 10 onwards).

bpbond commented 5 years ago

Hmm. Quick fix: do clean() on the command line to clear out all targets and re-run the pipeline.

?

kaizadp commented 5 years ago

> clean()
Error in if (use_cache && exists0(hash, envir)) { : 
  missing value where TRUE/FALSE needed

bpbond commented 5 years ago

clean(destroy = TRUE)
make(plan)

?

kaizadp commented 5 years ago

same error

> clean(destroy = T)
Error in if (use_cache && exists0(hash, envir)) { : 
  missing value where TRUE/FALSE needed

bpbond commented 5 years ago

😕 Uh... OK, seems like something is goofy. Are you OK with running a command on the command line (in Terminal if on a Mac)? If so, from within the repository folder do:

rm -rf .drake/

If not, make sure your Git is clean (no changes), toss the repository into the trash, delete, and re-clone from GitHub.

bpbond commented 5 years ago

Or you could post a question to the drake maintainer, Will Landau, at https://github.com/ropensci/drake/issues

kaizadp commented 5 years ago

not sure if this is the correct response

WE37250:~ pate212$ /Users/pate212/OneDrive\ -\ PNNL/Documents/GitHub/hysteresis_github rm -rf .drake/
-bash: /Users/pate212/OneDrive - PNNL/Documents/GitHub/hysteresis_github: is a directory

also tried this, just in case it needed another forward-slash

WE37250:~ pate212$ /Users/pate212/OneDrive\ -\ PNNL/Documents/GitHub/hysteresis_github/ rm -rf .drake/
-bash: /Users/pate212/OneDrive - PNNL/Documents/GitHub/hysteresis_github/: is a directory
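
The "is a directory" message is what bash prints when a folder path is given where a command is expected: the path came first on the line, so the shell tried to execute it. The usual pattern is to `cd` into the folder first and then run `rm -rf .drake/` as a separate command. A rough R equivalent is sketched below. It runs against a throwaway temporary folder (the name hysteresis_demo is an assumption for illustration), not the real repository.

```r
# Stand-in for the repository folder; a temporary directory keeps the demo safe
repo  <- file.path(tempdir(), "hysteresis_demo")
cache <- file.path(repo, ".drake")
dir.create(cache, recursive = TRUE)
stopifnot(dir.exists(cache))

# Delete the cache folder and everything in it (similar to `rm -rf .drake/`)
unlink(cache, recursive = TRUE)

# FALSE once the cache folder is gone
print(dir.exists(cache))
```

For the real repository, `repo` would be the path to the cloned folder, or `cache` could simply be ".drake" if R's working directory is already the repository root.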

kaizadp commented 5 years ago

the Terminal option did not work
deleting and recloning did not work

another option, of course, is:

DOING THIS WITHOUT F-ING DRAKE

... did you convert timezones? Picarro data are in UTC, but valve_key are in LosAngeles. There's a mismatch of datetime between picarro_clean_matched and valve_key.

bpbond commented 5 years ago

Argh. No, not correct. Argh, f***ing OneDrive, I wonder if that's screwing things up. I wish I could sit down with you and take a look. So sorry this is giving you problems. Recloning didn't work?!?

Timezones: the Picarro is assigned UTC, and the valve file is assigned America/Los_Angeles. That should be all we need for correct operation, but if you suspect a problem, I can check.

bpbond commented 5 years ago

Yeah, try cloning to a non-OneDrive place please. Somewhere that f***ing piece of software isn't mirroring.

kaizadp commented 5 years ago

Cores 26-30 ran from Nov-08 15:05 to Nov-09 15:05 USA/LA time

Cores 81-85, etc. ran from Nov-09 15:40 to Nov-10 15:55

But picarro_clean_matched shows cores 26-27 running at Nov-09 23:04, ~8 hours ahead of its recorded USA/LA time

I also don't see any time zone conversion in the scripts, 0-packages or 3-picarro. Of course, it could be embedded within something else.
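
As a quick check of the arithmetic, using base R only and none of the project data: in November 2019 Los Angeles was on Pacific Standard Time (UTC-8), so a stop time of Nov-09 15:05 local is 23:05 UTC. A timestamp of 23:04 on Nov-09 in picarro_clean_matched would therefore be consistent with the same moment displayed in UTC, although that is an inference from the numbers above rather than something checked in the pipeline.

```r
# Stop time for cores 26-30, entered as local Los Angeles time
stop_local <- as.POSIXct("2019-11-09 15:05", tz = "America/Los_Angeles")

# The same instant, displayed in UTC
stop_utc <- format(stop_local, "%Y-%m-%d %H:%M", tz = "UTC")
print(stop_utc)

# Difference between the UTC clock reading and the local clock reading, in hours
offset_h <- as.numeric(difftime(
  as.POSIXct(stop_utc, tz = "UTC"),
  as.POSIXct("2019-11-09 15:05", tz = "UTC"),
  units = "hours"
))
print(offset_h)
```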

kaizadp commented 5 years ago

> Yeah, try cloning to a non-OneDrive place please. Somewhere that f***ing piece of software isn't mirroring.

I deleted the repo and cloned to a non-OneDrive folder. I'm getting the same output. No data after Nov-10. I think it's a date-matching issue.

bpbond commented 5 years ago

OK, so you can run the pipeline--that problem is resolved? Let me look at the above, back shortly.

kaizadp commented 5 years ago

Yes, I can run the pipeline, and I get the same output graph as you above. (https://github.com/kaizadp/hysteresis/issues/8#issuecomment-552563353)

bpbond commented 5 years ago

OK, let's see. The last Picarro file ends at 2019-11-11 19:58:32.154. picarro_clean ends at 2019-11-11 19:58:32. picarro_clean_matched ends at 2019-11-09 23:04:50. The last entry in the valve_key is 155 2019-11-09 15:40:00 2019-11-10 15:55:00.

Maybe I'm tired at the moment, but not immediately seeing the problem.
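
One way to narrow this down would be to count how many Picarro rows fall inside each valve-key window. The sketch below uses made-up data in base R; the column names (core, start, stop), the two windows, and the three timestamps are assumptions for illustration, and this is not the matching code the pipeline actually uses.

```r
# Made-up valve key: one row per measurement window, local Los Angeles time
valve_key <- data.frame(
  core  = c("26", "81"),
  start = as.POSIXct(c("2019-11-08 15:05", "2019-11-09 15:40"),
                     tz = "America/Los_Angeles"),
  stop  = as.POSIXct(c("2019-11-09 15:05", "2019-11-10 15:55"),
                     tz = "America/Los_Angeles")
)

# Made-up Picarro timestamps, in UTC
picarro_time <- as.POSIXct(
  c("2019-11-09 23:04:50", "2019-11-10 12:00:00", "2019-11-11 19:58:32"),
  tz = "UTC"
)

# Count Picarro rows inside each window. POSIXct comparisons use the
# underlying instant, so mixing time zones is safe here.
valve_key$n_matched <- vapply(
  seq_len(nrow(valve_key)),
  function(i) sum(picarro_time >= valve_key$start[i] &
                  picarro_time <= valve_key$stop[i]),
  integer(1)
)
print(valve_key)
```

Run against the real tables, a window with a count of zero would point to the valve-key entries that are not being matched.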

bpbond commented 5 years ago

Re timestamps, there's time zone conversion:

In 1-moisture_tracking.R,

    dplyr::mutate(Start_datetime = mdy_hm(Start_datetime, tz = "America/Los_Angeles"),
                  Stop_datetime = mdy_hm(Stop_datetime, tz = "America/Los_Angeles"),

And for the Picarro data, in 3-picarro_data.R:

    clean_data(tz = "UTC") %>%

bpbond commented 5 years ago

I don't like (because it's confusing) having these issues mixed together, so opening new ones if that's okay.

kaizadp / hysteresis_and_soil_carbon