lter / lterwg-som

Soil Organic Matter Synthesis working group
https://lter.github.io/som-website/

NutNet join script duplicated chemistry values across years #70

Closed (piersond closed 5 years ago)

piersond commented 5 years ago

Need to fix NutNet join script to properly carry over chemistry (e.g. lyr_soc) for each sample year. Right now, the values for all chem analytes (columns) are duplicated across years.

piersond commented 5 years ago

@srearl @wwieder The raw data files I've found in the zip files provided for NutNet (e.g. "comb-by-plot-clim-soil-diversity-02-Aug-2019.csv") already have the soil analyte data duplicated across years within each plot. This suggests the duplication is in what we received from the data providers and is not a subsequent script error on our side. The metadata for the csv file says that the "perC" column is "pre-treatment soil % Carbon by mass."

@wwieder Perhaps... 1) ...this is an error in the combine script used by the data providers? 2) ...the post-treatment percent C data simply does not exist? 3) ...post-trt soil data has been intentionally left out?
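One way to confirm the duplication described above is to check, per plot, whether an analyte holds a single identical value across every sample year. The following is a minimal sketch in plain Python (not the R used in the repo); the column names "plot" and "year_trt" and all values are made up for illustration, and only "perC" comes from the provider file.

```python
from collections import defaultdict

def analytes_constant_across_years(rows, plot_key, year_key, analytes):
    """Return {analyte: [plots]} for plots sampled in more than one year
    whose analyte value is non-missing and identical in every year."""
    # analyte -> plot -> {year: value}
    values = defaultdict(lambda: defaultdict(dict))
    for row in rows:
        for analyte in analytes:
            values[analyte][row[plot_key]][row[year_key]] = row.get(analyte)

    flagged = {}
    for analyte in analytes:
        plots = []
        for plot, by_year in values[analyte].items():
            vals = list(by_year.values())
            same = all(v == vals[0] for v in vals)
            if len(by_year) > 1 and vals[0] is not None and same:
                plots.append(plot)
        flagged[analyte] = sorted(plots)
    return flagged

# Toy rows: plot A1 repeats its value every year, plot B2 does not.
rows = [
    {"plot": "A1", "year_trt": 0, "perC": 1.8},
    {"plot": "A1", "year_trt": 1, "perC": 1.8},
    {"plot": "A1", "year_trt": 2, "perC": 1.8},
    {"plot": "B2", "year_trt": 0, "perC": 2.4},
    {"plot": "B2", "year_trt": 1, "perC": 2.9},
]
flagged = analytes_constant_across_years(rows, "plot", "year_trt", ["perC"])
print(flagged)
```

If nearly every plot is flagged for every soil analyte, that is consistent with a single pre-treatment measurement having been copied to all years, though it cannot by itself say which of the three explanations applies.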

piersond commented 5 years ago

For the moment, the idea for fixing this is to add a section to the join script that blanks the repeated soil data columns (sets them to NA) in rows where year != 0, so the pre-treatment values stay only on the year-0 rows. Script this at the tarball level. @piersond
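The rule proposed here can be sketched as a simple row-wise loop. This is an illustrative Python version, not the script in the repo; the "year" and "plot" keys and the values are hypothetical, and lyr_soc stands in for the full list of soil columns.

```python
def blank_repeated_soil_values(rows, soil_columns, year_key="year"):
    """Return copies of rows with soil columns set to None wherever
    the treatment year is not 0 (i.e. not the pre-treatment sample)."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row[year_key] != 0:
            for col in soil_columns:
                if col in new_row:
                    new_row[col] = None
        cleaned.append(new_row)
    return cleaned

# Toy rows: the same pre-treatment value repeated across three years.
rows = [
    {"plot": "A1", "year": 0, "lyr_soc": 1.8},
    {"plot": "A1", "year": 1, "lyr_soc": 1.8},
    {"plot": "A1", "year": 2, "lyr_soc": 1.8},
]
cleaned = blank_repeated_soil_values(rows, ["lyr_soc"])
print([r["lyr_soc"] for r in cleaned])
```

Only the year-0 row keeps its value, which matches the provider metadata describing these columns as pre-treatment measurements.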

piersond commented 5 years ago

Update: the NutNet folder needs to be rehomogenized (rehomog). The keykey had an improper column name for treatment year; I've fixed this in the keykey. It is ready for rehomog, and I will try to do this tomorrow or Friday. For after the homog, I have written a rough draft of a script to clean the duplicate NutNet data from the tarball. It is saved in "data processing" > "keyV2 scripts" > "fixes".

wwieder commented 5 years ago

Looks like you did this rehomog, Derek? If so, do we need a new tarball to work with before you get rid of the duplicated data?

(Sent by email in reply to Derek Pierson's update above, dated Wed, Oct 23, 2019 at 3:43 PM.)

piersond commented 5 years ago

Yep, rehomog done and NutNet cleaner script written. Stevan's planning to get the new tarball together early this week.

srearl commented 5 years ago

Tarball 2019-10-27 posted. @piersond, double-check my results, but I think I was able to replicate your for-loop with: mutate_at(.vars = NN_columns_to_clean, .funs = ~replace(., grepl("nutnet", network, ignore.case = T) & observation_date >= tx_start, NA)). Note that you had capitalized the L of fe_HCl, which I did not correct in your script on the repo.
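For readers who want to check the logic outside R, here is a rough Python analogue of the condition in that mutate_at call: blank the listed columns on rows whose network matches "nutnet" (case-insensitive) and whose observation_date is on or after tx_start. The function name clean_nutnet and the sample dates and values are made up. Treating rows with a missing date as untouched is an assumption of this sketch, not something the thread specifies. The column-name check is also an addition here, included because a case mismatch such as fe_HCL versus fe_HCl is easy to miss.

```python
import re
from datetime import date

NUTNET = re.compile("nutnet", re.IGNORECASE)

def clean_nutnet(rows, columns_to_clean):
    """Return copies of rows with columns_to_clean set to None on NutNet
    rows observed on or after the treatment start date."""
    # Fail loudly on a misspelled column instead of silently skipping it.
    known = set().union(*(row.keys() for row in rows))
    missing = [c for c in columns_to_clean if c not in known]
    if missing:
        raise KeyError(f"columns not found in data: {missing}")

    cleaned = []
    for row in rows:
        new_row = dict(row)
        obs, start = row.get("observation_date"), row.get("tx_start")
        is_nutnet = bool(NUTNET.search(row.get("network") or ""))
        # Assumption: rows with a missing date are left unchanged.
        if is_nutnet and obs is not None and start is not None and obs >= start:
            for col in columns_to_clean:
                new_row[col] = None
        cleaned.append(new_row)
    return cleaned

# Toy rows: one pre-treatment NutNet row, one post-treatment NutNet row,
# and one row from another network.
rows = [
    {"network": "NutNet", "observation_date": date(2007, 6, 1),
     "tx_start": date(2008, 5, 1), "lyr_soc": 1.8, "fe_HCl": 0.4},
    {"network": "NutNet", "observation_date": date(2009, 6, 1),
     "tx_start": date(2008, 5, 1), "lyr_soc": 1.8, "fe_HCl": 0.4},
    {"network": "LTER", "observation_date": date(2009, 6, 1),
     "tx_start": date(2008, 5, 1), "lyr_soc": 3.1, "fe_HCl": 0.7},
]
cleaned = clean_nutnet(rows, ["lyr_soc", "fe_HCl"])
print([r["lyr_soc"] for r in cleaned])
```

Passing ["fe_HCL"] instead of ["fe_HCl"] raises a KeyError in this sketch rather than leaving that column uncleaned.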