IMCR-Hackathon / Hackathon-Central-2018

Command center for IMCR Hackathon participants to share ideas, coordinate teams, develop projects and access all logistics information
Detect incomplete sampling records #3

Open kcawley opened 6 years ago

kcawley commented 6 years ago

We have several protocols whose data span multiple tables: field data collected with a mobile app (mostly metadata, e.g., site, date, and handheld water quality (WQ) measurements of DO, pH, temp, etc.), laboratory analyses entered by spreadsheet upload or a mobile app (e.g., dry mass, ash mass, or titrations), and external laboratory analyses ingested through spreadsheet upload (e.g., taxonomic identification or isotopes). With 81 sampling sites and bouts scheduled on different calendar dates, manually checking every site against every set of tables would be extremely time-consuming. A script that counts the records in each table for specific site and date combinations and produces a report would be extremely useful for tracking down data gaps, late external lab data, or missed sampling bouts as soon as possible.
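One possible shape for such a script is sketched below in plain Python. Everything specific here is an assumption made for illustration, not the project's actual schema: the table names (`field`, `lab`, `external_lab`), the `site` and `date` column names, the site codes, and the idea that the expected bouts are available as a list of (site, date) pairs. The sketch counts records per site and date in each table and flags the tables that have none.

```python
from collections import Counter


def count_records(rows, key_fields=("site", "date")):
    """Count rows per (site, date) key in a single table."""
    return Counter(tuple(row[f] for f in key_fields) for row in rows)


def completeness_report(tables, expected_keys):
    """Build one report entry per expected (site, date) combination.

    tables: dict mapping a table name to a list of dict rows
    expected_keys: iterable of (site, date) tuples from the sampling schedule
    """
    counts = {name: count_records(rows) for name, rows in tables.items()}
    report = []
    for key in sorted(expected_keys):
        # Counter returns 0 for keys it has never seen, so absent
        # site/date combinations show up as zero-record tables.
        per_table = {name: counts[name][key] for name in tables}
        missing = sorted(name for name, n in per_table.items() if n == 0)
        report.append(
            {"site": key[0], "date": key[1], "counts": per_table, "missing": missing}
        )
    return report


# Hypothetical data: table names, column names, and site codes are made up.
tables = {
    "field": [
        {"site": "SITE_A", "date": "2018-05-01"},
        {"site": "SITE_B", "date": "2018-05-02"},
    ],
    "lab": [
        {"site": "SITE_A", "date": "2018-05-01"},
        {"site": "SITE_A", "date": "2018-05-01"},
    ],
    "external_lab": [],
}
schedule = [
    ("SITE_A", "2018-05-01"),
    ("SITE_B", "2018-05-02"),
    ("SITE_C", "2018-05-03"),
]

report = completeness_report(tables, schedule)
for entry in report:
    if entry["missing"]:
        print(entry["site"], entry["date"], "no records in:", ", ".join(entry["missing"]))
```

A combination that is missing from every table (here the made-up `SITE_C`) would suggest a missed bout, while one that is missing only from `external_lab` would suggest late external lab data. Real tables would likely need the site and date values normalized before they can be matched this way.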

jhp7e commented 6 years ago

I've seen an R package, part of a cluster of R code for science that Stephen Earl pointed me to, that would be good to help with this, but I can't find the link right now (maybe it was naniar: http://naniar.njtierney.com/articles/getting-started-w-naniar.html). The VIM or MissingDataGUI packages in R might also be useful.
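To illustrate the kind of per-variable missingness summary that missing-data tools produce, here is a minimal plain-Python sketch. It does not use or reproduce the API of naniar, VIM, or MissingDataGUI; the function name `missing_summary`, the column names, and the rule that `None` and empty strings count as missing are all assumptions for the example.

```python
def missing_summary(rows, fields, is_missing=lambda v: v is None or v == ""):
    """Return (field, n_missing, pct_missing) tuples, most-missing first."""
    total = len(rows)
    summary = []
    for field in fields:
        # row.get() treats an absent column the same as an explicit None.
        n = sum(1 for row in rows if is_missing(row.get(field)))
        pct = 100.0 * n / total if total else 0.0
        summary.append((field, n, pct))
    # sorted() is stable, so ties keep the order given in `fields`.
    return sorted(summary, key=lambda item: item[1], reverse=True)


# Hypothetical handheld water-quality rows; values are made up.
wq_rows = [
    {"site": "SITE_A", "DO": 8.1, "pH": 7.2, "temp": None},
    {"site": "SITE_B", "DO": None, "pH": "", "temp": None},
]

wq_summary = missing_summary(wq_rows, ["DO", "pH", "temp"])
for field, n, pct in wq_summary:
    print(f"{field}: {n} missing ({pct:.0f}%)")
```

This looks at gaps inside a table (empty cells), which complements counting whole records per site and date.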