Closed jaxinewolfe closed 3 months ago
Resolving errors in existing hook scripts:
The following cores have a depth interval where the min and max are likely reversed:
study_id | core_id |
---|---|
Thom_1992 | PB1 |
DelVecchia_et_al_2014 | M1314 |
Costa_et_al_2023 | PanamaCaribbean-Station Forest (SF)-1 |
MacKenzie_et_al_2021 | Catanauan_216_714 |
Sharma_et_al_2021 | Koh_Kohng_138_384 |
Sharma_et_al_2021 | Koh_Kohng_138_386 |
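A check along these lines could flag reversed intervals automatically. This is a minimal sketch in base R, assuming the depthseries table carries `depth_min` and `depth_max` columns; the `find_reversed_depths` helper and the toy data are hypothetical and not part of the existing hook scripts.

```r
# Hypothetical helper: return study/core pairs where depth_min exceeds depth_max.
# Assumes columns study_id, core_id, depth_min, depth_max.
# Intervals with NA depths are skipped here and need their own check.
find_reversed_depths <- function(depthseries) {
  reversed <- !is.na(depthseries$depth_min) &
    !is.na(depthseries$depth_max) &
    depthseries$depth_min > depthseries$depth_max
  unique(depthseries[reversed, c("study_id", "core_id")])
}

# Toy data for illustration only (not real synthesis values)
toy_depths <- data.frame(
  study_id  = c("Study_A", "Study_A", "Study_B", "Study_C"),
  core_id   = c("core_1", "core_1", "core_2", "core_3"),
  depth_min = c(0, 10, 30, NA),
  depth_max = c(10, 20, 20, 5)
)

reversed_cores <- find_reversed_depths(toy_depths)
print(reversed_cores)
```

Only `core_2` is flagged in the toy data, since its interval runs from 30 to 20.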
The following studies have intervals with NA depths:
"Nahlik_and_Fennessy_2016" "Langston_et_al_2022" "De_Iongh_et_al_1995" "Agawin_et_al_1996" "Townsend_and_Fonseca_1998" "Holmer_et_al_2007" "Van_Engeland_2010" "Boyd_et_al_2017"
Studies with NA Coords:
Studies with NA Habitat:
Studies with Modeled C:
Synthesis and Post-Processing
We need a QAQC function to catch studies that have no associated citation
Some starter code:
library(dplyr)
# Cores whose study_id has no matching entry in the study_citations table
no_citations <- ccrcn_synthesis$cores %>% filter(!(study_id %in% unique(ccrcn_synthesis$study_citations$study_id)))
if (nrow(no_citations) > 0) {
  # Print the affected study IDs first so the warning's "above studies" refers to them
  print(unique(no_citations$study_id))
  warning("NOTE: The above studies were removed because they did not have citation information present. Please review the CCN library synthesis to confirm that all synthesis studies have proper study citation information in '/data/CCN_synthesis/CCN_study_citations.csv' ")
}
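One way the starter code could be wrapped into a reusable QAQC function is sketched below. It is written in base R so it runs without extra packages; the function name `test_citations_present` and the toy tables are hypothetical, and the only assumption about the real tables is that both have a `study_id` column.

```r
# Hypothetical QAQC helper: return study_ids present in the cores table
# but absent from the study_citations table, warning if any are found.
test_citations_present <- function(cores, study_citations) {
  missing_ids <- setdiff(
    unique(as.character(cores$study_id)),
    unique(as.character(study_citations$study_id))
  )
  if (length(missing_ids) > 0) {
    warning(
      "Studies without citation information: ",
      paste(missing_ids, collapse = ", "),
      call. = FALSE
    )
  }
  missing_ids
}

# Toy tables for illustration only
toy_cores <- data.frame(
  study_id = c("Study_A", "Study_B", "Study_B"),
  core_id  = c("a1", "b1", "b2")
)
toy_citations <- data.frame(study_id = "Study_A")

missing_citations <- suppressWarnings(
  test_citations_present(toy_cores, toy_citations)
)
print(missing_citations)
```

Returning the IDs (rather than only printing them) lets the result be written into the synthesis report alongside the other test outcomes.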
@cheneyr @BettsH
Here are the QAQC results for our current version of the synthesis. It looks like a lot, but I think it's fairly small stuff that we can knock out! For example, there's a bunch of columns starting with "..." which resulted from tables being output using write.csv() without specifying row.names = F (idk why the default is set to TRUE, it's annoying), so a column with the row number index got included. If you spot stuff that is related to datasets you've worked on, you can go for those quick fixes, or we can chat about some of the more nuanced things in our meeting (or whenever).
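To illustrate the row-name issue, here is a small sketch using only base R and temporary files. The toy table is made up; the point is just the difference in the header line that `write.csv()` produces with and without `row.names = FALSE`. Readers that repair blank column names (readr, for instance) then typically give that unnamed first column a name like `...1`.

```r
# Toy table for illustration only
toy_table <- data.frame(
  study_id = c("Study_A", "Study_B"),
  core_id  = c("a1", "b1")
)

with_index    <- tempfile(fileext = ".csv")
without_index <- tempfile(fileext = ".csv")

# Default behaviour: row names are written as an extra, unnamed first column
write.csv(toy_table, with_index)
# Suppressing row names keeps only the real columns
write.csv(toy_table, without_index, row.names = FALSE)

header_with_index    <- readLines(with_index, n = 1)
header_without_index <- readLines(without_index, n = 1)

print(header_with_index)     # header begins with an empty "" column name
print(header_without_index)  # header holds only study_id and core_id
```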
Also, we are well past 10k cores, holy moly! 🎉
index | test | result |
---|---|---|
1 | Core ID uniqueness | Check the following core_id(s) in the core-level data: 2B, 305, 398, 399, AL, B1, B2, B3, B4, G15, G4, G5, G9 |
2 | Valid core ID links in core table | No core ID in depthseries table: WBWA1109_01PU, WBWA1109_02PU, WBWA1109_03PU, WBWA1109_04PU, NSOR1209_01PU, NSOR1209_02PU, NSOR1209_03PU, NSOR1209_04PU, NBOR1409_01PU, NBOR1409_02PU, NBOR1409_03PU, Catlett_1m, Catlett_Transect, Goodwin_1m, Goodwin_Transect, Pamunkey_Transect, SweetHall_1m, SweetHall_Transect, Taskinas_Transect |
3 | Valid core ID links in depthseries table | No core ID in core table: PB1, RC_U_A, RC_M_A, PR_U_A, PR_M_A, W_U_A, W_M_A, F_U_A, F_M_A |
4 | Test coordinate uniqueness | 1373 sets of coordinates are associated with more than one core. Check 'data/QA/duplicate_coordinates.csv' |
5 | Validity of column names in depthseries table | Undefined columns: ...33, ...38, th234_activity, th234_activity_se, k40_activity, k40_activity_se, ...56, ...57, date, pb210_crs_age, pb210_crs_age_sd |
6 | Validity of column names in cores table | Undefined columns: ...29, salinity, ...31, ecological_condition_flag, ...37, ...38, core_date, core_position_method, geomorphic_id, ...42 |
7 | Validity of column names in sites table | Passed |
8 | Validity of column names in species table | Undefined columns: ...7, ...8 |
9 | Validity of column names in impacts table | Undefined columns: impact_notes, ...6 |
10 | Validity of column names in methods table | Undefined columns: ...30, ...32, ground_or_sieved_flag, ...35, pb210_background_assumption, ...37 |
11 | Validity of column names in study_citations table | Undefined columns: keywords, day, ...20, issue, ...22, issn, abstract, eprint, ...30, ...31, article-number |
12 | Validity of variable names in depthseries table | Passed |
13 | Validity of variable names in cores table | Undefined variables: WGS84, riverine, palustrine, deltaic, brackish to fresh, brackish to saline, other, mudflat, plain, submerged subtidal |
14 | Validity of variable names in sites table | Undefined variables: palustrine |
15 | Validity of variable names in species table | Passed |
16 | Validity of variable names in impacts table | Undefined variables: managed, restoring, canalled |
17 | Validity of variable names in methods table | Undefined variables: PVC tube or thin-walled metal tube, Eijkelkamp peat core sampler, shovel, shovel core, gouge corer, polycarbonate tube, duplicate measurements, duplicate measurements, ground and sieved, not specified, not specified, not specified, selected intervals |
18 | Validity of variable names in study_citations table | Undefined variables: primary source, article |
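For reference, the ID checks in tests 1 to 3 can be approximated with a few set operations. The sketch below is not the actual QAQC code; `check_core_links` and the toy tables are hypothetical, and it matches on `core_id` alone, whereas the real tests may also take `study_id` into account.

```r
# Hypothetical helper: report duplicated core IDs and core IDs that appear
# in only one of the two tables.
check_core_links <- function(cores, depthseries) {
  core_ids  <- as.character(cores$core_id)
  depth_ids <- unique(as.character(depthseries$core_id))
  list(
    duplicated_core_ids       = unique(core_ids[duplicated(core_ids)]),
    cores_without_depthseries = setdiff(unique(core_ids), depth_ids),
    depthseries_without_cores = setdiff(depth_ids, unique(core_ids))
  )
}

# Toy tables for illustration only
toy_cores <- data.frame(
  study_id = c("Study_A", "Study_B", "Study_C"),
  core_id  = c("B1", "B1", "C1")
)
toy_depthseries <- data.frame(
  study_id = c("Study_A", "Study_B", "Study_D"),
  core_id  = c("B1", "B1", "D1")
)

link_report <- check_core_links(toy_cores, toy_depthseries)
str(link_report)
```

In the toy data, `B1` is reused by two studies, `C1` has no depthseries rows, and `D1` has no core-level record.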
Synthesis QAQC Checks:
Note: the following have now been added to the synthesis report output
Depthseries
Cores
Bib
@cheneyr
Thanks for renaming the citation tables! Some are still getting flagged in the synthesis QA and it looks like it's because the study_id was left out. (Though the citation table for Drake 2024 may still be missing from the derivative folder). So one more annoying edit there for the following:
[1] "Stahl_et_al_2024" "Palinkas_and_Engelhardt_2024"
[3] "Palinkas_and_Cornwell_2024" "Drake_et_al_2024"
[5] "Craft_2024"
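A quick way to find citation tables that lack the column could look like the sketch below. It assumes that "left out" means the table has no `study_id` column and that the citation tables are CSV files in a single folder; the helper name, folder, and file names are all hypothetical, and the example builds its own temporary folder so it does not touch the derivative folder.

```r
# Hypothetical helper: names of CSV files in `dir` that have no study_id column
find_tables_missing_study_id <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  lacks_id <- vapply(files, function(f) {
    !("study_id" %in% names(read.csv(f, nrows = 1)))
  }, logical(1))
  basename(files[lacks_id])
}

# Toy folder with one complete and one incomplete citation table
toy_dir <- tempfile("citations_")
dir.create(toy_dir)
write.csv(
  data.frame(study_id = "Study_A", title = "A"),
  file.path(toy_dir, "Study_A_study_citations.csv"),
  row.names = FALSE
)
write.csv(
  data.frame(title = "B"),
  file.path(toy_dir, "Study_B_study_citations.csv"),
  row.names = FALSE
)

flagged_files <- find_tables_missing_study_id(toy_dir)
print(flagged_files)
```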
@BettsH Could you take a look at the coordinates for Bukoski et al. 2017 in the Sanderman synthesis? They were made fuzzy per the authors' request, but a few have ended up in Laos when they should be in Vietnam (so, a bit too fuzzy). Maybe check out the original paper or the supplementary data and/or use Google Earth Engine to see if we can update those so they at least get assigned the right country.
Cores in question: "M1566" "M1567" "M1568" "M1569" "M1570" "M1571" "M1572" "M1573" "M1574" "M1575" "M1576" "M1577" "M1578"
@cheneyr two tasks for you (if you haven't already done them)!
Thank you!
This update is scheduled for March 2024
Please continue working in the develop branch!
Overall Goals:
Additional goals: