CalCOFI / calcofi4r

R package of data wrangling and visualization functions for working with CalCOFI data
https://calcofi.io/calcofi4r

document functions and data #1

Closed bbest closed 2 years ago

bbest commented 2 years ago

So far we've got some functions and data:

https://calcofi.github.io/calcofi4r/reference/index.html

As @cdobbelaere incorporates more functions and data, she could use some help with updating documentation TBD:

References

bbest commented 2 years ago

Hi @superjai, @cdobbelaere will be updating #2 with more functions and data to document.

superjai commented 2 years ago

Hey @bbest - I just wanted to double check that the datasets included here are the ones that you actually want. Bottle.rda and dic.rda don't contain oceanographic data - instead they contain metadata about the oceanographic data. I think that additional data is actually needed here and I will suggest the additional data as we go in this post.

Bottle data

The bottle data info can be found here (at least for the moment: this is an older version of the CalCOFI site that will soon be taken down, and the newer version is not loading properly). On this page, two datasets are provided: the Cast table and the Bottle table. They are described as follows:

The Cast table contains metadata. This table includes date, time, latitude, longitude, weather, etc. for each CTD cast ever completed on a CalCOFI cruise. Each row is a unique cast, numbered sequentially/indexed by the "Cst_Cnt" column.

The Bottle table contains oceanographic data. This table includes oceanographic measurements for each bottle/sampling depth ever completed on a CalCOFI cruise. There are additional data code and precision columns describing the quality of each oceanographic measurement. Each row is a unique bottle/sampling depth, numbered sequentially/indexed by the "Btl_Cnt" column.

Based upon the columns in the Bottle.rda file, this file is actually the cast table. I think we want the bottle table instead. Or at least both the cast and bottle tables.
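
One way to check which table a file holds is to look at its index columns: per the descriptions above, the Cast table is indexed by "Cst_Cnt" and the Bottle table by "Btl_Cnt". A minimal sketch in R, using toy data frames rather than the real Bottle.rda contents; the `table_level()` helper is hypothetical and not part of calcofi4r:

```r
# Toy stand-ins for the two tables (made-up values, not CalCOFI data)
cast_like <- data.frame(
  Cst_Cnt = 1:3,
  Lat_Dec = c(38.8, 38.7, 38.6)
)
bottle_like <- data.frame(
  Cst_Cnt = c(1L, 1L, 2L),
  Btl_Cnt = 1:3,
  Depthm  = c(0, 10, 0),
  T_degC  = c(10.5, 10.4, 11.0)
)

# Hypothetical helper: classify a table by the index column it carries
table_level <- function(df) {
  if ("Btl_Cnt" %in% names(df)) {
    "bottle"
  } else if ("Cst_Cnt" %in% names(df)) {
    "cast"
  } else {
    "unknown"
  }
}

table_level(cast_like)
table_level(bottle_like)
```

A table that carries both index columns is classified as bottle-level here, since each row would then describe one bottle/sampling depth.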

DIC data

The info about the DIC data can be found here. The columns in the dic.rda file don't match the DIC column headers listed on that page. I am not sure what dic.rda actually is (I can't find anything that matches it on the CalCOFI site), but it appears to be metadata only, similar to the cast table for the bottle data. I think we want the actual DIC data instead, or at least in addition; it is provided at the link at the top of this section.
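
A column mismatch like this can be made explicit with `setdiff()`. The sketch below is only illustrative: `compare_columns()` is a hypothetical helper, and the `expected` and `actual` vectors are placeholders, not the published DIC headers or the real contents of dic.rda.

```r
# Hypothetical helper: report expected columns that are missing
# and columns that are present but not expected
compare_columns <- function(actual, expected) {
  list(
    missing = setdiff(expected, actual),
    extra   = setdiff(actual, expected)
  )
}

# Placeholder inputs for illustration only
expected <- c("DIC1", "DIC2", "TA1", "TA2")
actual   <- c("Cst_Cnt", "DIC1", "DIC2")

compare_columns(actual, expected)
```

With the real data, `actual` would be `names(dic)` and `expected` would be the header list copied from the DIC page.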

Thoughts?

bbest commented 2 years ago
devtools::load_all()
names(bottle)
  [1] "Cst_Cnt"             "Cruise_ID"           "Cruise"             
  [4] "Cruz_Sta"            "DbSta_ID"            "Cast_ID"            
  [7] "Sta_ID"              "Sta_ID_line"         "Sta_ID_station"     
 [10] "Quarter"             "Sta_Code"            "Distance"           
 [13] "Date"                "Year"                "Month"              
 [16] "Julian_Date"         "Julian_Day"          "Time"               
 [19] "Lat_Dec"             "Lat_Deg"             "Lat_Min"            
 [22] "Lat_Hem"             "Lon_Dec"             "Lon_Deg"            
 [25] "Lon_Min"             "Lon_Hem"             "Rpt_Line"           
 [28] "St_Line"             "Ac_Line"             "Rpt_Sta"            
 [31] "St_Station"          "Ac_Sta"              "Bottom_D"           
 [34] "Secchi"              "ForelU"              "Ship_Name"          
 [37] "Ship_Code"           "Data_Type"           "Order_Occ"          
 [40] "Event_Num"           "Cruz_Leg"            "Orig_Sta_ID"        
 [43] "Data_Or"             "Cruz_Num"            "IntChl"             
 [46] "IntC14"              "Inc_Str"             "Inc_End"            
 [49] "PST_LAN"             "Civil_T"             "TimeZone"           
 [52] "Wave_Dir"            "Wave_Ht"             "Wave_Prd"           
 [55] "Wind_Dir"            "Wind_Spd"            "Barometer"          
 [58] "Dry_T"               "Wet_T"               "Wea"                
 [61] "Cloud_Typ"           "Cloud_Amt"           "Visibility"         
 [64] "Btl_Cnt"             "Depth_ID"            "Depthm"             
 [67] "T_degC"              "Salnty"              "O2ml_L"             
 [70] "STheta"              "O2Sat"               "Oxy_µmol/Kg"        
 [73] "BtlNum"              "RecInd"              "T_prec"             
 [76] "T_qual"              "S_prec"              "S_qual"             
 [79] "P_qual"              "O_qual"              "SThtaq"             
 [82] "O2Satq"              "ChlorA"              "Chlqua"             
 [85] "Phaeop"              "Phaqua"              "PO4uM"              
 [88] "PO4q"                "SiO3uM"              "SiO3qu"             
 [91] "NO2uM"               "NO2q"                "NO3uM"              
 [94] "NO3q"                "NH3uM"               "NH3q"               
 [97] "C14As1"              "C14A1p"              "C14A1q"             
[100] "C14As2"              "C14A2p"              "C14A2q"             
[103] "DarkAs"              "DarkAp"              "DarkAq"             
[106] "MeanAs"              "MeanAp"              "MeanAq"             
[109] "IncTim"              "LightP"              "R_Depth"            
[112] "R_TEMP"              "R_Sal"               "R_DYNHT"            
[115] "R_Nuts"              "R_Oxy_µmol/Kg"       "DIC1"               
[118] "DIC2"                "TA1"                 "TA2"                
[121] "pH1"                 "pH2"                 "DIC Quality Comment"

Then in R/data.R populate fields like:

#' @format TBD...e.g., A data frame with 53940 rows and 10 variables:
#' \describe{
#'   \item{price}{price, in US dollars}
#'   \item{carat}{weight of the diamond, in carats}
#'   ...
#' }
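
With 121 columns, typing the `\item` lines by hand is error-prone. As a rough sketch, a skeleton could be generated from the column names and pasted into R/data.R. The `format_skeleton()` helper is hypothetical (not part of calcofi4r or roxygen2), and the toy data frame stands in for the real dataset:

```r
# Hypothetical helper: build a roxygen @format skeleton from a data frame
format_skeleton <- function(df) {
  c(
    sprintf("#' @format A data frame with %d rows and %d variables:",
            nrow(df), ncol(df)),
    "#' \\describe{",
    sprintf("#'   \\item{%s}{TBD}", names(df)),  # one line per column
    "#' }"
  )
}

# Toy data frame standing in for the real dataset
toy <- data.frame(Cst_Cnt = 1:2, Depthm = c(0, 10))
writeLines(format_skeleton(toy))
```

Each `TBD` would then be replaced by the field description from the CalCOFI documentation.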
superjai commented 2 years ago

Hey @bbest. Running into some issues with the documentation - would appreciate your insights.

  1. If you check out the data.R file (which contains the data documentation), you'll see that there are huge introductory sections for the bottle and dic datasets. Unfortunately, only the first paragraph of each section is actually included in the resulting html page (example here). I tried using the @description and @details tags with the bottle dataset description, in hopes that the use of @details would allow the extra info to be included - no dice though.
  2. The Data_Or field in the Bottle and DIC datasets is not documented. Unfortunately, the meaning of this field has been left blank on the webpage that describes the CalCOFI fields. I have emailed Renae Logston (Data Analyst at Scripps) about this - will fill in when I get a response.
superjai commented 2 years ago

It turns out that using @details did allow the extra info to be included (just in a location that I wasn't expecting). I improved the formatting and rearranged the ordering of information in data.R to match the ordering of information in the resulting HTML files.
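
For reference, a minimal sketch of how such an entry in data.R might be laid out. All wording in it is placeholder text, not the actual calcofi4r documentation. In roxygen2, the text under `@details` goes into its own Details section, which in the rendered page typically sits below the Format section rather than directly under the description:

```r
# Sketch of a dataset entry in R/data.R; all wording is placeholder text

#' Bottle data (placeholder title)
#'
#' @description
#' Short summary shown near the top of the help page.
#'
#' @details
#' Longer background text goes here. It is rendered under a separate
#' "Details" heading rather than directly below the description.
#'
#' @format A data frame; see the field list for column descriptions.
"bottle"
```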