Closed timvdstap closed 2 months ago
To answer the questions:
The survey.csv will not include a column for sample_ids. Storing lists of sample_ids in a single cell would make data entry more challenging than it needs to be. Instead, the Sampling Event Number will also be recorded in the derived data sheets, which will help nest sample_ids.
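As a rough illustration of how a shared Sampling Event Number could nest sample_ids without storing them in survey.csv, here is a minimal sketch. The column names (`sampling_event_number`, `sample_id`, `site`, `survey_date`), the example values, and the helper `nest_sample_ids` are all assumptions made up for this example, not the actual template.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sheet contents; column names and values are illustrative only.
SURVEY_CSV = """sampling_event_number,site,survey_date
1,site_a,2021-06-01
2,site_b,2021-06-02
"""

NUTRIENTS_CSV = """sampling_event_number,sample_id
1,NUT001
1,NUT002
2,NUT003
"""


def nest_sample_ids(survey_text, derived_text):
    """Attach the sample_ids from a derived data sheet to each survey row."""
    # Collect every sample_id recorded under each sampling event number.
    ids_by_event = defaultdict(list)
    for row in csv.DictReader(io.StringIO(derived_text)):
        ids_by_event[row["sampling_event_number"]].append(row["sample_id"])

    # Copy each survey row and add the matching list (empty if none recorded).
    nested = []
    for row in csv.DictReader(io.StringIO(survey_text)):
        event = row["sampling_event_number"]
        nested.append({**row, "sample_ids": ids_by_event.get(event, [])})
    return nested


nested = nest_sample_ids(SURVEY_CSV, NUTRIENTS_CSV)
for survey in nested:
    print(survey["sampling_event_number"], survey["site"], survey["sample_ids"])
```

With this layout, each sample_id is entered once, on its own row in the derived sheet, and the lists per survey are built at analysis time rather than typed into a single cell.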
Documenting the proposed data integration here for discussion.
From my understanding, there is a survey-template.csv file that will be populated with survey-specific metadata, including IDs for the nutrient and chlorophyll data derived through lab analyses. Whether these IDs will be populated as a string in a single column (sample_ids) or parsed out into separate columns (e.g., nutrient_sample_id and chl_sample_id) is yet to be decided. The data collected by the miniDOT and ctd-diver, however, is continuous, and if I'm not mistaken, each timestamp will not have a unique ID. This data is connected to the survey-metadata and the associated lab-derived data by proximity of data collection.
In practical terms, this means that if the sample_time_collection associated with the nutrient_sample_id and chl_sample_id overlaps with, or is in close proximity to (what does this mean specifically: within 5 minutes?), the time interval recorded by the ctd-diver or miniDOT, we can reliably say that they are part of the same survey.
The questions this raises for me are: