USEPA / Phytoplankton-Data-Analysis

Phytoplankton Data Analysis

File question: EFR_NOT_IMPORTED_CHEMICAL_2003-2012_2011_2.xlsx #22

Open mjpdenver opened 10 years ago

mjpdenver commented 10 years ago
  1. Does QC data need to be imported from the first sheet?
  2. On sheet 20102011FIELD_DATA_NOT_IMPORTED, can I have some clarification on how to construct the 24-character sample ID? Specifically, it seems STATION INFO contains date information. Is it correct to infer the station name by extracting the first 4 characters of the first column?
  3. Should the file NonConformativeName* be imported? If so, is the date inferred from the last 8 characters of column B? Starting at line 682, should duplicates be read in?
jbeaulie commented 10 years ago

In Drew/j

  1. We can omit QC data.
  2. The first four characters in the first column indicate which district and lake the sample came from. These will be the first four characters of the 24 character sample id. I believe the remaining characters represent the sampling station, but they do not conform to the standard set of station IDs provided by USACE. Three of the names can be converted to the standard IDs according to the table below, but I will need to seek guidance on the others.

[image: table mapping three nonstandard station names to standard USACE station IDs]

  3. No. The worksheet contains chemical concentrations in sediments. We will only be using water-column chemistry in the analysis.
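
The prefix rule in the second answer can be sketched roughly as follows. This is a minimal illustration, not the project's import code: the function name `split_station_info` and the sample value are hypothetical, and the layout of the remaining characters of the sample ID is not specified here, so only the four-character split is shown.

```python
from typing import Tuple


def split_station_info(value: str) -> Tuple[str, str]:
    """Split a first-column value into (district/lake prefix, remainder).

    The first four characters identify the district and lake; the remainder
    is assumed to be the reported sampling station.
    """
    text = value.strip()
    if len(text) < 4:
        raise ValueError(f"expected at least 4 characters, got {text!r}")
    return text[:4], text[4:].strip()


# Made-up value, for illustration only.
prefix, station = split_station_info("ABCDstation-7")
```

The remainder is returned untouched so that any conversion to standard station IDs can happen in a separate step.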
mjpdenver commented 10 years ago

As discussed, we will allow more than 5 characters in the station field. This will also result in sample IDs longer than 24 characters. Creating an artificial 5-character station ID would require that we maintain another index to retain the original string. The advantage of using the originally reported name is that it potentially has meaning and is familiar to the people who create the files.

The potential pitfall here is that users should not re-parse the original sample ID, or should do so only with caution.

QC samples mentioned above will not be read in and the site name corrections will be made.
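
A rough sketch of that plan, under stated assumptions: the correction table entry, the `is_qc` flag, and the `station_reported` field are all hypothetical names chosen for illustration, and the real files may mark QC rows differently. Stations without a correction keep their reported name at whatever length it has.

```python
from typing import Dict, Iterable, List

# Hypothetical correction table; the actual name mappings are not reproduced here.
SITE_NAME_CORRECTIONS: Dict[str, str] = {
    "example old name": "EX001",
}


def resolve_station(reported: str, corrections: Dict[str, str]) -> str:
    """Return the corrected station name, or the reported name unchanged."""
    name = reported.strip()
    return corrections.get(name, name)


def import_samples(rows: Iterable[dict], corrections: Dict[str, str]) -> List[dict]:
    """Drop QC rows and attach a resolved station name to each remaining row."""
    imported = []
    for row in rows:
        if row.get("is_qc"):  # assumed flag, not taken from the spreadsheets
            continue
        station = resolve_station(row["station_reported"], corrections)
        # Keep the reported string alongside the resolved one so nothing is lost.
        imported.append({**row, "station": station})
    return imported
```

Because the reported string stays on each row, no separate index is needed to recover it later.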

jbeaulie commented 10 years ago

We received "Corrected Site Names.xlsx" from Jade's staff on June 10, 2014. The spreadsheet indicates the correct site names.

Should we correct the site names at this point in the workflow?