The origin of this dataset is unknown. The data cleaning scripts reference http://gcmd.nasa.gov/records/GCMD_cmar_wh.html, which describes the CSIRO Marine Data Warehouse. A dataset called "csiro_warehouse" is available on OBIS with 106,513 records (its metadata refers to ~106,000 records).
However, the raw dataset in our repo (dataset_99.csv) has only 43,933 records and very different fields. Its extremely verbose SampleID field does reference the Marine Data Warehouse, with values like "Courageous_survey_Cour031(1978)_station_no_10_extracted_from_CMAR_Data_Warehouse_on_12_Oct_2005_42.5_41_40_45". The raw dataset in our repo may therefore be a subset of the full dataset, acquired from some other source.
Since I can't track down the original source of our current dataset_99.csv, we should probably switch to the "csiro_warehouse" version on OBIS and rewrite the data cleaning script to match.
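Before switching, it may be worth checking which surveys the current file actually covers, since that would help confirm whether it is a subset of the OBIS version. The sketch below splits a SampleID into its parts and counts records per survey. The regular expression is an assumption inferred from the single example value above, so other rows may not follow it; the function names are hypothetical, and the meaning of the trailing numbers is unknown, so they are kept as a raw string.

```python
import csv
import re
from collections import Counter

# Pattern inferred from ONE example SampleID; treat it as an assumption.
SAMPLE_ID_PATTERN = re.compile(
    r"^(?P<vessel>.+?)_survey_(?P<survey>[^()]+)\((?P<year>\d{4})\)"
    r"_station_no_(?P<station>[^_]+)"
    r"_extracted_from_CMAR_Data_Warehouse_on_"
    r"(?P<extracted>\d{1,2}_[A-Za-z]{3}_\d{4})"
    r"(?:_(?P<trailing>.*))?$"
)


def parse_sample_id(sample_id):
    """Return the named parts of a SampleID, or None if it does not match."""
    match = SAMPLE_ID_PATTERN.match(sample_id)
    return match.groupdict() if match else None


def summarize(sample_ids):
    """Count records per (survey, year) and tally IDs that fail to parse."""
    surveys = Counter()
    unparsed = 0
    for sample_id in sample_ids:
        parsed = parse_sample_id(sample_id)
        if parsed is None:
            unparsed += 1
        else:
            surveys[(parsed["survey"], parsed["year"])] += 1
    return surveys, unparsed


def load_sample_ids(path):
    """Yield the SampleID column of a CSV file (column name as in our repo)."""
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            yield row["SampleID"]


example = (
    "Courageous_survey_Cour031(1978)_station_no_10_extracted_from_"
    "CMAR_Data_Warehouse_on_12_Oct_2005_42.5_41_40_45"
)
print(parse_sample_id(example))
```

Running `summarize(load_sample_ids("dataset_99.csv"))` would give a per-survey record count plus the number of IDs that do not fit the assumed pattern. A large unparsed count would mean the pattern needs revisiting before drawing any conclusion about the subset hypothesis.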