hurlbertlab / core-transient

Data and code for NSF funded research on core vs transient species

d71, Phytoplankton of Barents, White Seas data cleanup #84

Closed: ahhurlbert closed this issue 8 years ago

ahhurlbert commented 8 years ago

Raw data were downloaded from iOBIS.org

In ~1,100 of the ~34,000 records, the 'sauthor' value is split across the 'sauthor' and 'tname' fields, pushing all subsequent values in that row one column to the right. In all of these rows, the string that ends up in the 'tname' field seems to end with empty double quotes.
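A minimal sketch of how the shifted rows might be detected and repaired, written in Python purely for illustration. It assumes each record has already been parsed into a list of strings, that affected rows carry exactly one extra field, that 'tname' immediately follows 'sauthor' in the header, and that the original split happened at a comma inside the author string. None of these assumptions has been checked against the raw iOBIS file, and `repair_row` is a hypothetical helper name.

```python
def repair_row(row, header, sauthor_col="sauthor", tname_col="tname", joiner=","):
    """Return a repaired copy of `row` if it shows the shift pattern,
    otherwise return an unchanged copy.

    Assumptions (not verified against the raw data):
      * a shifted row has exactly one more field than the header
      * the spilled fragment sits in the 'tname' position and ends with ""
      * the author string was originally split at `joiner` (a comma)
    """
    s = header.index(sauthor_col)
    t = header.index(tname_col)
    if t != s + 1:
        raise ValueError("sketch assumes 'tname' directly follows 'sauthor'")

    spilled = row[t] if len(row) > t else ""
    is_shifted = len(row) == len(header) + 1 and spilled.endswith('""')
    if not is_shifted:
        return list(row)

    # Rejoin the two halves of the author string and drop the trailing "".
    merged = row[s] + joiner + spilled[:-2]
    # Everything after the spilled fragment moves one column back to the left.
    return row[:s] + [merged] + row[t + 1:]


header = ["id", "sauthor", "tname", "latitude"]
good = ["1", "Cleve", "Chaetoceros", "70.1"]
bad = ["2", "(Ehrenberg) Cleve", ' 1873""', "Chaetoceros", "70.1"]

fixed_good = repair_row(good, header)
fixed_bad = repair_row(bad, header)
```

Rows that do not match both conditions are passed through untouched, so the ~33,000 unaffected records would not be altered.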

I will write a preformatting script to fix this, as well as to create a 'site' field from the concatenation of locality and lat_long info.
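A rough sketch of the 'site' concatenation, again in Python for illustration only. The column names `locality`, `latitude`, and `longitude`, the underscore separator, and the helper name `add_site` are all assumptions, not taken from the actual dataset or script.

```python
def add_site(record, sep="_"):
    """Return a copy of `record` (a dict) with a 'site' field built from
    the locality name plus the lat-long values.

    The keys 'locality', 'latitude' and 'longitude' are hypothetical
    column names chosen for this sketch.
    """
    parts = [
        str(record["locality"]).strip(),
        str(record["latitude"]),
        str(record["longitude"]),
    ]
    out = dict(record)
    out["site"] = sep.join(parts)
    return out


example = {"locality": " White Sea ", "latitude": "65.5", "longitude": "38.0"}
with_site = add_site(example)
```

Keeping the coordinates as strings avoids any change in their printed precision when they are folded into the site label.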

We should also notify iOBIS and the dataset authors of the formatting error.

ahhurlbert commented 8 years ago

Issue is now described on Ecological Data Wiki (http://ecologicaldata.org/wiki/phytoplankton-white-sea-barents-sea-norwegian-sea-and-arctic-basin-1993-2003).

We won't be using this dataset for the core-transient project, since its locations are defined purely by lat-longs and do not correspond to fixed sites that were resampled over time.