EDIorg / ecocomDP

A dataset design pattern and R package for ecological community data.
https://ediorg.github.io/ecocomDP/

primary keys, unique at some level #11

Closed mobb closed 6 years ago

mobb commented 7 years ago

We need to make sure primary keys stay unique when many datasets are combined for an analysis.

mobb commented 7 years ago

Options:

  1. Code can prefix each key with a package_id.
  2. Recommend best practices, e.g., see the example ids.

mobb commented 7 years ago

related to #9

clnsmth commented 7 years ago

Unique primary keys can be created during the aggregation step. At this point each ecocomDP will have a data package identifier that can be concatenated with an abbreviated primary key name and primary key value (this is essentially option 1 above).

An example of this is knb-lter-hrf.118.28_obs_1107, where knb-lter-hrf.118.28 is the data package identifier, obs denotes observation_id, and 1107 is the observation_id value within that package.
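The concatenation described above could be sketched as follows. This is only an illustration in Python, not code from the ecocomDP R package: the function name `make_global_key` and the second package identifier are hypothetical, and the underscore separator is taken from the example key.

```python
def make_global_key(package_id: str, key_abbreviation: str, local_value) -> str:
    """Join package id, key abbreviation, and local key value with underscores."""
    return f"{package_id}_{key_abbreviation}_{local_value}"


# The example from this thread.
key_a = make_global_key("knb-lter-hrf.118.28", "obs", 1107)

# A hypothetical second package that happens to reuse 1107 as a local id.
key_b = make_global_key("knb-lter-xyz.1.1", "obs", 1107)

print(key_a)
print(key_a != key_b)
```

Because the package identifier is part of the key, two datasets that reuse the same local id still produce different keys once aggregated.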

mobb commented 7 years ago

Colin, I will add "creating a PK function" to our list of functions (creation functions, not aggregation). That way there can be some consistency between tables.

clnsmth commented 6 years ago

validate_primary_keys checks for unique primary keys within an L1, and has been added to the battery of validation tests in validate_ecocomDP. Globally unique primary keys will have to be created for L1 aggregation within the aggregation function (not yet written).
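For readers unfamiliar with this kind of check, a within-table uniqueness test might look roughly like the sketch below. It is a Python illustration only and does not reproduce how validate_primary_keys is implemented; `find_duplicate_keys` and the sample rows are hypothetical.

```python
from collections import Counter


def find_duplicate_keys(rows, key_column):
    """Return the sorted key values that appear more than once in key_column."""
    counts = Counter(row[key_column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical observation table with one repeated observation_id.
observation = [
    {"observation_id": "obs_1", "value": 3},
    {"observation_id": "obs_2", "value": 5},
    {"observation_id": "obs_2", "value": 8},
]

duplicates = find_duplicate_keys(observation, "observation_id")
print(duplicates)
```

An empty result means the key column is unique within that table. It says nothing about uniqueness across tables from different data packages, which is the aggregation problem discussed above.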