EDIorg / ecocomDP

A dataset design pattern and R package for ecological community data.
https://ediorg.github.io/ecocomDP/

get feedback on most useful wide format for scientists #95

Closed mobb closed 3 years ago

mobb commented 4 years ago

Mostly affects the functions in manipulate_tables.R. These join the required tables (obs, loc, taxon), un-nest the loc table, and add lat and lon to each row (from the most detailed loc available), so that each row includes, at minimum: datetime, taxon, site-name, lat, lon, and any variables.
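
To make the target row shape concrete, here is a minimal, language-neutral sketch of the join in plain Python. It is not the code in manipulate_tables.R. The table contents, the column names, the helper names (`resolve_location`, `join_tables`), and the reading of "most detailed loc available" as "the deepest level in the location hierarchy that has coordinates" are all assumptions made for illustration.

```python
# Illustrative data only. Column names are assumed, not taken from the package.
location = {
    "site1": {"location_name": "Site 1", "latitude": 45.0, "longitude": -90.0,
              "parent_location_id": None},
    "plot1": {"location_name": "Plot 1", "latitude": None, "longitude": None,
              "parent_location_id": "site1"},
    "plot2": {"location_name": "Plot 2", "latitude": 45.1, "longitude": -90.1,
              "parent_location_id": "site1"},
}
taxon = {
    "t1": {"taxon_name": "Quercus alba"},
    "t2": {"taxon_name": "Acer rubrum"},
}
observation = [
    {"observation_id": "o1", "location_id": "plot1", "datetime": "2019-06-01",
     "taxon_id": "t1", "variable_name": "count", "value": 3},
    {"observation_id": "o2", "location_id": "plot2", "datetime": "2019-06-01",
     "taxon_id": "t2", "variable_name": "count", "value": 5},
]


def resolve_location(location_id):
    """Un-nest one location: walk from the given level up to the root.

    Returns (site_name, lat, lon), where site_name joins the names from the
    outermost to the innermost level and the coordinates come from the
    deepest level that has them (assumed meaning of "most detailed").
    """
    names = []
    lat = lon = None
    current = location_id
    while current is not None:
        loc = location[current]
        names.append(loc["location_name"])
        if lat is None and loc["latitude"] is not None:
            lat, lon = loc["latitude"], loc["longitude"]
        current = loc["parent_location_id"]
    return "/".join(reversed(names)), lat, lon


def join_tables(obs_rows):
    """Join obs, loc, and taxon so each row carries the minimum columns."""
    rows = []
    for obs in obs_rows:
        site_name, lat, lon = resolve_location(obs["location_id"])
        rows.append({
            "datetime": obs["datetime"],
            "taxon": taxon[obs["taxon_id"]]["taxon_name"],
            "site_name": site_name,
            "lat": lat,
            "lon": lon,
            "variable_name": obs["variable_name"],
            "value": obs["value"],
        })
    return rows


joined = join_tables(observation)
```

In this sketch, Plot 1 has no coordinates of its own, so its row inherits lat and lon from Site 1, while Plot 2 keeps its own.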

Actions still to decide

  1. Leave in L1 identifiers? There are four: observation_id, event_id, site_id, taxon_id. We added the observation and event IDs, but taxon_id (and possibly site_id) come from the L0 file, and we add them if none were present.
  2. Pivot values? Scientists generally seem more comfortable with wide tables than with long ones. We may not need the wide table for GBIF (what we send to them is still TBD).
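
For discussion, a minimal sketch of what pivoting values could look like, in plain Python rather than the package's R code. The rows, column names, `KEY_COLUMNS`, and `pivot_wide` are hypothetical. Dropping `observation_id` here is only one possible answer to item 1, not a decision.

```python
# Hypothetical long-format rows: one row per measured variable.
long_rows = [
    {"observation_id": "o1", "datetime": "2019-06-01", "site_name": "Plot 1",
     "taxon": "Quercus alba", "variable_name": "count", "value": 3},
    {"observation_id": "o2", "datetime": "2019-06-01", "site_name": "Plot 1",
     "taxon": "Quercus alba", "variable_name": "cover", "value": 0.4},
    {"observation_id": "o3", "datetime": "2019-06-01", "site_name": "Plot 1",
     "taxon": "Acer rubrum", "variable_name": "count", "value": 5},
]

# Columns that identify one wide row (an assumption for this sketch).
KEY_COLUMNS = ("datetime", "site_name", "taxon")


def pivot_wide(rows, key_columns=KEY_COLUMNS):
    """Collapse long rows into one row per key, one column per variable.

    Per-measurement identifiers such as observation_id are not carried over,
    because several long rows merge into a single wide row.
    """
    wide = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        record = wide.setdefault(key, dict(zip(key_columns, key)))
        record[row["variable_name"]] = row["value"]
    return list(wide.values())


wide_rows = pivot_wide(long_rows)
```

The three long rows become two wide rows: the Quercus alba row has both a `count` and a `cover` column, and the Acer rubrum row has only `count`.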

clnsmth commented 3 years ago

@mobb, users can get a fully joined and "wide" table via flatten_data().