EDIorg / ecocomDP

A dataset design pattern and R package for ecological community data.
https://ediorg.github.io/ecocomDP/

Restructure the `read_data()` result #125

Closed clnsmth closed 2 years ago

clnsmth commented 2 years ago

User feedback suggests that restructuring the dataset object returned by read_data() would be helpful. Currently, accessing any element requires verbose indexing (i.e. dataset[[i]]$...), and the result doesn't work well in common iteration contexts (e.g. lapply()). A new structure can be implemented in a backwards compatible way through a controlling format argument, which defaults to the new version but can return the old one if directed.

The current structure:

A proposed structure:

Comments/suggestions are welcome. Thanks!

clnsmth commented 2 years ago

Test drive the proposed changes from the "dataset_format" branch.

Control return structure with the format param.

> old <- read_data("edi.193.5", format = "old")
> str(old, max.level = 2)
List of 1
 $ edi.193.5:List of 3
  ..$ metadata         :List of 1
  ..$ tables           :List of 8
  ..$ validation_issues: list()

> new <- read_data("edi.193.5")
> str(new, max.level = 1)
List of 4
 $ metadata         :List of 1
 $ tables           :List of 8
 $ validation_issues: list()
 $ id               : chr "edi.193.5"
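
The difference in access patterns can be sketched with plain lists that mimic the two shapes above, without downloading anything. This is only an illustration: the table and metadata contents are placeholders, and to_new() is a hypothetical helper, not an ecocomDP function.

```r
# Mock object mirroring the old shape printed above (no download involved).
# The metadata and table contents are placeholders, not real ecocomDP data.
old <- list(
  edi.193.5 = list(
    metadata = list(url = "placeholder"),
    tables = list(observation = data.frame(value = 1:3)),
    validation_issues = list()
  )
)

# Hypothetical helper: lift the single dataset out of its id-named wrapper
# and keep the id as a regular element.
to_new <- function(dataset) {
  stopifnot(length(dataset) == 1, !is.null(names(dataset)))
  c(dataset[[1]], list(id = names(dataset)))
}

new <- to_new(old)

# Old structure: the id (or a positional index) is needed on every access.
nrow(old[["edi.193.5"]]$tables$observation)

# New structure: elements are reachable directly, and the id is still available.
nrow(new$tables$observation)
new$id
```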

An iteration context:

> ids <- c("edi.193.5", "edi.303.2")

> old <- lapply(ids, read_data, format = "old")
> str(old, max.level = 3)
List of 2
 $ :List of 1
  ..$ edi.193.5:List of 3
  .. ..$ metadata         :List of 1
  .. ..$ tables           :List of 8
  .. ..$ validation_issues: list()
 $ :List of 1
  ..$ edi.303.2:List of 3
  .. ..$ metadata         :List of 1
  .. ..$ tables           :List of 6
  .. ..$ validation_issues: list()

> new <- lapply(ids, read_data)
> str(new, max.level = 2)
List of 2
 $ :List of 4
  ..$ metadata         :List of 1
  ..$ tables           :List of 8
  ..$ validation_issues: list()
  ..$ id               : chr "edi.193.5"
 $ :List of 4
  ..$ metadata         :List of 1
  ..$ tables           :List of 6
  ..$ validation_issues: list()
  ..$ id               : chr "edi.303.2"
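
To make the iteration point concrete, here is a small sketch that counts the tables per dataset under each structure. It uses mock lists in place of real read_data() results (make_new() is a hypothetical stand-in, and the table contents are empty placeholders), so only the shapes match the output above.

```r
# Hypothetical constructor for a mock dataset in the new (flat) shape.
make_new <- function(id, n_tables) {
  list(
    metadata = list(),
    tables = replicate(n_tables, data.frame(), simplify = FALSE),
    validation_issues = list(),
    id = id
  )
}

# Stand-ins for lapply(ids, read_data) and lapply(ids, read_data, format = "old").
new <- list(make_new("edi.193.5", 8), make_new("edi.303.2", 6))
old <- lapply(new, function(d) setNames(list(d[names(d) != "id"]), d$id))

# New structure: each element exposes its fields directly.
n_new <- vapply(new, function(d) length(d$tables), integer(1))
names(n_new) <- vapply(new, function(d) d$id, character(1))

# Old structure: every access needs an extra [[1]] to get past the id wrapper,
# and the id has to be recovered from the wrapper's name.
n_old <- vapply(old, function(d) length(d[[1]]$tables), integer(1))
names(n_old) <- vapply(old, names, character(1))

n_new
n_old
```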

clnsmth commented 2 years ago

The param should probably be named structure, not format, and the allowed set of values could be the package version that each format was released in?

sokole commented 2 years ago

This looks good to me. Either structure or format makes sense as the argument name.

sokole commented 2 years ago

@clnsmth FYI, the updates in PR #120 to flatten_data and the plotting functions should be compatible with both the new format proposed here and the old format.
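
One way a downstream function could accept both structures is to normalize its input first. The sketch below is an assumption about how that might look, not the code in PR #120: as_flat_dataset() is a hypothetical helper, and the mock objects only reproduce the shapes discussed in this issue.

```r
# Hypothetical helper, not part of ecocomDP: return the dataset in the new
# (flat) shape regardless of which structure was passed in.
as_flat_dataset <- function(dataset) {
  fields <- c("metadata", "tables", "validation_issues")
  if (all(fields %in% names(dataset))) {
    return(dataset)  # already the new structure
  }
  # Assumed old structure: a length-1 list named by the dataset id.
  if (length(dataset) == 1 && all(fields %in% names(dataset[[1]]))) {
    return(c(dataset[[1]], list(id = names(dataset))))
  }
  stop("Unrecognized dataset structure")
}

# Mock inputs in both shapes (placeholder contents).
old <- list(
  edi.193.5 = list(
    metadata = list(),
    tables = list(observation = data.frame(value = 1:3)),
    validation_issues = list()
  )
)
new <- c(old[[1]], list(id = "edi.193.5"))

# Both inputs normalize to the same object.
identical(as_flat_dataset(old), as_flat_dataset(new))
```

Normalizing once at the top of a function keeps the rest of its body free of structure-specific indexing.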

clnsmth commented 2 years ago

Thanks for the notification @sokole.

clnsmth commented 2 years ago

This feature was released in v1.2.0.