TIBHannover / BacDiveR

Unofficial R client for the DSMZ's Bacterial Diversity Metadatabase (former contact: @katrinleinweber). https://api.bacdive.dsmz.de/client_examples seems to be the official alternative.
https://TIBHannover.GitHub.io/BacDiveR/
MIT License
aggregate datasets into useful structure before returning #31

Closed (katrinleinweber closed 6 years ago)

katrinleinweber commented 6 years ago

noticed while working on #16

retrieve_data() currently appends multiple downloads into one continuous list in which the individual datasets can no longer be addressed. We need a data structure that lets the user $-address the datasets and their fields. Ideally, each dataset is referred to by index = bacdive_id. Something like a sparse list-of-lists?!?
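A minimal sketch of what such a structure could look like, using base R only. The `downloads` object, its field values and the `aggregated` name are hypothetical stand-ins, not what retrieve_data() actually returns. Naming the list elements by ID (as strings) instead of using the ID as a numeric position avoids the "sparse" problem, because assigning to position 1847 of a list would pad it with empty slots up to that index.

```r
# Hypothetical stand-ins for two downloaded datasets; the field values are
# invented for illustration and are not real BacDive records.
downloads <- list(
  list(bacdive_id = 1095, taxonomy_name = list(species = "Example species A")),
  list(bacdive_id = 1847, taxonomy_name = list(species = "Example species B"))
)

# Name each dataset by its ID (as a character string), so the list only
# holds the datasets that were actually downloaded.
ids <- vapply(downloads, function(d) as.character(d$bacdive_id), character(1))
aggregated <- stats::setNames(downloads, ids)

# Datasets and their fields are now $-addressable (backticks are needed
# because the names start with a digit), or addressable via [[ ]].
aggregated$`1095`$taxonomy_name$species
aggregated[["1847"]]$taxonomy_name$species
```

The trade-off is that numeric-looking names have to be quoted or backticked when used with `$`.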

ideas:

katrinleinweber commented 6 years ago

jsonlite::fromJSON(…, flatten = TRUE) and simplifyDataFrame = TRUE both still return a list of nested lists with DFs as "leaves". I still need to work out how to extract a field/element (say, culture_growth_condition$culture_temp$temp) from a combination of these lists-of-lists :-/
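One possible way to pull the same field out of several such datasets, sketched in base R. It assumes the parsed result has the shape described above (nested lists with data frames as leaves); the `datasets` object and its temperature values are invented for illustration, and only the field path is taken from this comment.

```r
# Hypothetical parsed datasets: nested lists with data frames as leaves.
# The values are invented; only the field path follows the comment above.
datasets <- list(
  `1095` = list(culture_growth_condition = list(
    culture_temp = data.frame(temp = c("28", "37"), stringsAsFactors = FALSE)
  )),
  `1847` = list(culture_growth_condition = list(
    culture_temp = data.frame(temp = "30", stringsAsFactors = FALSE)
  ))
)

# Walk the same path in every dataset; a missing branch yields NULL
# instead of an error, because `$` on NULL returns NULL.
temps <- lapply(datasets, function(d) d$culture_growth_condition$culture_temp$temp)

temps$`1095`                      # values for one dataset
unlist(temps, use.names = FALSE)  # all values in one vector
```

Keeping the result as a list named by dataset preserves which value came from which dataset; unlist() is only needed when a flat vector is wanted.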

[screen shot, 2018-03-12 at 16:09:58]

katrinleinweber commented 6 years ago

@sckott: Hello, and thanks for your advice! I got over this data structure problem :-)

katrinleinweber commented 6 years ago

For comparison with the above screen shot, there is now an additional level in the structure:

a) data above / Bac_hal_data in this example,
b) new: a list-of-lists for each dataset, named by its numeric BacDive ID (1095 & 1847),
c) the lists (taxonomy_name, morphology_physiology, …, environment_sampli…, etc.) within the datasets.

[screen shot, 2018-04-18 at 16:44:09]