How to download the full database

TIBHannover / BacDiveR

Unofficial R client for the DSMZ's Bacterial Diversity Metadatabase (former contact: @katrinleinweber). https://api.bacdive.dsmz.de/client_examples seems to be the official alternative.

https://TIBHannover.GitHub.io/BacDiveR/

MIT License

10 stars, 12 forks

How to download the full database #114

Closed: zji90 closed this issue 3 years ago

zji90 commented 3 years ago

Hi,

I wonder whether there is a way to download the full BacDive database without doing the search? Thanks!

katrinleinweber commented 3 years ago

Hi @zji90 👋 It's probably best to contact the official BacDive team about that.

From my outside observation (now almost 2 years old), I think there are two ways in which BacDiveR can help you unofficially:

1. Iterating up from bd_retrieve(id = 1) to the highest ID (currently 82892), with a reasonable wait time between requests (half a second?) and some client-side catching/skipping of errors (some IDs were nonexistent).
2. Mass-downloading datasets with a few very general search queries, like Domain contains Bacteria and Domain contains Archaea.
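
For readers who prefer Python, the first option (walking the ID range with a pause and skipping missing IDs) might look roughly like the sketch below. This is not BacDiveR or official BacDive client code: `retrieve_id_range` and `fetch_record` are hypothetical names, and `fetch_record` stands in for whatever single-ID lookup your client offers (in BacDiveR, that would be bd_retrieve(id = ...)). It also assumes that a nonexistent ID surfaces as an exception, which may not hold for every client.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def retrieve_id_range(
    fetch_record: Callable[[int], Any],
    first_id: int,
    last_id: int,
    wait_seconds: float = 0.5,
) -> Tuple[Dict[int, Any], List[int]]:
    """Fetch every ID in [first_id, last_id], pausing between requests.

    `fetch_record` is a hypothetical stand-in for a client's single-ID
    lookup; it is assumed to raise an exception for nonexistent IDs.
    """
    records: Dict[int, Any] = {}
    skipped: List[int] = []
    for record_id in range(first_id, last_id + 1):
        try:
            records[record_id] = fetch_record(record_id)
        except Exception:
            # Some IDs do not exist; remember them and carry on.
            skipped.append(record_id)
        # Be polite to the server between requests.
        time.sleep(wait_seconds)
    return records, skipped


# Offline demonstration with a fake lookup in which every third ID is missing.
def fake_fetch(record_id: int) -> dict:
    if record_id % 3 == 0:
        raise LookupError(f"no entry with ID {record_id}")
    return {"id": record_id}


records, skipped = retrieve_id_range(fake_fetch, 1, 7, wait_seconds=0.0)
```

With a half-second pause, the waiting alone for 82892 IDs adds up to about 11.5 hours, so it is probably worth writing results to disk as you go rather than keeping everything in memory until the end.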

JackLMc commented 3 years ago

For others that found this... This is the reply I received from the BacDive team:

"BacDive is a complex database with over 1000 data fields. Therefore there is no easy way to download the whole database (e.g. as a plain spreadsheet). We recommend using the API for accessing data for many strains at once. Please have a look. Access is free after registration and we do provide clients for Python and R.

If you have further questions, don't hesitate to ask.

Best regards,"

katrinleinweber commented 3 years ago

I wonder why they never contributed to this client, despite seeking cooperation projects with TIB. Case closed then, also because of #118.