A few small bug fixes and some additional helper functions to retrieve information from the NCBI:
lookup_biosamples leverages the rentrez package to retrieve sample annotations from the NCBI Biosample database. It's fast, but in its current implementation adds a number of dependencies, e.g. rentrez and xml2, as well as a few tidyverse packages. Some of the latter could be trimmed (e.g. readr).
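For illustration only, and not the code in this PR: a minimal sketch of the rentrez-plus-xml2 approach. The helper names `fetch_biosample_xml` and `parse_biosample_xml` are hypothetical. The sketch assumes that BioSample records arrive as `BioSample` elements carrying an `accession` attribute, with `Attribute` children that have an `attribute_name`, and that the ids passed to `entrez_fetch` are Entrez UIDs rather than SAMN accessions.

```r
# Hypothetical helper: fetch raw BioSample XML for a vector of Entrez UIDs.
# Needs network access; rettype/retmode follow NCBI's EFetch table for
# the biosample database (an assumption worth double-checking).
fetch_biosample_xml <- function(ids) {
  rentrez::entrez_fetch(db = "biosample", id = ids,
                        rettype = "full", retmode = "xml")
}

# Hypothetical helper: parse BioSample XML into a long-format data.frame
# with one row per sample attribute. Returns NULL if no samples are found.
parse_biosample_xml <- function(xml_text) {
  doc <- xml2::read_xml(xml_text)
  samples <- xml2::xml_find_all(doc, "//BioSample")
  rows <- lapply(samples, function(s) {
    attrs <- xml2::xml_find_all(s, "./Attributes/Attribute")
    data.frame(
      accession = rep(xml2::xml_attr(s, "accession"), length(attrs)),
      attribute = xml2::xml_attr(attrs, "attribute_name"),
      value = xml2::xml_text(attrs),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}
```

Only rentrez and xml2 are needed here, which suggests the tidyverse dependencies could be dropped without much effort.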
lookup_gse and its unexported backend query_geo directly access NCBI GEO's REST API to retrieve Series-level information, e.g. the sample identifiers associated with a series. (These could then be passed on to lookup_biosamples.)
GEO's REST API (and hence query_geo) can also retrieve sample-level information, but it processes only one sample at a time. That's why I chose the rentrez path for lookup_biosamples instead.
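To make the series-level lookup concrete, here is a rough base-R sketch. It is not the query_geo implementation from this PR. The helper names `geo_url` and `extract_sample_ids` are hypothetical, and the sketch assumes GEO's `acc.cgi` endpoint with its `acc`, `targ`, `view` and `form` parameters, returning a SOFT-formatted text record in which sample identifiers appear on `!Series_sample_id = ...` lines.

```r
# Hypothetical helper: build a GEO accession-display URL for one accession.
geo_url <- function(acc, targ = "self", view = "brief", form = "text") {
  sprintf(
    "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=%s&targ=%s&view=%s&form=%s",
    acc, targ, view, form
  )
}

# Hypothetical helper: pull the sample identifiers out of the lines of a
# SOFT-formatted series record.
extract_sample_ids <- function(soft_lines) {
  hits <- grep("^!Series_sample_id\\s*=", soft_lines, value = TRUE)
  trimws(sub("^!Series_sample_id\\s*=\\s*", "", hits))
}

# Not run: needs network access. "GSE12345" is a placeholder accession.
# soft <- readLines(geo_url("GSE12345"))
# gsm_ids <- extract_sample_ids(soft)
```

Each request covers a single accession, so fetching sample-level records this way means one HTTP call per sample.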
I haven't updated the DESCRIPTION or NAMESPACE files, nor written or run any tests. Take a look and then decide if these functions are useful and should find a place in this package. Alternatively, we could keep them in a separate package. (I have a few more tricks like this up my sleeve for querying EBI's SRA database that could find their place in a separate package.)
Best, Thomas