sanger-tol / blobtoolkit

Nextflow pipeline for BlobToolKit for Sanger ToL production suite
https://pipelines.tol.sanger.ac.uk/blobtoolkit
MIT License

Create entrez-direct nf-core module #1

Closed: alxndrdiaz closed this 2 years ago

alxndrdiaz commented 2 years ago

The entrez-direct nf-core module is a dependency of BlobToolKitPipeline. "Entrez Direct (EDirect) provides access to the NCBI's suite of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a Unix terminal window. Search terms are entered as command-line arguments. Individual operations are connected with Unix pipes to construct multi-step queries. Selected records can then be retrieved in a variety of formats."

alxndrdiaz commented 2 years ago

Three entrez-direct utilities are required: esearch, esummary, and xtract (see src/btk_pipeline/generate_config.py in the blobtoolkit GitHub repo). This means a separate module should be created for each utility: entrezdirect/esearch, etc.
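
For context, here is a rough sketch of how the three utilities chain together through pipes, which is why each one maps naturally onto its own module. It is not copied from generate_config.py: the helper names (`build_edirect_pipeline`, `as_shell`, `run_pipeline`) are hypothetical, and the `taxonomy` database, the `DocumentSummary` pattern, and the `TaxId`/`ScientificName` fields are assumptions chosen for illustration.

```python
import shlex
import subprocess


def build_edirect_pipeline(db, query, pattern, elements):
    """Return the three EDirect commands as argument lists, in pipe order."""
    return [
        ["esearch", "-db", db, "-query", query],                  # search the database
        ["esummary"],                                             # fetch document summaries
        ["xtract", "-pattern", pattern, "-element", *elements],   # pull fields out of the XML
    ]


def as_shell(commands):
    """Render the command list as a single shell pipeline string."""
    return " | ".join(shlex.join(cmd) for cmd in commands)


def run_pipeline(commands):
    """Run the commands connected by pipes. Requires EDirect on PATH."""
    procs = []
    prev_stdout = None
    for cmd in commands:
        proc = subprocess.Popen(cmd, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close our copy so the upstream process sees SIGPIPE if needed.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output = procs[-1].communicate()[0]
    for proc in procs[:-1]:
        proc.wait()
    return output.decode()


# Assumed example query; the real pipeline may use other databases and fields.
commands = build_edirect_pipeline(
    "taxonomy", "Homo sapiens", "DocumentSummary", ["TaxId", "ScientificName"]
)
print(as_shell(commands))
```

Calling `run_pipeline(commands)` would execute the chain, but only on a machine with EDirect installed and network access to NCBI. In a Nextflow setting, each stage would be a separate process, with the output of one passed as the input of the next.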