Open mahesh-panchal opened 9 months ago
The simplest solution I could find is to use Entrez Direct efetch to search NCBI taxonomy using the taxonomy ID (which we currently get from the TAXONKIT_NAME2LINEAGE process):
efetch -db taxonomy -id ${taxid} -format xml | awk -F'[<>]' '/<MGCId>/ {print $3}' > mitocode.txt
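To check the awk step without hitting NCBI, here is a minimal sketch that runs the same field-splitting logic on a hand-written excerpt shaped like the taxonomy XML. The function name `extract_mito_code` and the sample record are illustrative assumptions, not part of the workflow. The only change from the one-liner above is that awk stops at the first match and returns a non-zero exit status when no `<MGCId>` element is present, so an empty result does not pass silently.

```shell
# Hypothetical helper (not in the workflow): read taxonomy XML on stdin and
# print the mitochondrial genetic code ID. Splitting on '<' and '>' puts the
# element text in field 3. Exits non-zero if no <MGCId> line is seen.
extract_mito_code() {
    awk -F'[<>]' '/<MGCId>/ {print $3; found = 1; exit} END {exit !found}'
}

# Hand-written excerpt for illustration only; a real efetch record has many
# more elements.
sample_xml='<TaxaSet><Taxon>
    <TaxId>9606</TaxId>
    <GeneticCode>
        <GCId>1</GCId>
    </GeneticCode>
    <MitoGeneticCode>
        <MGCId>2</MGCId>
        <MGCName>Vertebrate Mitochondrial</MGCName>
    </MitoGeneticCode>
</Taxon></TaxaSet>'

printf '%s\n' "$sample_xml" | extract_mito_code > mitocode.txt
cat mitocode.txt
```

In a module, the same function could take the efetch output on stdin in place of the sample string.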
nf-core has some of the Entrez Direct tools, but not efetch. Entrez Direct is in bioconda, so a container can be pulled from Seqera.
Want to have a shot at writing either an nf-core module or a local module for it?
For an nf-core module, make a fork of nf-core modules and create a branch:
git checkout -b entrezdirect_efetch
Then make a new module:
nf-core modules create entrezdirect/efetch
This route means you'll also have to write an nf-test.
For a local module, make a fork of the EBP pilot workflow and create a branch:
git checkout -b entrezdirect_efetch
Then make a new module:
nf-core modules create entrezdirect/efetch
Here you won't have to write an nf-test and don't need to follow nf-core guidelines.
Then make a PR back to the main workflow.
MitoHiFi, and perhaps other tools, need to be told which codon usage table to use. Can this be automated?