Open jmtsuji opened 3 months ago
Hi there, @jmtsuji!
Thanks for the kind words!
I will look into adding an option for this when I can. At the very least, there's certainly no reason they shouldn't be saved with the debug flag like you tried!
You mentioned you could download them yourself, but I'll also note that I have the same NCBI download functionality packaged with my bit package for this very purpose; it takes input assembly accessions just like GToTree.
The conda install steps are here: https://github.com/AstrobioMike/bit?tab=readme-ov-file#conda-install
and then you'd want the program bit-dl-ncbi-assemblies; passing -f protein along with the input wanted accessions would download the amino acid files if they are available, in case that's helpful to you.
Thanks for the suggestion!
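For anyone landing here later, a minimal sketch of the download step described above, written in Python so it can be scripted. Only the program name bit-dl-ncbi-assemblies and -f protein come from this thread. The -w flag for the accessions file, the file name wanted_accessions.txt, the helper build_download_command, and the two example accessions are assumptions for illustration, so please check `bit-dl-ncbi-assemblies -h` before relying on it.

```python
import shutil
import subprocess
from pathlib import Path


def build_download_command(accessions, list_path="wanted_accessions.txt"):
    """Write one accession per line to list_path and return the command as an argument list.

    This helper is hypothetical; it is not part of bit or GToTree.
    """
    Path(list_path).write_text("\n".join(accessions) + "\n")
    # "-f protein" is taken from the comment above.
    # "-w" (file of wanted accessions) is an ASSUMPTION; confirm with the program's help output.
    return ["bit-dl-ncbi-assemblies", "-w", str(list_path), "-f", "protein"]


if __name__ == "__main__":
    # Example accessions only; substitute the ones that were given to GToTree via -a.
    cmd = build_download_command(["GCF_000005845.2", "GCF_000009045.1"])
    print(" ".join(cmd))
    # Run the download only if bit is installed and on the PATH.
    if shutil.which(cmd[0]):
        subprocess.run(cmd, check=True)
```

Reusing the same accession list that was passed to GToTree keeps the downloaded amino acid files matched to the genomes in the tree.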
@AstrobioMike Thanks for the quick response! Also, good to know about bit; bit-dl-ncbi-assemblies could potentially be quite useful. All the best!
Thanks so much for your continued work on this really helpful workflow, @AstrobioMike!
I have a feature request (so low priority) regarding GToTree. Currently, if a list of NCBI assembly accession numbers is provided as input to GToTree (via -a), GToTree automatically downloads the genome for each accession, predicts amino acids when amino acid files don't already exist, and then runs the SCG search/alignment workflow. Being able to download genomes from NCBI like this is extremely helpful. However, I sometimes find myself wanting to work with the amino acid sequence files for the analyzed genomes after GToTree is finished. It seems like GToTree deletes these amino acid files (and does not save them even in the tmp directory with -d, debug mode). Might it be possible to add a flag to keep these files or to preserve them when debug mode (-d) is set?
Again, this is not urgent, because I can just download the genomes again myself if needed. Thanks so much in advance, and again, I have so appreciated this useful tool!