biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

number of genomes to download in phylophlan_get_reference #72

Open fancyge opened 3 years ago

fancyge commented 3 years ago

Hi, I have a question about the option -n in phylophlan_get_reference.

I used the command below to download reference genomes for the phylum Bacteroidetes with "-n 200". I thought this would download 200 representative genomes into input_genomes. However, I've seen 900 genomes downloaded so far and it is still downloading. Can you please give me some idea of what -n really means?

phylophlan_get_reference -g p__Bacteroidetes -o input_genomes/ -m assembly_summary_genbank.txt -n 200

Thanks very much!

fasnicar commented 3 years ago

Hi and many thanks for using PhyloPhlAn!

The -n parameter is an upper limit ("up to") that applies to each species under the taxonomic label given with the -g parameter. From the help message:

Specify how many reference genomes to download, where -1 stands for "all available"

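So, in your case, with -n 200 PhyloPhlAn will download up to 200 genomes for each species assigned to the Bacteroidetes phylum. Because the phylum contains many species, the total number of downloaded genomes can be much larger than 200.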
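To illustrate the per-species limit, here is a minimal sketch in Python. It is not PhyloPhlAn's actual code: the function name `cap_per_species`, the species labels, and the genome names are all hypothetical, and the toy data only stands in for the entries of an assembly summary file.

```python
from collections import defaultdict


def cap_per_species(records, n):
    """Keep at most n genomes per species; n == -1 keeps all of them.

    `records` is an iterable of (species_label, genome_id) pairs.
    Hypothetical helper for illustration, not part of PhyloPhlAn.
    """
    kept = defaultdict(list)
    for species, genome in records:
        if n == -1 or len(kept[species]) < n:
            kept[species].append(genome)
    return dict(kept)


# Toy data (made up): three species that belong to the same phylum.
records = (
    [("s__A", f"A_{i}") for i in range(5)]
    + [("s__B", f"B_{i}") for i in range(2)]
    + [("s__C", f"C_{i}") for i in range(4)]
)

selected = cap_per_species(records, 3)
total = sum(len(genomes) for genomes in selected.values())
print(total)  # 3 + 2 + 3 = 8, more than the limit of 3
```

With a limit of 3, the toy phylum still yields 8 genomes, because the limit is counted separately for each species. The same reasoning explains why -n 200 on a whole phylum can produce far more than 200 downloads.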

Thanks for reporting this. We will improve the documentation to make this clearer.

fancyge commented 3 years ago

Hi, thanks a lot for the explanation. I appreciate it!