biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

Both Nucleotide Databases Appear Empty #39

Closed JuliaMcGonigle closed 4 years ago

JuliaMcGonigle commented 4 years ago

Hello! Cool program. I'm trying to get it to work with the supermatrix_nt.cfg file on some of my SAGs, MAGs, and reference sequences acquired through GenBank. I am having issues at the first makeblastdb stage. I have tried both options for -d and get the following error for each:

[e] Command '['/home/jmcgonigle/.conda/envs/phylophlan/bin/makeblastdb', '-parse_seqids', '-dbtype', 'nucl', '-in', 'phylophlan_databases/amphora2/amphora2.fna', '-out', 'phylophlan_databases/amphora2/amphora2']' returned non-zero exit status 1.
[e] Command '['/home/jmcgonigle/.conda/envs/phylophlan/bin/makeblastdb', '-parse_seqids', '-dbtype', 'nucl', '-in', 'phylophlan_databases/phylophlan/phylophlan.fna', '-out', 'phylophlan_databases/phylophlan/phylophlan']' returned non-zero exit status 1.

This seems related to the fact that both databases' .fna files are empty. I added the --verbose option to see where these files are downloaded from, and they seem to be empty at the source as well: if I download them myself and look at the contents, there is nothing in them.

JuliaMcGonigle commented 4 years ago

I'd like to add that the .faa files within the phylophlan.tar and amphora2.tar files seem fine; it's just the .fna files that appear empty.
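
For anyone hitting the same makeblastdb error, a quick way to confirm this diagnosis is to check whether each database file exists, is empty, or actually contains sequences. The helper below is only a sketch: check_fasta is a hypothetical name, not part of PhyloPhlAn, and the paths in the usage comment are taken from the error messages above.

```shell
# Report, for each file given, whether it is missing, empty, or has content.
# check_fasta is a hypothetical helper, not a PhyloPhlAn command.
# Example (paths from the error messages above):
#   check_fasta phylophlan_databases/phylophlan/phylophlan.fna \
#               phylophlan_databases/phylophlan/phylophlan.faa
check_fasta() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
        elif [ ! -s "$f" ]; then
            # -s is true only for files with size greater than zero
            echo "empty: $f"
        else
            # Count FASTA header lines (lines starting with '>').
            # grep exits non-zero when the count is 0, hence the '|| true'.
            n=$(grep -c '^>' "$f" || true)
            echo "ok: $f ($n sequences)"
        fi
    done
}
```

An "empty" result for the .fna file next to an "ok" result for the matching .faa file would be consistent with what is described here.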

fasnicar commented 4 years ago

Hi, many thanks for using PhyloPhlAn.

Both the amphora2 and phylophlan databases are protein databases, so you cannot use BLAST to index them as nucleotides. You should use the supermatrix_aa.cfg configuration file instead.

If you want, and if your inputs are all genomes, you can use the --force_nucleotides parameter in PhyloPhlAn to force the analysis to be done at the nucleotide level instead of the amino acid level. This will, however, require you to regenerate the supermatrix_aa.cfg configuration file, specifying the --force_nucleotides parameter there as well. You can find the parameters for generating the supermatrix_aa.cfg configuration file in the phylophlan_write_default_configs.sh script; you just need to add --force_nucleotides to them.
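
As a rough sketch of the regeneration step described above: the function below only prints the command, so it can be compared against phylophlan_write_default_configs.sh before anything is run. Only --force_nucleotides and the supermatrix_aa.cfg file name come from this thread. The command name phylophlan_write_config_file (it may be installed with a .py suffix) and every other flag and tool choice are assumptions written from memory, so please take the real values from your copy of the script. write_config_cmd is a hypothetical helper name.

```shell
# Dry run: print the command that would regenerate supermatrix_aa.cfg.
# ASSUMPTION: everything except --force_nucleotides and the output file name
# is recalled from memory; verify it against phylophlan_write_default_configs.sh.
write_config_cmd() {
    echo phylophlan_write_config_file \
        -o supermatrix_aa.cfg \
        -d a \
        --db_aa diamond \
        --map_dna diamond \
        --map_aa diamond \
        --msa mafft \
        --trim trimal \
        --tree1 fasttree \
        --tree2 raxml \
        --force_nucleotides
}
```

Calling write_config_cmd prints the full command on one line. Once it matches the supermatrix_aa.cfg entry in the script (plus the extra parameter), it can be pasted into the shell, and the phylophlan run itself then also needs --force_nucleotides.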

I hope this helps, Francesco

JuliaMcGonigle commented 4 years ago

Thank you! I think the problem is resolved by changing the config file as suggested and addressing some unique things with write permissions on the server I'm using.