biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

Phylophlan Error #103

Open afedynak opened 1 year ago

afedynak commented 1 year ago

Hello, I get the following error trying to run Phylophlan.

phylophlan_write_config_file -o config_file_3 --map_dna diamond --map_aa diamond --db_aa diamond -d a --msa mafft --trim trimal --tree1 raxml --verbose 2>&1 | tee phylophlan_config_3.log

phylophlan -i /metagenomics/ --diversity high --fast --database /phylophlan_databases/phylophlan --genome_extension fastq -f config_file_3 -o /phylophlan_out/qq --nproc 12 --verbose 2>&1 | tee phylophlan_3.log

I saw in a previous post that Francesco suggested adding --force_nucleotides. I tried that too, but it still does not work.

I get this error: [e] Command '['/cluster/home/ProgramFiles/miniconda3/bin/diamond', 'blastx', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query-gencode', '11', '--query', '/cluster/projects/metagenomics/phylophlan_out/tmp/clean_dna/HPB-043_868477_Fec_2.R2.fastq', '--db', '/cluster/home/workflow/run_phylophlan/phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', '/cluster/projects/metagenomics/phylophlan_out/tmp/map_dna/HPB-043_868477_Fec_2.R2.b6o.bkp']' returned non-zero exit status 1.

I also tried running diamond on its own, and it hasn't crashed yet, whereas phylophlan crashes right away.

Thank you

fasnicar commented 1 year ago

Hi, judging from the .fastq extension of your input files, I wonder whether the input format is the issue. PhyloPhlAn does not work with FASTQ input and requires files in FASTA format. Diamond itself accepts FASTQ, but PhyloPhlAn cannot handle such files internally.

Many thanks, Francesco
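A minimal sketch of the format conversion implied above, assuming plain (uncompressed) FASTQ files with exactly four lines per record. The function name fastq_to_fasta is hypothetical and not part of PhyloPhlAn. It only changes the file format; whether the resulting sequences are suitable input for PhyloPhlAn is a separate question that this thread does not settle.

```shell
#!/bin/sh
# fastq_to_fasta: hypothetical helper, not part of PhyloPhlAn.
# Assumes four-line FASTQ records: @header, sequence, "+", qualities.
fastq_to_fasta() {
    # Keep line 1 of each record (replacing the leading "@" with ">")
    # and line 2 (the sequence); drop the "+" and quality lines.
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }
         NR % 4 == 2 { print }' "$1"
}

# Example usage (file names are placeholders):
#   fastq_to_fasta sample.fastq > sample.fasta
```

The converted file would then be passed to phylophlan with a matching --genome_extension value instead of fastq.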

shibormi commented 8 months ago

Hi, I ran the same commands but still got an error. Could you please help me figure out what the cause might be?

phylophlan_v3.0.3.sif phylophlan_write_config_file \
    -o custom_config_nt.cfg \
    -d n \
    --db_dna makeblastdb \
    --map_dna blastn \
    --force_nucleotides \
    --msa muscle \
    --trim trimal \
    --tree1 fasttree \
    --tree2 raxml 2>&1 | tee phylophlan_write_config_file.log

phylophlan_v3.0.3.sif phylophlan \
    -i 07_dReplication/Low_Quality/dereplicated_genomes \
    --genome_extension .fa \
    -d blastx \
    -f custom_config_nt.cfg \
    --force_nucleotides \
    --diversity high \
    --fast \
    -o output_tol \
    --nproc 32 \
    --verbose 2>&1 | tee phylophlan.log

error: [e] both db_dna and db_aa are None!

fasnicar commented 7 months ago

Hello @shibormi, it appears that the -d blastx parameter in the phylophlan command is not correct. The -d value should name the database of markers to use, usually either phylophlan or amphora2, the default available options. I believe the error

error:
[e] both db_dna and db_aa are None!

occurs because PhyloPhlAn does not recognize blastx as a valid database of either genes or proteins.
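As a rough illustration of the point above, a pre-flight check could catch a misplaced -d value before PhyloPhlAn starts. This is only a sketch: check_marker_db is a hypothetical helper, not a PhyloPhlAn command, and it knows only the two default database names mentioned in this thread, so a legitimately prepared custom database would also trigger the warning.

```shell
#!/bin/sh
# check_marker_db: hypothetical pre-flight helper, not part of PhyloPhlAn.
# Accepts the two default marker database names named in this thread
# and warns about anything else.
check_marker_db() {
    case "$1" in
        phylophlan|amphora2)
            echo "ok: '$1' is a default marker database"
            ;;
        *)
            echo "warning: '$1' is not a default marker database; it must be a custom database you prepared" >&2
            return 1
            ;;
    esac
}

# Example usage:
#   check_marker_db blastx || echo "check the value passed to -d"
```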

In your config file, you specified

-d n
--db_dna makeblastdb
--map_dna blastn

which means that you intended to use your own custom database of genes (the --force_nucleotides option won't make a difference there).

So, either blastx is a database of genes that you prepared for PhyloPhlAn, or, if you aim to reconstruct a very large phylogeny with tree-of-life diversity (as --diversity high --fast suggests), I would suggest using the universal proteins provided with the phylophlan database.
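A sketch of what the second option might look like, under several assumptions: every flag and value is copied from commands posted earlier in this thread, the config file name custom_config_aa.cfg is hypothetical, the container prefix is omitted, and this particular combination of tools has not been verified here. The run wrapper is also hypothetical; it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands by default; set DRY_RUN=0 to execute.
# All flag values come from commands quoted in this thread; the config
# file name custom_config_aa.cfg is a placeholder.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Config for an amino-acid marker database (-d a) mapped with diamond.
run phylophlan_write_config_file \
    -o custom_config_aa.cfg \
    -d a \
    --db_aa diamond \
    --map_dna diamond \
    --map_aa diamond \
    --msa mafft \
    --trim trimal \
    --tree1 fasttree \
    --tree2 raxml

# Here -d names the marker database itself (the default "phylophlan" set),
# not a mapping tool.
run phylophlan \
    -i 07_dReplication/Low_Quality/dereplicated_genomes \
    --genome_extension .fa \
    -d phylophlan \
    -f custom_config_aa.cfg \
    --diversity high \
    --fast \
    -o output_tol \
    --nproc 32 \
    --verbose
```

Printing first makes it easy to confirm that -d receives a database name before committing to a long run.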

I hope this helps. If not, please provide more details and I'll try to give you more specific feedback.

Thanks, Francesco