biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

UnboundLocalError: local variable 'input_fna_clean' referenced before assignment #42

Closed fujch7 closed 3 years ago

fujch7 commented 4 years ago

Hi developer, I ran into the error below. How can I deal with it? Also, regarding the -d argument: after the first run, the database file phylophlan.dmnd was produced. Can I just specify the path to the database, e.g. /path/phylophlan_databases/phylophlan/phylophlan.dmnd, for subsequent runs? I would appreciate your help.

    (phylophlan) [root@instance-xvlawteu phylophlan3]# phylophlan -i /home/prodigal/output/0827fna/ --accurate --diversity low -d phylophlan -f supermatrix_aa.cfg -t a --nproc 2 --output_folder /home/phylophlan3/phy_output/ --proteome_extension faa
    Loading files from "/home/prodigal/output/0827fna"
    Traceback (most recent call last):
      File "/root/anaconda3/envs/phylophlan/bin/phylophlan", line 10, in <module>
        sys.exit(phylophlan_main())
      File "/root/anaconda3/envs/phylophlan/lib/python3.8/site-packages/phylophlan/phylophlan.py", line 3213, in phylophlan_main
        standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
      File "/root/anaconda3/envs/phylophlan/lib/python3.8/site-packages/phylophlan/phylophlan.py", line 2910, in standard_phylogeny_reconstruction
        if input_fna_clean:
    UnboundLocalError: local variable 'input_fna_clean' referenced before assignment

fujch7 commented 4 years ago

The above problem occurred when I used amino acid sequences in FAA files as input.
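For readers unfamiliar with this class of error: in Python, a name assigned anywhere inside a function is local to that function, so reading it on a path where the assignment was skipped raises UnboundLocalError. The sketch below reproduces that pattern under the assumption (not confirmed in this thread) that input_fna_clean is only assigned when genome files are present. The function names and the simplified logic are hypothetical and are not PhyloPhlAn's code.

```python
def reconstruct_buggy(genomes, proteomes):
    # Hypothetical stand-in: the variable is bound only on the genome branch.
    if genomes:
        input_fna_clean = [g.lower() for g in genomes]
    # With proteome-only input the name was never bound, so this read fails.
    if input_fna_clean:
        return "mapped genomes"
    return "proteomes only"


def reconstruct_fixed(genomes, proteomes):
    # Binding a default before any branch makes every path safe.
    input_fna_clean = None
    if genomes:
        input_fna_clean = [g.lower() for g in genomes]
    if input_fna_clean:
        return "mapped genomes"
    return "proteomes only"


try:
    reconstruct_buggy([], ["sample.faa"])
except UnboundLocalError as exc:
    print(f"buggy: {exc}")

print(f"fixed: {reconstruct_fixed([], ['sample.faa'])}")
```

The exact wording of the exception message depends on the Python version, but the exception type is the same.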

If I use genomes as input together with the --nproc parameter, a different problem occurs during mapping. See below:

_'/home/fujch/.local/share/flatpak/exports/share/:/var/lib/flatpak/exports/share/:/usr/local/share/:/usr/share/', 'DBUS_SESSION_BUS_ADDRESS': 'unix:abstract=/tmp/dbus-YtQYAttD9u,guid=bf43da6c507aafd9f4b9dbb25f54b77a', 'LESSOPEN': '||/usr/bin/lesspipe.sh %s', 'CONDA_DEFAULT_ENV': 'phylophlan', 'WINDOWPATH': '1', 'XDG_RUNTIME_DIR': '/run/user/1000', 'DISPLAY': ':0', 'XDG_CURRENT_DESKTOP': 'GNOME-Classic:GNOME', 'COLORTERM': 'truecolor', 'XAUTHORITY': '/root/.xauth3BEJWC'}

[e] Command '['/root/anaconda3/envs/phylophlan/bin/diamond', 'blastx', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query', '/mnt/hgfs/share/phylophlan3/output/0827fna_phylophlan/tmp/clean_dna/Aerosticca_soli_Dysh456_T_GCA_003967035.1_genomic.fna', '--db', 'phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', '/mnt/hgfs/share/phylophlan3/output/0827fna_phylophlan/tmp/map_dna/Aerosticca_soli_Dysh456_T_GCA_003967035.1_genomic.b6o.bkp']' died with <Signals.SIGABRT: 6>.

[e] error while mapping {'program_name': '/root/anaconda3/envs/phylophlan/bin/diamond', 'params': 'blastx --quiet --threads 1 --outfmt 6 --more-sensitive --id 50 --max-hsps 35 -k 0', 'input': '--query', 'database': '--db', 'output': '--out', 'version': 'version', 'command_line': '#program_name# #params# #input# #database# #output#'} /mnt/hgfs/share/phylophlan3/output/0827fna_phylophlan/tmp/clean_dna/Aerosticca_soli_Dysh456_T_GCA_003967035.1_genomic.fna phylophlan_databases/phylophlan/phylophlan.dmnd /mnt/hgfs/share/phylophlan3/output/0827fna_phylophlan/tmp/map_dna Aerosticca_soli_Dysh456_T_GCA_003967035.1_genomic.b6o.bkp False

[e] Command '['/root/anaconda3/envs/phylophlan/bin/diamond', 'blastx', '--quiet', '--threads', '1', '--outfmt', '6', '--more-sensitive', '--id', '50', '--max-hsps', '35', '-k', '0', '--query', '/mnt/hgfs/share/phylophlan3/output/0827fna_phylophlan/tmp/clean_dna/Aerosticca_soli_Dysh456_T_GCA_003967035.1_genomic.fna', '--db', 'phylophlan_databases/phylophlan/phylophlan.dmnd', '--out', '/mnt/hgfs/share/phylophlan3/output/0827fna_phylophlan/tmp/map_dna/Aerosticca_soli_Dysh456_T_GCA_003967035.1_genomic.b6o.bkp']' died with <Signals.SIGABRT: 6>.

[e] gene_markers_identification crashed
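A note on reading this log: the wording "died with <Signals.SIGABRT: 6>" is how Python's subprocess module reports a child process that was terminated by a signal, which suggests that diamond itself aborted rather than PhyloPhlAn raising the error. A minimal sketch for a POSIX system, using a child Python process that calls os.abort() as a stand-in for diamond (the stand-in and the helper name run_and_report are assumptions for illustration only):

```python
import signal
import subprocess
import sys

# Stand-in for the external tool: a child interpreter that aborts itself.
cmd = [sys.executable, "-c", "import os; os.abort()"]


def run_and_report(command):
    """Run a command and return (return code, message) describing how it ended."""
    try:
        subprocess.run(command, check=True, stderr=subprocess.DEVNULL)
    except subprocess.CalledProcessError as exc:
        # On POSIX, a negative return code means "terminated by signal -N".
        return exc.returncode, str(exc)
    return 0, "ok"


code, message = run_and_report(cmd)
print(code, message)
```

Re-running the logged diamond command by hand, without --quiet, may show more detail about why it aborted.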

I also have another question. Whether I use nucleotide or amino acid files as input, are the phylogenetic trees based on amino acid sequences as long as the --force_nucleotides parameter is not used? And can I use --force_nucleotides to force the tree to be built from nucleotide sequences?

fasnicar commented 4 years ago

Hi,

Can you please provide the version of PhyloPhlAn you're using (--version)?


For the -d parameter: next time, just specify the same -d phylophlan. If you use diamond, PhyloPhlAn will detect that the indexed version is already present.
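As a rough illustration of what "already present" means here, the sketch below checks for the index file using the folder layout visible in the log above (phylophlan_databases/phylophlan/phylophlan.dmnd). The function names are hypothetical, and this is not PhyloPhlAn's own detection code.

```python
from pathlib import Path


def diamond_index_path(databases_folder, db_name):
    # Layout taken from the log in this thread:
    # <databases_folder>/<db_name>/<db_name>.dmnd
    return Path(databases_folder) / db_name / f"{db_name}.dmnd"


def index_already_present(databases_folder, db_name):
    # Hypothetical check: the index counts as present if the file exists.
    return diamond_index_path(databases_folder, db_name).is_file()


print(diamond_index_path("phylophlan_databases", "phylophlan"))
print(index_already_present("phylophlan_databases", "phylophlan"))
```

In other words, the value given to -d stays the database name (phylophlan), not the path to the .dmnd file.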


UnboundLocalError: local variable 'input_fna_clean' referenced before assignment

What is the content of the /home/prodigal/output/0827fna/ folder?


I also have another question. Whether I use nucleotide or amino acid files as input, are the phylogenetic trees based on amino acid sequences as long as the --force_nucleotides parameter is not used? And can I use --force_nucleotides to force the tree to be built from nucleotide sequences?

If you use --force_nucleotides, only the input genomes will be used; this option is generally used when your inputs are only genomes and the database is a set of proteins. If your inputs are a mix of genomes and proteomes, then you can only build a phylogeny based on amino acids, so you should not specify the --force_nucleotides parameter.
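The rule above can be sketched as a small check on the input file names. This is only an illustration: the helper name can_force_nucleotides is hypothetical, and the extensions .fna (genomes) and .faa (proteomes) are assumptions based on the files mentioned in this thread, not a description of how PhyloPhlAn classifies its inputs.

```python
from pathlib import Path


def can_force_nucleotides(filenames, genome_ext=".fna", proteome_ext=".faa"):
    """Return True only when the inputs are genomes with no proteomes mixed in."""
    suffixes = {Path(name).suffix for name in filenames}
    has_genomes = genome_ext in suffixes
    has_proteomes = proteome_ext in suffixes
    return has_genomes and not has_proteomes


# Genomes only: a nucleotide-based phylogeny is an option.
print(can_force_nucleotides(["a.fna", "b.fna"]))  # True
# Mixed genomes and proteomes: amino acids only.
print(can_force_nucleotides(["a.fna", "b.faa"]))  # False
```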

fujch7 commented 4 years ago

  1. PhyloPhlAn version 3.0.54 (26 August 2020)
  2. The /home/prodigal/output/0827fna/ folder contains protein sequences predicted by prodigal (v2.6.3), and the character "" in the output files produced by prodigal has been deleted. I know the character "" causes an error in phylophlan version 1.
  3. When the inputs are only genomes and the database consists of proteins, using --force_nucleotides will build the phylogeny from nucleotide sequences. Is that right?

fasnicar commented 4 years ago

Many thanks for the version.

I guess the character is the "_", right? I don't think that should be a problem with the new version of PhyloPhlAn.

For 3, yes: if your inputs are genomes and the database is made of proteins, then with --force_nucleotides the phylogeny will be built using nucleotides.

Does your input folder also contain genomes?

fasnicar commented 4 years ago

Hi, this should be fixed with commit d723a3c703f41a361d153949460ba561b9879478. This version is not yet packaged in Bioconda, so you should get the code directly from the repository.

Many thanks, Francesco