biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

local variable 'input_faa_clean' referenced before assignment #9

Closed mcahn closed 4 years ago

mcahn commented 4 years ago

Hi, still trying to get a successful run on Example 02 tree of life. I'm using the first 10 files in input_genomes to make a shorter test.

Traceback (most recent call last):
  File "/tigress/MOLBIO/local/pythonenv/phylophlan/bin/phylophlan", line 11, in <module>
    load_entry_point('PhyloPhlAn==3.0', 'console_scripts', 'phylophlan')()
  File "/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan.py", line 3200, in phylophlan_main
    standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
  File "/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan.py", line 3005, in standard_phylogeny_reconstruction
    all_inputs = (os.path.splitext(os.path.basename(i))[0] for i in input_faa_clean)
UnboundLocalError: local variable 'input_faa_clean' referenced before assignment

Best, Matthew Cahn

fasnicar commented 4 years ago

Hi Matthew,

Can you provide the full command line? It would also be helpful if you could remove the output folder, re-run PhyloPhlAn with the --verbose option, save the output, and send that as well.

A first explanation of the error: you're running PhyloPhlAn with the --force_nucleotides parameter, but you end up with no alignment files. That is problematic because it means either that (1) the parameters you set discard all alignment files, or (2) not enough marker genes are found in your 10 test input genomes, and hence the genomes are not used to extract the markers.
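For readers unfamiliar with the error type itself: Python raises UnboundLocalError when a function reads a local name that was only assigned on a code path that did not run. The sketch below is a generic illustration, not PhyloPhlAn's code; the function and variable names are hypothetical.

```python
def collect_inputs(paths):
    # Hypothetical stand-in, not PhyloPhlAn's code: 'cleaned' is only
    # assigned when there is at least one path to process.
    if paths:
        cleaned = [p.strip() for p in paths]
    return cleaned  # raises UnboundLocalError when 'paths' is empty


def collect_inputs_safe(paths):
    cleaned = []  # bind the name on every code path
    if paths:
        cleaned = [p.strip() for p in paths]
    return cleaned


try:
    collect_inputs([])
except UnboundLocalError as exc:
    # The exact message wording depends on the Python version.
    print("failed:", exc)

print(collect_inputs_safe([]))
```

Binding the name before the conditional is one common way to avoid this class of error; whether an empty list is then an acceptable result is a separate design decision.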

Many thanks, Francesco

mcahn commented 4 years ago

Hi Francesco,

PhyloPhlAn version 0.48 (22 April 2020)

Command line: /tigress/MOLBIO/local/pythonenv/phylophlan/bin/phylophlan -i input_genomes -d phylophlan -f 02_tol.cfg --diversity high --fast -o output_tol --nproc 28 --verbose

Automatically setting "input=input_genomes" and "input_folder=/tigress/mcahn/phylophlan3_test"
Creating folder "output_tol"
Creating folder "output_tol/tmp"
"high-fast" preset
Setting "sort=True" because "database=phylophlan"
Arguments: {'input': 'input_genomes', 'clean': None, 'output': 'output_tol', 'database': 'phylophlan', 'db_type': None, 'config_file': '02_tol.cfg', 'diversity': 'high', 'accurate': False, 'fast': True, 'clean_all': False, 'database_list': False, 'submat': 'pfasum60', 'submat_list': False, 'submod_list': False, 'nproc': 28, 'min_num_proteins': 1, 'min_len_protein': 50, 'min_num_markers': 1, 'trim': 'greedy', 'gap_perc_threshold': 0.67, 'not_variant_threshold': 0.9, 'subsample': <function phylophlan at 0x2ac2becac7a0>, 'unknown_fraction': 0.3, 'scoring_function': <function trident at 0x2ac2becacdd0>, 'sort': True, 'remove_fragmentary_entries': False, 'fragmentary_threshold': 0.67, 'min_num_entries': 4, 'maas': None, 'remove_only_gaps_entries': False, 'mutation_rates': False, 'force_nucleotides': False, 'input_folder': '/tigress/mcahn/phylophlan3_test/input_genomes', 'data_folder': 'output_tol/tmp', 'databases_folder': 'phylophlan_databases/', 'submat_folder': '/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan_substitution_matrices/', 'submod_folder': '/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan_substitution_models/', 'configs_folder': '/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan_configs/', 'output_folder': '', 'genome_extension': '.fna', 'proteome_extension': '.faa', 'update': False, 'verbose': True}
Loading configuration file "02_tol.cfg"
Checking configuration file
Checking "/tigress/MOLBIO/local/diamond-0.9.31/diamond"
Checking "/tigress/MOLBIO/local/bin/mafft"
Checking "/tigress/MOLBIO/local/bin/trimal"
Checking "/tigress/MOLBIO/local/bin/iqtree"
"db_aa" database "phylophlan_databases/phylophlan/phylophlan.dmnd" present
Loading files from "/tigress/mcahn/phylophlan3_test/input_genomes"
Decompressing input files
Creating folder "output_tol/tmp/uncompressed"
Creating folder "output_tol/tmp/map_dna"
Mapping "phylophlan" on 10 inputs (key: "map_dna")
Mapping "output_tol/tmp/uncompressed/GCA_000006805.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006925.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006745.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006765.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006175.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000005825.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006335.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006985.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006845.fna"
Mapping "output_tol/tmp/uncompressed/GCA_000006905.fna"
"GCA_000006175.b6o.bkp" generated in 442s
"GCA_000006805.b6o.bkp" generated in 508s
"GCA_000006845.b6o.bkp" generated in 544s
"GCA_000006985.b6o.bkp" generated in 548s
"GCA_000006905.b6o.bkp" generated in 646s
"GCA_000005825.b6o.bkp" generated in 679s
"GCA_000006925.b6o.bkp" generated in 682s
"GCA_000006745.b6o.bkp" generated in 686s
"GCA_000006765.b6o.bkp" generated in 785s
"GCA_000006335.b6o.bkp" generated in 789s
Selecting 10 markers from "output_tol/tmp/map_dna"
Selecting "output_tol/tmp/map_dna/GCA_000006745.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006985.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006905.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000005825.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006175.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006805.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006335.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006765.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006845.b6o.bkp"
Selecting "output_tol/tmp/map_dna/GCA_000006925.b6o.bkp"
"output_tol/tmp/map_dna/GCA_000006175.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006805.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006335.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006985.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006845.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000005825.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006765.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006905.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006925.b6o.bz2" generated in 0s
"output_tol/tmp/map_dna/GCA_000006745.b6o.bz2" generated in 1s
Creating folder "output_tol/tmp/markers_dna"
Extracting markers from 10 inputs
Extracting "output_tol/tmp/map_dna/GCA_000006805.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006745.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006335.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000005825.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006765.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006905.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006845.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006175.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006925.b6o.bz2"
Extracting "output_tol/tmp/map_dna/GCA_000006985.b6o.bz2"
"output_tol/tmp/markers_dna/GCA_000006175.fna.bz2" generated in 1s
"output_tol/tmp/markers_dna/GCA_000006335.fna.bz2" generated in 2s
"output_tol/tmp/markers_dna/GCA_000006805.fna.bz2" generated in 5s
"output_tol/tmp/markers_dna/GCA_000006845.fna.bz2" generated in 11s
"output_tol/tmp/markers_dna/GCA_000006985.fna.bz2" generated in 14s
"output_tol/tmp/markers_dna/GCA_000006745.fna.bz2" generated in 40s
"output_tol/tmp/markers_dna/GCA_000006905.fna.bz2" generated in 45s
"output_tol/tmp/markers_dna/GCA_000005825.fna.bz2" generated in 59s
"output_tol/tmp/markers_dna/GCA_000006925.fna.bz2" generated in 71s
"output_tol/tmp/markers_dna/GCA_000006765.fna.bz2" generated in 143s
Creating folder "output_tol/tmp/fake_proteomes"
Fake proteomes already generated
Loading files from "output_tol/tmp/fake_proteomes"
Loading files from "/tigress/mcahn/phylophlan3_test/input_genomes"
Checking 0 inputs
Creating folder "output_tol/tmp/markers"
Creating folder "output_tol/tmp/msas"
Markers already aligned (key: "msa")
Creating folder "output_tol/tmp/trim_gap_trim"
Markers already trimmed (key: "trim")
Creating folder "output_tol/tmp/trim_gap_perc"
Markers already trimmed
Creating folder "output_tol/tmp/trim_not_variant"
Markers already trimmed
Substitution matrix "/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan_substitution_matrices/pfasum60.pkl" loaded
Creating folder "output_tol/tmp/sub"
Markers already subsampled
Traceback (most recent call last):
  File "/tigress/MOLBIO/local/pythonenv/phylophlan/bin/phylophlan", line 11, in <module>
    load_entry_point('PhyloPhlAn==3.0', 'console_scripts', 'phylophlan')()
  File "/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan.py", line 3200, in phylophlan_main
    standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
  File "/tigress/MOLBIO/local/pythonenv/phylophlan/lib/python3.7/site-packages/PhyloPhlAn-3.0-py3.7.egg/phylophlan/phylophlan.py", line 3005, in standard_phylogeny_reconstruction
    all_inputs = (os.path.splitext(os.path.basename(i))[0] for i in input_faa_clean)
UnboundLocalError: local variable 'input_faa_clean' referenced before assignment
srun: error: tiger-h21d1: task 0: Exited with exit code 1
srun: Terminating job step 4649460.0

zckoo007 commented 4 years ago

I have the same problem.

I downloaded PhyloPhlAn version 3 and am following the example "High-resolution phylogeny of 135 Staphylococcus aureus isolate genomes". The error occurs when I run Step 3, "Build the phylogeny of the 135 S. aureus isolate genomes":

phylophlan \
    -i input_isolates \
    -o output_isolates \
    -d s__Staphylococcus_aureus \
    --trim greedy \
    --not_variant_threshold 0.99 \
    --remove_fragmentary_entries \
    --fragmentary_threshold 0.67 \
    --min_num_entries 135 \
    -t a \
    -f isolates_config.cfg \
    --diversity low \
    --force_nucleotides \
    --nproc 4 \
    --verbose 2>&1 | tee logs/phylophlan__output_isolates.log

I get this error:

Traceback (most recent call last):
  File "/miniconda3/envs/phylophlan/bin/phylophlan", line 10, in <module>
    sys.exit(phylophlan_main())
  File "/miniconda3/envs/phylophlan/lib/python3.7/site-packages/phylophlan/phylophlan.py", line 3229, in phylophlan_main
    standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
  File "/miniconda3/envs/phylophlan/lib/python3.7/site-packages/phylophlan/phylophlan.py", line 3035, in standard_phylogeny_reconstruction
    all_inputs = (os.path.splitext(os.path.basename(i))[0] for i in input_faa_clean)
UnboundLocalError: local variable 'input_faa_clean' referenced before assignment

zckoo007 commented 4 years ago

It seems the problem is here:

Fake proteomes already generated
Loading files from "output_tol/tmp/fake_proteomes"
Loading files from "/tigress/mcahn/phylophlan3_test/input_genomes"
Checking 0 inputs

The fake_proteomes directory is empty, so there are 0 inputs?
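A quick way to confirm this observation is to count the files in the intermediate folders by extension. The following is a minimal sketch using only the standard library; the helper name count_by_suffix is hypothetical, and the folder paths are taken from the log above, so adjust them to your own output folder.

```python
from collections import Counter
from pathlib import Path


def count_by_suffix(folder):
    """Count regular files in 'folder' by full suffix, e.g. '.faa' or '.fna.bz2'."""
    folder = Path(folder)
    if not folder.is_dir():
        return Counter()  # a missing folder is reported as empty
    return Counter("".join(p.suffixes) for p in folder.iterdir() if p.is_file())


# Folder names as they appear in the log; the base path is an assumption.
for sub in ("markers_dna", "fake_proteomes"):
    print(sub, dict(count_by_suffix(Path("output_tol/tmp") / sub)))
```

If markers_dna contains .fna.bz2 files while fake_proteomes reports nothing, the pipeline stopped producing files between those two steps, which matches the "Checking 0 inputs" line.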

fasnicar commented 4 years ago

Hi @zckoo007, in your case the situation is different, as you're specifying --force_nucleotides, which disables proteome generation.

Can you attach the log file as well?

Many thanks, Francesco

zckoo007 commented 4 years ago

Sorry, I deleted the Step 3 log, but Step 4 fails with the same problem:

phylophlan.py \
    -i input_references \
    -o output_references \
    -d s__Staphylococcus_aureus \
    -t a -f references_config.cfg \
    --nproc 28 \
    --subsample twentyfivepercent \
    --diversity low \
    --fast \
    2>&1 |tee logs/phylophlan__reference_genomes__s__Staphylococcus_aureus.log

Log:

Loading files from "/software/phylophlan/phylophlan/phylophlan/examples/01_saureus/input_references"
Mapping "s__Staphylococcus_aureus" on 40 inputs (key: "map_dna")
Mapping "output_references/tmp/uncompressed/GCA_900041035.fna"
Mapping "output_references/tmp/uncompressed/GCA_000536615.fna"
Mapping "output_references/tmp/uncompressed/GCA_000556505.fna"
...
Mapping "output_references/tmp/uncompressed/GCA_000602085.fna"
"GCA_002274055.b6o.bkp" generated in 7s
Mapping "output_references/tmp/uncompressed/GCA_000558485.fna"
"GCA_000602085.b6o.bkp" generated in 8s
...
"GCA_000535955.b6o.bkp" generated in 10s
Mapping "output_references/tmp/uncompressed/GCA_003240135.fna"
"GCA_000637375.b6o.bkp" generated in 10s
"GCA_900046185.b6o.bkp" generated in 10s
...
"GCA_001027105.b6o.bkp" generated in 21s
Selecting 1135 markers from "output_references/tmp/map_dna"
Selecting "output_references/tmp/map_dna/GCA_003239235.b6o.bkp"
Selecting "output_references/tmp/map_dna/GCA_900038765.b6o.bkp"
...
Selecting "output_references/tmp/map_dna/GCA_900020425.b6o.bkp"
"output_references/tmp/map_dna/GCA_002097525.b6o.bz2" generated in 0s
"output_references/tmp/map_dna/GCA_000561485.b6o.bz2" generated in 0s
...
"output_references/tmp/markers_dna/GCA_000636715.fna.bz2" generated in 0s
Extracting "output_references/tmp/map_dna/GCA_001212145.b6o.bz2"
"output_references/tmp/markers_dna/GCA_900018705.fna.bz2" generated in 0s
"output_references/tmp/markers_dna/GCA_900049425.fna.bz2" generated in 0s
Extracting "output_references/tmp/map_dna/GCA_900064245.b6o.bz2"
...
"output_references/tmp/markers_dna/GCA_001045995.fna.bz2" generated in 5s
"output_references/tmp/markers_dna/GCA_000025145.fna.bz2" generated in 5s
Fake proteomes already generated
Loading files from "output_references/tmp/fake_proteomes"
Loading files from "/public/home/sample_lib/ckzhu/software/phylophlan/phylophlan/phylophlan/examples/01_saureus/input_references"
Checking 0 inputs
Markers already aligned (key: "msa")
Markers already trimmed (key: "trim")
Markers already trimmed
Markers already trimmed
Markers already subsampled
Traceback (most recent call last):
  File "phylophlan.py", line 3205, in <module>
    phylophlan_main()
  File "phylophlan.py", line 3200, in phylophlan_main
    standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
  File "phylophlan.py", line 3005, in standard_phylogeny_reconstruction
    all_inputs = (os.path.splitext(os.path.basename(i))[0] for i in input_faa_clean)
UnboundLocalError: local variable 'input_faa_clean' referenced before assignment

fasnicar commented 4 years ago

Many thanks, Matthew, for reporting this. I think I now see where the problem is: I recently added support for handling compressed files, but that function is still looking for uncompressed files.
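To illustrate the kind of mismatch described here (this is not the code from the fix, which is not reproduced in this thread): a lookup that matches only the plain extension silently returns nothing once the files on disk carry a compression suffix. Both helper names below are hypothetical, and the .bz2 suffix is taken from the file names in the logs.

```python
import glob
import os


def find_uncompressed_only(folder, extension):
    """Match only plain files such as 'x.faa'; 'x.faa.bz2' is missed."""
    return sorted(glob.glob(os.path.join(folder, "*" + extension)))


def find_with_compressed(folder, extension, compressed=(".bz2",)):
    """Match plain files and their compressed variants, e.g. 'x.faa.bz2'."""
    found = set(glob.glob(os.path.join(folder, "*" + extension)))
    for comp in compressed:
        found.update(glob.glob(os.path.join(folder, "*" + extension + comp)))
    return sorted(found)
```

With a folder that holds only compressed files, the first helper returns an empty list, which is consistent with a later step reporting zero inputs.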

With commit 5fdce525a18b0a0f9be53e129bde80d3cc5a7205 I should have fixed this issue, which should also be the one reported by @zckoo007.

Unfortunately, I won't be able to update the package in Bioconda with this fix, so if you can get PhyloPhlAn directly from the repository, that would be great.

Many thanks, Francesco

zckoo007 commented 4 years ago

Thanks for your prompt reply. It's a really good tool.

mcahn commented 4 years ago

I've finally gotten back to this. I installed the latest GitHub version today and ran the Example 02 Tree of Life example (on the first 10 files), and it worked. Thanks very much.

Best, Matthew

fasnicar commented 4 years ago

Thank you both for reporting this and helping to improve PhyloPhlAn!