biobakery / phylophlan

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
https://huttenhower.sph.harvard.edu/phylophlan
MIT License

[BUG] `phylophlan_configs` folder does not exists #73

Open Mxrcon opened 2 years ago

Mxrcon commented 2 years ago

Hey there! I think I found a bug in the latest version. I'm using this Docker image: quay.io/biocontainers/phylophlan:3.0.2--py_0, and I'm running phylophlan like this:

phylophlan -i test -d phylophlan --diversity low -f supertree_nt.cfg --nproc 2

And I'm receiving this error:

"/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan_configs/" folder does not exists                                                             
Traceback (most recent call last):
  File "/usr/local/bin/phylophlan", line 10, in <module>
    sys.exit(phylophlan_main())
  File "/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 3194, in phylophlan_main
    project_name = check_args(args, sys.argv, verbose=args.verbose)
  File "/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 493, in check_args
    elif os.path.isfile(os.path.join(args.configs_folder, args.config_file)):
  File "/usr/local/lib/python3.9/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I'm not sure what is happening; I'd be very happy to provide more information if necessary! Best regards, Davi

fasnicar commented 2 years ago

Dear Davi, many thanks for reporting this. Can you please provide the PhyloPhlAn version you're using with:

phylophlan --version

Also, if you can, it would be helpful to have the full standard output using the --verbose parameter.

Many thanks, Francesco

Mxrcon commented 2 years ago

Thanks for your response, and sorry for the delay in replying. Here is the output of phylophlan --version: PhyloPhlAn version 3.0.60 (27 November 2020)

Outputs:

/usr/local/bin/phylophlan -i test -d phylophlan --diversity low -f supertree_nt.cfg --nproc 2 --verbose

Creating folder "phylophlan_databases/"
Automatically setting "input=test" and "input_folder=/tmp/tmpxs_zsnjw/d4/a8b4a813752560e07b29f0bf9c0b22"
Creating folder "test_phylophlan"
Creating folder "test_phylophlan/tmp"
"low-accurate" preset
Setting "sort=True" because "database=phylophlan"
Setting "min_num_markers=100" since no value has been specified and the "database=phylophlan"

stderr:

"/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan_configs/" folder does not exists
Traceback (most recent call last):
  File "/usr/local/bin/phylophlan", line 10, in <module>
    sys.exit(phylophlan_main())
  File "/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 3194, in phylophlan_main
    project_name = check_args(args, sys.argv, verbose=args.verbose)
  File "/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 493, in check_args
    elif os.path.isfile(os.path.join(args.configs_folder, args.config_file)):
  File "/usr/local/lib/python3.9/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

fasnicar commented 2 years ago

Dear Davi,

Many thanks for the outputs. I believe this is fixed in the latest version, which is only on GitHub and not on Bioconda yet. Could you get PhyloPhlAn from the GitHub repo and test that version? Then, if everything works fine, I can push a new version to Bioconda as well.

Many thanks, Francesco

Mxrcon commented 2 years ago

Yes, sure, I'll test with the latest version. It'd be great if you updated the Bioconda recipe; I'll certainly be happy to use it!

Thankfully, Davi

rpetit3 commented 2 years ago

I think an updated version on Bioconda would be ideal, but if you think it's worthwhile I can temporarily add a patch to the current Bioconda version. Just point me in the right direction.

fasnicar commented 2 years ago

Yes, agreed. The only thing is that before pushing a new version to Bioconda I'm waiting to collect enough commits/edits (which is probably the case now). So, if Davi is able to test PhyloPhlAn from the repo and it works, I will push a new package to conda. But if we find other issues, I'd prefer to fix them before pushing a new version to Bioconda. Does that make sense?

Re your question Robert:

I think an updated version on Bioconda would be ideal, but if you think it's worthwhile I can temporarily add a patch to the current Bioconda version. Just point me in the right direction.

The patch here would be applying all the commits after the 3.0.2 tag, but probably it is easier to just pull the repo?

Many thanks, Francesco

rpetit3 commented 2 years ago

Makes total sense, and I am in full agreement with you!

Mxrcon commented 2 years ago

I've replicated the error using the latest version. phylophlan --version output:

PhyloPhlAn version 3.0.64 (8 July 2021)

phylophlan output:

PhyloPhlAn version 3.0.64 (8 July 2021)

Command line: /usr/local/bin/phylophlan -i test -d phylophlan --diversity low -f supertree_nt.cfg --nproc 2 --verbose

Automatically setting "database=phylophlan" and "databases_folder=/tmp/tmpxs_zsnjw/d4/a8b4a813752560e07b29f0bf9c0b22"
Automatically setting "input=test" and "input_folder=/tmp/tmpxs_zsnjw/d4/a8b4a813752560e07b29f0bf9c0b22"
[e] "/usr/local/lib/python3.8/dist-packages/PhyloPhlAn-3.0.2-py3.8.egg/phylophlan/phylophlan_configs/" folder does not exists
"low-accurate" preset
Setting "sort=True" because "database=phylophlan"
Setting "min_num_markers=100" since no value has been specified and the "database=phylophlan"
Traceback (most recent call last):
  File "/usr/local/bin/phylophlan", line 11, in <module>
    load_entry_point('PhyloPhlAn==3.0.2', 'console_scripts', 'phylophlan')()
  File "/usr/local/lib/python3.8/dist-packages/PhyloPhlAn-3.0.2-py3.8.egg/phylophlan/phylophlan.py", line 3206, in phylophlan_main
    project_name = check_args(args, sys.argv, verbose=args.verbose)
  File "/usr/local/lib/python3.8/dist-packages/PhyloPhlAn-3.0.2-py3.8.egg/phylophlan/phylophlan.py", line 493, in check_args
    elif os.path.isfile(os.path.join(args.configs_folder, args.config_file)):
  File "/usr/lib/python3.8/posixpath.py", line 76, in join
    a = os.fspath(a)

I made sure to run conda deactivate first, so that conda wasn't interfering with the results.

Thanks, Davi

fasnicar commented 2 years ago

Thanks Davi.

So, my interpretation here is the following:

Now, this is a bit strange because args.configs_folder is checked and set way earlier than the line that is failing in your case. Can you check whether it could be a problem with the read/write permissions of the config folder? You can do this by editing the PhyloPhlAn code and adding a print of the args.configs_folder variable right before the failing line, so that you know which config folder is being used on your system; then you can verify the permissions of that folder.
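The check described here can also be sketched as a small standalone helper instead of an edit to phylophlan.py. This is a minimal sketch, not PhyloPhlAn code: describe_folder is a hypothetical name, and it only reports whether a path is set, exists, and is readable/writable by the current user (via os.access).

```python
import os


def describe_folder(path):
    """Return a short report on a folder path (hypothetical helper, not part of PhyloPhlAn)."""
    if path is None:
        # A None folder is what makes os.path.join() raise the TypeError in the traceback.
        return "configs folder is None (never set)"
    if not os.path.isdir(path):
        return f"{path!r} does not exist or is not a directory"
    readable = os.access(path, os.R_OK)
    writable = os.access(path, os.W_OK)
    return f"{path!r} exists (readable={readable}, writable={writable})"


if __name__ == "__main__":
    # Example path copied from the error message; adjust it to your installation.
    print(describe_folder("/usr/local/lib/python3.9/site-packages/phylophlan/phylophlan_configs/"))
    print(describe_folder(None))
```

Printing the same report for args.configs_folder right before the failing line would show which of the three cases applies.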

Many thanks, Francesco

Mxrcon commented 2 years ago

Hey, after some testing, I found that I needed to set up the dependencies manually and generate the default config files again. After that, phylophlan started to work without that error, but I'm getting this error now: [e] both db_dna and db_aa are None! I installed the dependencies using a local installation.

Shouldn't the conda or Docker installation contain all these dependencies? I'm not sure what caused the error.

Thanks, Davi.

fasnicar commented 2 years ago

Thanks Davi!

Which dependencies were missing? (They shouldn't be, as they are specified in the package, but I'm curious to understand what the problem could be here.)

About the [e] both db_dna and db_aa are None! error: that's because neither of the two sections is in the config file, whereas at least one of them should be present. How was the config file created?
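One way to check this is to list which of the two sections a config file actually contains. The sketch below assumes the .cfg file is INI-style and readable with Python's configparser; the section names db_dna and db_aa come from the error message, while database_sections is a hypothetical helper name.

```python
import configparser


def database_sections(cfg_path):
    """Return which of the db_dna / db_aa sections are present in a config file."""
    # interpolation=None so that '%' characters in values are never interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    # read() returns the list of files it parsed; an empty list means missing/unreadable.
    if not parser.read(cfg_path):
        raise FileNotFoundError(cfg_path)
    return [name for name in ("db_dna", "db_aa") if parser.has_section(name)]
```

Calling database_sections("supertree_nt.cfg") and getting an empty list back would correspond to the situation described above, where neither section is present.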

Thanks, Francesco

Mxrcon commented 2 years ago

Hey,

Which dependencies were missing?

I was missing Diamond and mafft; I downloaded them manually and it worked!

How was the config file created?

Using the create_default_configs script that comes with phylophlan.

Thanks, Davi

fasnicar commented 2 years ago

Thanks! It's really strange that you were missing those tools, but that explains the error: for some reason the config file was wrongly created without the two sections because Diamond was not found. Now that you have installed the two missing dependencies, can you re-create the default configs and try again? At that point, I think everything should work.
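Before re-creating the configs, it may help to confirm that the tools are visible on PATH. A minimal sketch using shutil.which; missing_tools is a hypothetical helper, and the executable names are assumptions based on the programs named in this thread, so extend the list to match your config.

```python
import shutil


def missing_tools(tools):
    """Return the subset of executable names that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Executable names assumed from this thread (Diamond, mafft); adjust for your setup.
    missing = missing_tools(["diamond", "mafft"])
    print("missing:", ", ".join(missing) if missing else "none")
```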

Thanks, Francesco

ValterAlmeida commented 2 years ago

Hi everyone,

I am facing an issue similar to this one. I have been trying to use PhyloPhlAn 3 on a supercomputer system, and it hasn't worked yet.

The PhyloPhlAn version installed is PhyloPhlAn version 3.0.60 (27 November 2020) and the command that I am trying to run is this one:

phylophlan -i metabat.294 -d /opt/nesi/db/PhyloPhlAn --diversity medium -f configuration_file/supertree_nt.cfg

The error that I receive changes as I try different variations of commands, but, in general, I get this error:

[e] "/opt/nesi/CS400_centos7_bdw/PhyloPhlAn/3.0.2-gimkl-2020a-Python-3.9.9/lib/python3.9/site-packages/phylophlan/phylophlan_configs/" folder does not exists
Traceback (most recent call last):
  File "/opt/nesi/CS400_centos7_bdw/PhyloPhlAn/3.0.2-gimkl-2020a-Python-3.9.9/bin/phylophlan", line 8, in <module>
    sys.exit(phylophlan_main())
  File "/opt/nesi/CS400_centos7_bdw/PhyloPhlAn/3.0.2-gimkl-2020a-Python-3.9.9/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 3226, in phylophlan_main
    db_type, db_dna, db_aa = init_database(args.database, args.databases_folder, args.db_type, configs, 'db_dna', 'db_aa',
  File "/opt/nesi/CS400_centos7_bdw/PhyloPhlAn/3.0.2-gimkl-2020a-Python-3.9.9/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 817, in init_database
    d = Counter([len(set(seq))
  File "/opt/nesi/CS400_centos7_bdw/PhyloPhlAn/3.0.2-gimkl-2020a-Python-3.9.9/lib/python3.9/site-packages/phylophlan/phylophlan.py", line 819, in <listcomp>
    for _, seq in SimpleFastaParser(bz2.open(f, 'rt') if f.endswith('.bz2') else open(f))])
IsADirectoryError: [Errno 21] Is a directory: '/opt/nesi/db/PhyloPhlAn/phylophlan'
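The last frame shows that a path handed to open() was a directory ('/opt/nesi/db/PhyloPhlAn/phylophlan') rather than a FASTA file. The sketch below is not PhyloPhlAn's code; it only illustrates, with hypothetical helper names, how a list of candidate paths could be split into files and directories before opening them, so that a misplaced directory is reported instead of raising IsADirectoryError.

```python
import bz2
import os


def open_fasta(path):
    """Open a FASTA file as text, handling .bz2 the same way as the expression in the traceback."""
    return bz2.open(path, "rt") if path.endswith(".bz2") else open(path)


def split_files_and_dirs(paths):
    """Split paths into (files, directories) so directories are never passed to open()."""
    files = [p for p in paths if os.path.isfile(p)]
    dirs = [p for p in paths if os.path.isdir(p)]
    return files, dirs
```

If split_files_and_dirs returns a non-empty directory list for the database location, that would suggest the -d argument points one level too high or too low; this is an assumption to verify, not a confirmed diagnosis.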

Could you please help me find out what is going on or point me where I can find the solutions? I've tried recreating the configuration files, but the error persists.

I appreciate your time and attention!

Kind regards, Valter

fasnicar commented 2 years ago

Hi Valter, many thanks for reporting this. As there is currently a slightly newer version in the repository, would you be able to pull PhyloPhlAn from the repo and run the latest version? If the problem persists, it will then be much easier for me to debug it. We are also checking and finalizing a couple of other things, and after that we'll package this newer version for Bioconda as well.

Many thanks, Francesco

ValterAlmeida commented 2 years ago

Hi Francesco,

We installed it using the version 3.0.2 link, but when we run the command phylophlan -v, it reports version 3.0.6. I will keep an eye on the PhyloPhlAn GitHub page for upcoming updates.

Thank you so much for your attention and your quick reply. Have a great week ahead!

Kind regards, Valter

fasnicar commented 2 years ago

Hi Valter, yes, that's correct (sorry, it might be a bit confusing). 3.0.2 is the version of the PhyloPhlAn package in Bioconda, which is not the same as the version of the PhyloPhlAn code (indeed, the PhyloPhlAn version in the 3.0.2 Bioconda package is 3.0.60). Now, the latest version of the PhyloPhlAn code is 3.0.64, so if you can pull that code and re-run your command using this latest version, it would be very helpful.

Many thanks, Francesco

HackenDirker commented 1 year ago

I'm running into the same issue as above with the most recent release on bioconda. I've also used the built-in script to download the default config files. The output from the command with the --verbose flag is as follows:

phylophlan -i genomes/ -d phylophlan --diversity low --accurate --nproc 16 --config_file /workspace/home/hackenbd/resources/phylophlan/config_files/supermatrix_nt.cfg --verbose
PhyloPhlAn version 3.0.67 (24 August 2022)

Command line: /workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/phylophlan -i genomes/ -d phylophlan --diversity low --accurate --nproc 16 --config_file /workspace/home/hackenbd/resources/phylophlan/config_files/supermatrix_nt.cfg --verbose

Automatically setting "input=genomes" and "input_folder=/workspace/home/hackenbd/projects/find_amr_gene/data/genome_alignment"
[w] "/workspace/home/hackenbd/miniconda3/envs/phylophlan/lib/python3.10/site-packages/phylophlan/phylophlan_configs/" folder does not exists
"low-accurate" preset
Setting "sort=True" because "database=phylophlan"
Setting "min_num_markers=100" since no value has been specified and the "database=phylophlan"

Arguments: {'input': 'genomes', 'clean': None, 'output': 'genomes_phylophlan', 'database': 'phylophlan', 'db_type': None, 'config_file': '/workspace/home/hackenbd/resources/phylophlan/config_files/supermatrix_nt.cfg', 'diversity': 'low', 'accurate': True, 'fast': False, 'clean_all': False, 'database_list': False, 'submat': 'pfasum60', 'submat_list': False, 'submod_list': False, 'nproc': 16, 'min_num_proteins': 1, 'min_len_protein': 50, 'min_num_markers': 100, 'trim': 'not_variant', 'gap_perc_threshold': 0.67, 'not_variant_threshold': 0.99, 'subsample': None, 'unknown_fraction': 0.3, 'scoring_function': None, 'sort': True, 'remove_fragmentary_entries': False, 'fragmentary_threshold': 0.85, 'min_num_entries': 4, 'maas': None, 'remove_only_gaps_entries': False, 'mutation_rates': False, 'force_nucleotides': False, 'convert_N2gap': False, 'input_folder': '/workspace/home/hackenbd/projects/find_amr_gene/data/genome_alignment/genomes', 'data_folder': 'genomes_phylophlan/tmp', 'databases_folder': 'phylophlan_databases/', 'submat_folder': '/workspace/home/hackenbd/miniconda3/envs/phylophlan/lib/python3.10/site-packages/phylophlan/phylophlan_substitution_matrices/', 'submod_folder': '/workspace/home/hackenbd/miniconda3/envs/phylophlan/lib/python3.10/site-packages/phylophlan/phylophlan_substitution_models/', 'configs_folder': None, 'output_folder': '', 'genome_extension': '.fna', 'proteome_extension': '.faa', 'update': False, 'verbose': True}

Loading configuration file "/workspace/home/hackenbd/resources/phylophlan/config_files/supermatrix_nt.cfg"
Checking configuration file
Checking "/workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/makeblastdb"
Checking "/workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/blastn"
Checking "/workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/mafft"
Checking "/workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/trimal"
Checking "/workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/FastTreeMP"
Checking "/workspace/home/hackenbd/miniconda3/envs/phylophlan/bin/raxmlHPC-PTHREADS-SSE3"
Database folder "phylophlan_databases/phylophlan" present
Database folder "phylophlan_databases/phylophlan" present
[e] both db_dna and db_aa are None!

Are there any fixes that you've been able to find? Let me know if there's anything else you need from me.

Thanks for your time!

All best,

Dirk