faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

help with phyluce_genetrees_run_raxml_genetrees #73

RominaSSBatista closed this issue 7 years ago

RominaSSBatista commented 7 years ago

I am trying to use phyluce_genetrees_run_raxml_genetrees, but I keep getting errors. If I use:

phyluce_genetrees_run_raxml_genetrees \
    --input /../../../../../../mafft-phylip-min-25-taxa-complete-PIS/ \
    --tree-searches 20 \
    --output /../../../../../../raxml/ \
    --outgroup U1097 \
    --cores 1 \
    --threads 1

I will get this error:

Traceback (most recent call last):
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 186, in <module>
    main()
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 176, in main
    trees = map(run_raxml, work)
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 136, in run_raxml
    seconds = time.search(stdout).groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'


If I use:

phyluce_genetrees_run_raxml_genetrees \
    --input /../../../../../../mafft-phylip-min-25-taxa-complete-PIS/ \
    --tree-searches 20 \
    --output /../../../../../../raxml/ \
    --outgroup U1097 \
    --cores 12 \
    --threads 1

I will get this error:

Traceback (most recent call last):
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 186, in <module>
    main()
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 161, in main
    args.threads)
EOFError: EOF when reading a line


I would appreciate any help. Thanks!

carloliveros commented 7 years ago

Hi Romina,

There is usually no need to use the multi-threaded version of RAxML when inferring gene trees, so use --threads 1. To take advantage of multiprocessing, set --cores to more than 1; the more cores you use, the faster the script will complete. You could also use "--quiet" to suppress the CPU usage question.

It looks to me like RAxML is not completing or not running at all. Make sure that RAxML is installed correctly on your system, i.e., if you type "raxmlHPC-SSE3 -h" on the command line, it should run and display the command line options. Also make sure that a RAxML command like the one below runs successfully:

raxmlHPC-SSE3 -m GTRGAMMA -n best -s one.of.your.alignment.files -N 20 -p 12345 -w your.directory --no-bfgs

If the RAxML code above does not work, then you may need to tweak the code under the function get_basic_raxml.

Cheers Carl
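
The AttributeError in the traceback above is what Python raises when a regular-expression search finds no match: search() returns None, and calling .groups() on None fails. A minimal sketch of a more defensive parse follows. The pattern and the function name parse_search_time are hypothetical and not taken from phyluce; the pattern targets the "Inference[0]: Time ..." line that RAxML prints in a run reported later in this thread.

```python
import re

# Hypothetical pattern; phyluce's own regular expression may differ.
TIME_PATTERN = re.compile(r"Inference\[0\]: Time (\d+\.\d+)")


def parse_search_time(stdout):
    """Return the reported search time in seconds, or None if it is absent."""
    match = TIME_PATTERN.search(stdout)
    if match is None:
        # Empty or unexpected output: RAxML probably did not run to completion.
        return None
    return float(match.groups()[0])


if __name__ == "__main__":
    good = "Inference[0]: Time 40.566578 GAMMA-based likelihood -4261.957998"
    print(parse_search_time(good))
    print(parse_search_time(""))
```

Returning None (or raising an error that includes RAxML's stderr) makes it clear that the underlying problem is in the RAxML call rather than in the parsing step.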

RominaSSBatista commented 7 years ago

Hi @carloliveros, thanks for your reply. I am used to running RAxML, and everything seems to be fine with the program. I changed the command line by adding --quiet as you advised. I still get the following error:

Traceback (most recent call last):
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 186, in <module>
    main()
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 174, in main
    trees = pool.map(run_raxml, work)
  File "/usr/local/packages/anaconda2/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/packages/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
AttributeError: 'NoneType' object has no attribute 'groups'

I did get some output folders, but unfortunately they were empty. I am checking whether I am missing any module or program, since I am submitting this script as a job in a queue. I will let you know, thanks. Best, Romina B.

brantfaircloth commented 7 years ago

it may be that the submission to the queue is causing the problems. it looks like RAxML is not running... can you run this on a local computer?

carloliveros commented 7 years ago

Romina,

It sounds like you have access to an HPC environment. If so, also consider using the GNU parallel command if it is available on your system. It provides a very efficient way to run several processes simultaneously with different input files, which is essentially what this PHYLUCE module does (running several instances of RAxML).

Cheers Carl
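
The idea of running several independent processes at once can also be sketched in Python. This is a stand-in for the general pattern, not GNU parallel itself and not phyluce's implementation; the names run_command and run_all are hypothetical, and the placeholder jobs just call the Python interpreter so the sketch runs without RAxML installed.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(cmd):
    """Run one command and return (cmd, exit code, stdout, stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return cmd, proc.returncode, proc.stdout, proc.stderr


def run_all(commands, workers=4):
    """Run commands concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Placeholder jobs; in practice each entry would be one RAxML command.
    jobs = [[sys.executable, "-c", "print({})".format(i)] for i in range(3)]
    for cmd, code, out, err in run_all(jobs, workers=2):
        print(code, out.strip())
```

Threads are sufficient here because the heavy work happens in the child processes. Keeping the exit code and stderr of each job makes a failing RAxML call visible instead of surfacing later as a parsing error.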

RominaSSBatista commented 7 years ago

Hi @brantfaircloth and @carloliveros, thanks for your help.

1. I've tried using my local machine, and I got the same error:

2017-06-21 10:29:27,321 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --cores: 3
2017-06-21 10:29:27,321 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --input: /home/haffer/Documents/Romina/PIS_draft/mafft-phylip-min-25-taxa-PIS
2017-06-21 10:29:27,321 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --log_path: None
2017-06-21 10:29:27,322 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --outgroup: U1097
2017-06-21 10:29:27,322 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --output: /home/haffer/Documents/Romina/PIS_draft/raxml2
2017-06-21 10:29:27,322 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --quiet: True
2017-06-21 10:29:27,322 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --threads: 1
2017-06-21 10:29:27,322 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --tree_searches: 20
2017-06-21 10:29:27,322 - phyluce_genetrees_run_raxml_genetrees - INFO - Argument --verbosity: INFO
2017-06-21 10:29:27,328 - phyluce_genetrees_run_raxml_genetrees - INFO - 483 alignments read
Traceback (most recent call last):
  File "/home/haffer/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 186, in <module>
    main()
  File "/home/haffer/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 174, in main
    trees = pool.map(run_raxml, work)
  File "/home/haffer/anaconda2/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/home/haffer/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
AttributeError: 'NoneType' object has no attribute 'groups'

I am sure I have RAxML installed on my local machine. If I ask for help, e.g. with raxmlHPC-SSE3 --help, it displays the message fine. Do you think it could be the Python version? Thanks.

carloliveros commented 7 years ago

Romina,

Although you got the same error message, it was encountered in a different part of the code, and was likely caused by something else. Can you try specifying the full path in your input and output folders on your HPC machine? Input paths that start with "/../../../" will not point where you intend: because of the leading slash they are resolved from the root directory, where ".." just refers to the root itself.

If that still gives you an error, try executing a RAxML run with the full command as I explained above. Sometimes, even if RAxML is correctly installed, it does not like some of the arguments (e.g., "--no-bfgs" in older versions).

Cheers Carl
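
To see how a path beginning with "/../../../" is interpreted on a POSIX system, a short check with Python's standard posixpath module can help. This is only an illustration; the directory name "alignments" is a hypothetical placeholder.

```python
import posixpath


def resolve(path):
    """Collapse '.' and '..' components using POSIX rules, without touching the disk."""
    return posixpath.normpath(path)


if __name__ == "__main__":
    # With a leading slash, every '..' is resolved against the root directory.
    print(resolve("/../../../alignments"))  # -> /alignments
    # Without it, the '..' components are kept relative to the working directory.
    print(resolve("../../alignments"))      # -> ../../alignments
```

Checking the resolved path before launching a long job is a cheap way to rule out a wrong input or output directory.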

RominaSSBatista commented 7 years ago

Hi @carloliveros, I used "/../../../" only here, to keep the post clean. The "real" script looks like this:

#$ -cwd
#$ -q node0
#$ -pe mpich 12
#$ -S /bin/bash

module load Anaconda2/v2.5.0
module load Python/v.2.7.11
module load Phyluce/v.X.X
module load RaxML/v8.2.9

/usr/local/packages/anaconda2/bin/python2.7 \
    /usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees \
    --input /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/mafft-phylip-min-25-taxa-PIS/ \
    --tree-searches 20 \
    --output /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/raxml_phyluce/ \
    --outgroup U1097 \
    --quiet \
    --log-path /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/log \
    --cores 12 \
    --threads 1


For the script specified above I got the following error:


Traceback (most recent call last):
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 186, in <module>
    main()
  File "/usr/local/packages/anaconda2/bin/phyluce_genetrees_run_raxml_genetrees", line 174, in main
    trees = pool.map(run_raxml, work)
  File "/usr/local/packages/anaconda2/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/packages/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
AttributeError: 'NoneType' object has no attribute 'groups'


When I tried the command line as you advised, I got this:


$ module load RaxML/v8.2.9
$ raxmlHPC-SSE3 -m GTRGAMMA -n best -s /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/mafft-phylip-min-25-taxa-PIS/uce-991.phylip -N 20 -p 12345 -w /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/test_carloliveros --no-bfgs
: illegal option -- -


There seems to be an issue with --no-bfgs.

In the end I tried -no-bfgs (a single leading dash instead of two).

Finally it worked:

RAxML was called as follows:

raxmlHPC-SSE3 -m GTRGAMMA -n best -s /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/mafft-phylip-min-25-taxa-PIS/uce-991.phylip -N 20 -p 12345 -w /state/partition4/romina/contigs_ABC/mapped_ABC_4/taxon-set4/test_carloliveros -no-bfgs

Partition: 0 with name: No Name Provided
Base frequencies: 0.314 0.165 0.190 0.331

Inference[0]: Time 40.566578 GAMMA-based likelihood -4261.957998, best rearrangement setting 10


But I would like to have it running for all alignments.
Thanks a lot Carl.

carloliveros commented 7 years ago

Romina,

What you want to do now to get this working for all alignments is edit phyluce_genetrees_run_raxml_genetrees to change "--no-bfgs" to "-no-bfgs".

Cheers Carl
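
One way to make this kind of fix easier is to keep the flag spelling in a single parameter when assembling the RAxML command. The sketch below is hypothetical and is not the body of phyluce's get_basic_raxml function; the name build_raxml_cmd and its parameters are illustrative, and the arguments simply mirror the command that worked earlier in this thread.

```python
def build_raxml_cmd(alignment, workdir, searches=20, seed=12345,
                    binary="raxmlHPC-SSE3", no_bfgs_flag="-no-bfgs"):
    """Assemble a RAxML command as an argument list.

    no_bfgs_flag is a parameter because, as seen in this thread, some
    RAxML builds reject "--no-bfgs" but accept "-no-bfgs".
    """
    return [
        binary,
        "-m", "GTRGAMMA",
        "-n", "best",
        "-s", alignment,
        "-N", str(searches),
        "-p", str(seed),
        "-w", workdir,
        no_bfgs_flag,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_raxml_cmd("uce-991.phylip", "/tmp/raxml-test")))
```

Building the command as a list also avoids shell-quoting problems when it is passed to subprocess.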

RominaSSBatista commented 7 years ago

Carl,

I changed the script, replacing "--no-bfgs" with "-no-bfgs".

Now everything is OK! Thanks a lot for helping me to solve it. All the best, Romina B.