bowmanjeffs / paprica

paprica - PAthway PRediction by phylogenetIC plAcement

raxml version needs to be changed in paprica-make_ref.py too #38

Closed blancaverag closed 7 years ago

bowmanjeffs commented 7 years ago

In both cases it looks like you didn't run the command correctly. It should be:

./paprica-run.sh test.bacteria bacteria

...to run the file test.bacteria.fasta against the bacteria database. Let me know if that works, and also let me know if the instructions were incorrect somewhere so I can get that fixed...
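A quick pre-flight check can catch a missing argument or input file before the pipeline starts. This is only a sketch: `check_paprica_args` is a hypothetical helper, not part of paprica, and it assumes the query is passed without its `.fasta` extension and that the file sits in the current directory.

```shell
# Hypothetical pre-flight check; not part of paprica.
# Usage: check_paprica_args <query_stem> <domain>
check_paprica_args() {
    # paprica-run.sh is called with two arguments: query stem and domain.
    if [ "$#" -ne 2 ]; then
        echo "usage: ./paprica-run.sh <query_stem> <domain>" >&2
        return 2
    fi
    # Assumption: the query is given without its .fasta extension,
    # so "test.bacteria" refers to the file test.bacteria.fasta.
    if [ ! -f "$1.fasta" ]; then
        echo "missing input file: $1.fasta" >&2
        return 1
    fi
    return 0
}

# Example: only launch the pipeline when the check passes.
# check_paprica_args test.bacteria bacteria && ./paprica-run.sh test.bacteria bacteria
```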

Jeff

---- BVG wrote ----

I am trying to install paprica. I have all dependencies installed (except for Archaeopteryx). The OS is CentOS 6.8. I have also changed the way RAxML is called in paprica-place_it.py, as the AVX2 version would not build. I am encountering this error:

./paprica-run.sh
Traceback (most recent call last):
  File "/home/programas/paprica/paprica-place_it.py", line 319, in <module>
    place(cwd + query, ref, ref_dir_domain, cm)
  File "/home/programas/paprica/paprica-place_it.py", line 183, in place
    clean_name(query)
  File "/home/programas/paprica/paprica-place_it.py", line 139, in clean_name
    for record in SeqIO.parse(file_name + '.fasta', 'fasta'):
  File "/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-linux-x86_64.egg/Bio/SeqIO/__init__.py", line 572, in parse
    with as_handle(handle, mode) as fp:
  File "/usr/local/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/local/lib/python2.7/site-packages/biopython-1.65-py2.7-linux-x86_64.egg/Bio/File.py", line 90, in as_handle
    with open(handleish, mode, **kwargs) as fp:
IOError: [Errno 2] No such file or directory: '/home/programas/paprica/-ref.fasta'
Traceback (most recent call last):
  File "/home/programas/paprica/paprica-tally_pathways.py", line 118, in <module>
    genome_data = pd.DataFrame.from_csv(ref_dir_domain + 'genome_data.final.csv', header = 0, index_col = 0)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 1027, in from_csv
    infer_datetime_format=infer_datetime_format)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 465, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 241, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 557, in __init__
    self._make_engine(self.engine)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 694, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 1061, in __init__
    self._reader = parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 350, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3143)
  File "pandas/parser.pyx", line 583, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:5765)
IOError: File /home/programas/paprica/ref_genome_database//genome_data.final.csv does not exist

I have also tried running it in VirtualBox, but a very similar error occurs:

demo@paprica-demo:~/paprica$ ./paprica-run.sh
Traceback (most recent call last):
  File "/home/demo/paprica/paprica-place_it.py", line 318, in <module>
    place(cwd + query, ref, ref_dir_domain, cm)
  File "/home/demo/paprica/paprica-place_it.py", line 182, in place
    clean_name(query)
  File "/home/demo/paprica/paprica-place_it.py", line 138, in clean_name
    for record in SeqIO.parse(file_name + '.fasta', 'fasta'):
  File "/home/demo/anaconda2/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 583, in parse
    with as_handle(handle, mode) as fp:
  File "/home/demo/anaconda2/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/demo/anaconda2/lib/python2.7/site-packages/Bio/File.py", line 90, in as_handle
    with open(handleish, mode, **kwargs) as fp:
IOError: [Errno 2] No such file or directory: '/home/demo/paprica/-ref.fasta'
Traceback (most recent call last):
  File "/home/demo/paprica/paprica-tally_pathways.py", line 118, in <module>
    genome_data = pd.DataFrame.from_csv(ref_dir_domain + 'genome_data.final.csv', header = 0, index_col = 0)
  File "/home/demo/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 1177, in from_csv
    infer_datetime_format=infer_datetime_format)
  File "/home/demo/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 498, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/demo/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 275, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/demo/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 590, in __init__
    self._make_engine(self.engine)
  File "/home/demo/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 731, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/demo/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 1103, in __init__
    self._reader = parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 353, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3246)
  File "pandas/parser.pyx", line 591, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:6111)
IOError: File /home/demo/paprica/ref_genome_database//genome_data.final.csv does not exist

What am I missing? Thanks!

blancaverag commented 7 years ago

Thank you. I realized my mistake, went on to run the rest of the installation test, and have been updating the issue with the new errors.

As suggested in the manual, and because I could not install the appropriate RAxML version, I changed the RAxML version in paprica-place_it.py. That was not enough, however: paprica-make_ref.py also calls RAxML.
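One way to find every script that calls RAxML is to search the paprica directory for the binary name. The following is a minimal sketch, assuming the scripts sit in one directory and the binary name contains `raxmlHPC` (as in the build output below); `find_raxml_calls` is a hypothetical helper, not part of paprica.

```shell
# Hypothetical helper: list every line in a directory's Python scripts
# that mentions a RAxML binary, as file:line:text.
find_raxml_calls() {
    # -i: match raxmlHPC in any capitalization; -n: show line numbers.
    # The extra /dev/null argument makes grep print file names even
    # when the glob expands to a single script.
    grep -n -i 'raxmlhpc' "$1"/*.py /dev/null
}

# Example, run from the paprica directory:
# find_raxml_calls .
```

Each reported line is a place where the binary name may need to match the RAxML build that is actually installed.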

I am now running into this other error:

paprica-build.sh bacteria
...
raxmlHPC -T 2 -m GTRGAMMA -f J -p 12345 -t /home/programas/paprica/ref_genome_database/bacteria/RAxML_rootedTree.root.ref.tre -n conf.root.ref.tre -s /home/programas/paprica/ref_genome_database/bacteria/combined_16S.bacteria.tax.clean.align.fasta -w /home/programas/paprica/ref_genome_database/bacteria/

Time after model optimization: 0.151178 Initial Likelihood -6021.645307

NNI interchanges 0 Likelihood -6021.645165

Final Likelihood of NNI-optimized tree: -6021.645165

RAxML NNI-optimized tree written to file: /home/programas/paprica/ref_genome_database/bacteria/RAxML_fastTree.conf.root.ref.tre

Same tree with SH-like supports written to file: /home/programas/paprica/ref_genome_database/bacteria/RAxML_fastTreeSH_Support.conf.root.ref.tre

Total execution time: 0.234270
Traceback (most recent call last):
  File "/usr/local/bin/taxit", line 4, in <module>
    import pkg_resources
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 2805, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 696, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 594, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: xlrd
cp: cannot create regular file '/home/programas/paprica/ref_genome_database/bacteria/combined_16S.bacteria.tax.refpkg/combined_16S.bacteria.tax.clean.align.sto': No such file or directory
cmalign :: align sequences to a CM
INFERNAL 1.1.2 (July 2016)
...

xlrd is installed at /usr/local/bin/anaconda.

bowmanjeffs commented 7 years ago

It looks like you're running into a problem with taxit. The error suggests to me that taxit is running with your system Python and not Anaconda. That shouldn't be a problem so long as taxit installed correctly; however, it appears that it did not. Try reinstalling taxit, possibly into the Anaconda distribution.
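To confirm which interpreter taxit starts with, one option is to read the shebang line of the installed script and then ask that interpreter whether it can import xlrd. This is a sketch only: `script_interpreter` is a hypothetical helper, and the interpreter path in the last comment is a placeholder to replace with whatever the helper prints.

```shell
# Hypothetical helper: print the interpreter named on a script's shebang line.
# Prints nothing if the first line is not a shebang.
script_interpreter() {
    # Read only the first line and strip the leading "#!".
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*//p'
}

# Which interpreter does taxit start with?
# script_interpreter "$(command -v taxit)"

# Can that interpreter import xlrd? (placeholder path; use the one printed above)
# /usr/bin/python -c 'import xlrd; print(xlrd.__file__)'
```

If the import fails under the interpreter taxit uses, then xlrd is installed only for a different Python, which would match the DistributionNotFound error above.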