Chris7 / pyquant

Platform independent command line tool for analysis of mass spectrometry data.
https://chris7.github.io/pyquant/
MIT License
ImportError: No module named cpeaks #22

Closed Vadimvs7 closed 6 years ago

Vadimvs7 commented 7 years ago

Hi, I have a few problems running PyQuant. First, on macOS Sierra with Python 2.7: I installed PyQuant with pip together with the other required packages. Here is the error I get when I try to run pyQuant --help:

Traceback (most recent call last):
  File "/usr/local/bin/pyQuant", line 7, in <module>
    from pyquant.command_line import run_pyquant
  File "/usr/local/lib/python2.7/site-packages/pyquant/command_line.py", line 31, in <module>
    from .worker import Worker
  File "/usr/local/lib/python2.7/site-packages/pyquant/worker.py", line 35, in <module>
    from . import peaks
  File "/usr/local/lib/python2.7/site-packages/pyquant/peaks.py", line 21, in <module>
    from pyquant.cpeaks import bigauss_func, gauss_func, bigauss_ndim, gauss_ndim, bigauss_jac,\
ImportError: No module named cpeaks

It did not install properly. So I moved to a Windows 10 PC and tried PyQuant in Anaconda and in Docker. Installation succeeded in both, and pyQuant --help works. However, when I try to analyse my data with the following command (adapted from the SILAC example at https://chris7.github.io/pyquant/):

docker run -v E:\data\MAXQUANT_RESULTS\SILAC1\combined1\txt:/txt chrismit7/pyquant --search-file /txt/evidence.txt -o /txt/pyquant_results --label-method K6R6 --maxquant

I get this error message:

msparser not found, Mascot DAT files unable to be parsed
Loading Scans:
Traceback (most recent call last):
  File "/usr/local/bin/pyQuant", line 11, in <module>
    load_entry_point('pyquant-ms==0.1.43rc2', 'console_scripts', 'pyQuant')()
  File "/usr/local/lib/python2.7/dist-packages/pyquant/command_line.py", line 149, in run_pyquant
    results = GuessIterator(args.search_file.name, full=True, store=False, peptide=peptides)
  File "/usr/local/lib/python2.7/dist-packages/pythomics/proteomics/parsers.py", line 155, in __init__
    self.parser = self.parser(*args, **kwargs)
AttributeError: 'GuessIterator' object has no attribute 'parser'

Most probably the command line is missing some of the required arguments. In the PyQuant paper the authors used the ms_ms text file from MaxQuant as input. I tried that as well:

docker run -v E:\data\MAXQUANT_RESULTS\SILAC1\combined1\txt:/txt chrismit7/pyquant --search-file /txt/ms_ms.txt -o /txt/pyquant_results --label-method K6R6 --maxquant

This gives the same error message as above.

At this point I thought that my evidence.txt file was somehow not in the right format (it was generated with MaxQuant 1.5.2.8), so I tried the E. coli data from the pyquant.zip file:

docker run -v E:\data\pyquant_example:/pyquant_example chrismit7/pyquant --search-file /pyquant_example/ecoli_mq_124_evidence.txt -o /pyquant_example/pyquant_results.txt --label-method K4K8R6R10 --maxquant

This gives the same error as above.

I am perhaps doing something badly wrong, so please pardon my lack of experience and knowledge. Any help is very much appreciated! Vadim

Chris7 commented 7 years ago

Hi @Vadimvs7,

Thanks for the detailed write up! For OSX, I think the install is failing because Cython is not installed prior to installing PyQuant. I'll update the readme to provide that information.
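A quick way to see which pieces are missing before reinstalling is to ask the interpreter directly. The following is a minimal sketch, assuming a Python 3 interpreter (the thread itself uses Python 2.7); the helper name missing_modules is made up for illustration and is not part of PyQuant.

```python
import importlib.util


def missing_modules(names):
    """Return the entries of `names` that this interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when the parent package of a dotted name is absent or broken.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Build-time requirements first, then the compiled extension from the traceback.
    for name in missing_modules(["Cython", "numpy", "pyquant.cpeaks"]):
        print("not importable:", name)
```

If Cython or numpy is reported, installing it first and then reinstalling PyQuant is the order suggested above. If only pyquant.cpeaks is reported, the extension was probably not compiled during installation.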

For your general problem though, I have a question and a possible reason this isn't working:

  1. Are you providing raw data, and where is it stored? If the raw files are not in the same directory as the search results, you need to indicate where they are with --scan-file or --scan-file-dir.
  2. For MaxQuant, the search results are passed in using the --tsv flag, since the output is not actually a search file like other formats. I think this is misleading and will update PyQuant to look for the --search-file parameter if --maxquant is provided. I also notice this is just flat out wrong in the documentation. I'm very sorry for that and am fixing it now (a new version with this fixed should be on the horizon soon, but probably not for a few weeks).

I hope that helps, let me know if you have any other issues.

Vadimvs7 commented 7 years ago

Dear Chris,

Thanks for your rapid reply and for the info.

I mentioned the Mac problem mainly for your information, since other Mac users may hit the same issue. Cython appears to have been installed by pip together with the other packages, so I do think it might be an OS X issue. In case it helps, here is what we found. I reinstalled Python with brew, made sure brew was working correctly, and confirmed that the brew Python was used instead of the OS X version. It does seem that the cpeaks module was not compiled as it should have been. I do have the cpeaks.c file, and it says it was generated with Cython 0.25.2, so Cython presumably works. I tried to force a rebuild of the missing module by running export PYQUANT_DEV=True. When I then re-ran pyQuant, it complained that it could not find arrayobject.h. I do have that file in /usr/local/lib/python2.7/site-packages/numpy/core/include/numpy, but the compiler probably expects it somewhere like /usr/include/numpy. I stopped at this point because, as of El Capitan, even root cannot write to /usr/include without extra effort. Providing the right path to arrayobject.h and ufuncobject.h in cpeaks.c could probably fix this, or at least lead to a different error.
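For locating the NumPy headers without copying anything into /usr/include, NumPy exposes its include directory through numpy.get_include(). The sketch below is only an illustration: the helper find_numpy_headers is hypothetical, and whether the PYQUANT_DEV rebuild honours CFLAGS is an assumption that has not been verified here.

```python
import os


def find_numpy_headers(include_dir, headers=("arrayobject.h", "ufuncobject.h")):
    """Report, per header, whether it exists under <include_dir>/numpy."""
    return {
        header: os.path.isfile(os.path.join(include_dir, "numpy", header))
        for header in headers
    }


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("numpy is not installed for this interpreter")
    else:
        include_dir = numpy.get_include()
        print("NumPy include directory:", include_dir)
        print(find_numpy_headers(include_dir))
        # Assumption: the rebuild reads CFLAGS from the environment. If it does,
        # exporting this line first may let the compiler find the headers.
        print('export CFLAGS="-I{}"'.format(include_dir))
```

Pointing the compiler at this directory is usually less invasive than editing the generated cpeaks.c, which would be overwritten the next time Cython regenerates it.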

But, again, I gave up trying to make it work on OS X and went with Docker instead, so back to the main problem.

  1. I did not provide the raw data. I was under the impression that PyQuant would use whatever is available in the .txt output of MaxQuant for quantification. Perhaps I got it wrong, but I thought that PyQuant could either start from raw data or use the output of MaxQuant. If we still need the raw data, what is the benefit of using the MaxQuant output in addition? I could of course provide the raw data, but it would be in the proprietary Thermo format rather than .mzML. Since MaxQuant is able to read Thermo .raw files, I never bothered to convert them. Would PyQuant understand .raw files? Looking at the .py files of pyquant and pythomics, it probably won't. Is this correct? If so, I would have to convert the files to .mzML and modify the evidence table (substitute mzml for raw in the table). And if I have to convert from .raw to .mzML, can I do it with msconvert from the ProteoWizard tools using 64-bit encoding for the mass values? (Apparently 32-bit encoding gives a bigger .mzML file size.)
  2. Actually, after writing my original post, I looked at the .py files and thought that the --tsv flag was likely required. But I did not quite follow your explanation, because I did provide the --search-file flag with evidence.txt in addition to the --maxquant flag.

So the right way would be something like this then (provided that the raw files referenced in evidence.txt are in the same /txt directory)?

docker run -v E:\data\MAXQUANT_RESULTS\SILAC1\combined1\txt:/txt chrismit7/pyquant --tsv --search-file /txt/evidence.txt -o /txt/pyquant_results --label-method K6R6 --maxquant

Also, if that is not a big ask, what would be the flags to try a PyQuant analysis of raw SILAC data without any MaxQuant output (I have a 1:1 mix of Light and K6R6)? My raw data was generated with an Orbitrap instrument similar to the one used in the PyQuant paper (we use an Orbitrap Fusion Lumos, but the flags must be very similar to the ones used to analyse data from an LTQ-Orbitrap Elite).

I would cautiously guess that pyquant --scan-file-dir --label-method K6R6 should do it?

Thanks a lot once again for your time!

Vadim

Chris7 commented 7 years ago

Hi @Vadimvs7,

You want to provide the search file as the --tsv file, so:

docker run -v E:\data\MAXQUANT_RESULTS\SILAC1\combined1\txt:/txt chrismit7/pyquant --tsv /txt/evidence.txt -o /txt/pyquant_results --label-method K6R6 --maxquant

The reason to use MaxQuant is that it runs Andromeda to match spectra with peptides. If I were able to search solely with Andromeda, I would support just that output so it could be parsed easily by PyQuant. The idea is that you have raw data and an annotation on top of it (peptides in this case) that provides context that can be used for some insights. The reason there is a generic --tsv input is this: suppose there is a new search engine that PyQuant does not natively support. You do not need to wait for PyQuant to bake support in. You can simply create a tsv file with the needed information and run PyQuant with that new search engine.
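When the same invocation has to be repeated for several result folders, it can help to assemble the argument list in one place. This is a rough sketch that uses only the flags mentioned in this thread; the function name build_pyquant_docker_command is hypothetical, and it assumes --scan-file-dir takes a directory path as its value.

```python
def build_pyquant_docker_command(host_dir, tsv_name, out_name, label_method,
                                 mount_point="/txt", scan_file_dir=None):
    """Assemble the docker invocation as an argument list for subprocess.run."""
    cmd = [
        "docker", "run",
        # Mount the host folder inside the container at mount_point.
        "-v", "{}:{}".format(host_dir, mount_point),
        "chrismit7/pyquant",
        # MaxQuant output goes in through --tsv, not --search-file.
        "--tsv", "{}/{}".format(mount_point, tsv_name),
        "-o", "{}/{}".format(mount_point, out_name),
        "--label-method", label_method,
        "--maxquant",
    ]
    if scan_file_dir is not None:
        # Assumption: the flag accepts a directory path inside the container.
        cmd += ["--scan-file-dir", scan_file_dir]
    return cmd


if __name__ == "__main__":
    command = build_pyquant_docker_command(
        r"E:\data\MAXQUANT_RESULTS\SILAC1\combined1\txt",
        "evidence.txt", "pyquant_results", "K6R6",
    )
    print(" ".join(command))
```

Passing the list to subprocess.run avoids shell quoting problems with Windows paths that contain backslashes.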

With that input, it will automatically handle file names that end with .raw and look for the corresponding mzML file. The downside, as you said, is that you do need to convert to mzML files. This is a problem with Thermo, in that they provide paltry support for parsing their RAW files on Windows and absolutely no support for other operating systems.
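Before starting a long run, it may be worth checking that every raw file named in evidence.txt has a converted mzML file next to it. The sketch below is not PyQuant's own lookup logic. It assumes the MaxQuant evidence table has a tab-separated "Raw file" column, and the helper names and the case-insensitive matching are choices made for this illustration.

```python
import csv
import os


def raw_files_in_evidence(evidence_path, column="Raw file"):
    """Collect the distinct raw-file names listed in a tab-separated evidence table."""
    with open(evidence_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return sorted({row[column] for row in reader if row.get(column)})


def missing_mzml(raw_names, scan_dir):
    """Return the raw-file names that have no matching .mzML file in scan_dir."""
    present = {
        os.path.splitext(name)[0].lower()
        for name in os.listdir(scan_dir)
        if name.lower().endswith(".mzml")
    }
    missing = []
    for raw_name in raw_names:
        # Accept names written with or without a trailing .raw extension.
        stem = raw_name[:-4] if raw_name.lower().endswith(".raw") else raw_name
        if stem.lower() not in present:
            missing.append(raw_name)
    return missing
```

A typical use would be missing_mzml(raw_files_in_evidence("evidence.txt"), "."), converting whatever is reported with msconvert before running PyQuant.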

Chris

Chris7 commented 6 years ago

I assume this issue is resolved @Vadimvs7, let me know otherwise!