pymzml / pymzML

pymzML - an interface between Python and mzML Mass spectrometry Files
https://pymzml.readthedocs.io/en/latest/
MIT License

Error when working with ESI-ISO spectra #121

Closed mukils15 closed 5 years ago

mukils15 commented 5 years ago

The code I have written to average spectra works perfectly fine for the MS1 file that I have, but runs into an error for the MS2-ESI-ISO file. This is the error:

Traceback (most recent call last):
  File "C:\Users\Mukil\Documents\Python Scripts\average_multiprocessing_attempts.py", line 62, in
    p.map(getSpectralAverageAndWriteToFile, [mzml_file1, mzml_file2])
  File "C:\Users\Mukil\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\Mukil\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
KeyError: 3

Does anybody know if this is because of a difference in spectrum structure for the MS2 file? Or perhaps know what exactly this error is saying and how to get around it?

MKoesters commented 5 years ago

Hi and thanks for reporting this,

From this traceback, I can't tell you what's happening, since the actual pymzML functionality is probably implemented in your getSpectralAverageAndWriteToFile function. Could you post that code (if possible) or tell me which pymzML function calls you are using? Also, could you supply one spectrum XML element from your MS2 file? The binary data can be stripped out if you do not want to disclose your data. Sadly, I have a lot on my plate currently and I'm on the road a lot, but I hope I can have a look at this on the weekend or at the beginning of next week.

Best, Manuel

mukils15 commented 5 years ago

Hi, thank you! Here is the code:

msrun = pymzml.run.Reader(filepath, MS1_Precision=5e-6, MSn_Precision=20e-6)

# Get length of any one spectrum
spec_length = len(msrun[2].mz)
print(spec_length)

# initialize list to hold averages
total_i = [0] * spec_length

# initialize var to count number of spectra in file
numspectra = 0

# compute number of spectra in file and store total intensity at each point in single list
msrun2 = pymzml.run.Reader(filepath, MS1_Precision=5e-6, MSn_Precision=20e-6)
for spectrum in msrun2:
    numspectra += 1
    x = 0
    while x < spec_length:
        total_i[x] = total_i[x] + spectrum.i[x]
        x += 1

# debug print
print(numspectra)

# compute average intensity at each point across all spectra
avg_i = [item / numspectra for item in total_i]
mz_i_tuples = list(zip(spectrum.mz, avg_i))
# print (mz_i_tuples)
plt.plot(spectrum.mz, avg_i)
plt.xlabel("m/z")
plt.ylabel('Intensity')

# write spectra to another file

plt.savefig(filepath + ' .png')

mukils15 commented 5 years ago

And here is a spectrum with the binary data stripped out:

oneSpectra.txt

Thank you in advance!
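
The averaging loop in the code posted earlier in this thread relies on every spectrum having the same number of points as the first one it measured. A minimal sketch of the same per-index average with that assumption checked explicitly is shown below. It works on plain lists of intensities rather than pymzML spectrum objects, and `average_intensities` is a hypothetical helper name, not part of pymzML.

```python
def average_intensities(intensity_arrays):
    """Average intensities index by index across spectra.

    Assumes all spectra share the same m/z grid, so index k refers to
    the same m/z value in every spectrum (the posted code assumes this too).
    """
    total = None
    count = 0
    for intensities in intensity_arrays:
        if total is None:
            # Size the accumulator from the first spectrum seen.
            total = [0.0] * len(intensities)
        elif len(intensities) != len(total):
            raise ValueError(
                f"spectrum {count} has {len(intensities)} points, "
                f"expected {len(total)}"
            )
        for idx, value in enumerate(intensities):
            total[idx] += value
        count += 1
    if count == 0:
        raise ValueError("no spectra to average")
    return [value / count for value in total]


# Toy data standing in for the intensity arrays of two spectra.
averaged = average_intensities([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(averaged)
```

Raising on a length mismatch makes a differently shaped spectrum visible immediately instead of surfacing later as an index error or a silently truncated average.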

MKoesters commented 5 years ago

Hi, I had a quick look at your spectrum and it seems to be an MS3 scan, not MS2. Can you try initializing the reader as follows:

msrun = pymzml.run.Reader(
    filepath,
    MS1_Precision=5e-6,
    MSn_Precision=20e-6,
    MS_precisions={3: 20e-6}
)

Alternatively:

msrun = pymzml.run.Reader(
    filepath,
    MS_precisions={1: 5e-6, 2: 20e-6, 3: 20e-6}
)

Instead of 20e-6 in the dict, just use your MS3 precision.

I hope this fixes it. If you can confirm it works, I'll push a fix that makes your approach work asap.

Best, Manuel

mukils15 commented 5 years ago

Hi, your fix worked perfectly, thank you so much! What exactly does initializing the reader in that way do, compared with what I had originally?

Sincerely, Mukil Shanmugam

MKoesters commented 5 years ago

MS_precisions is a dict in which each key is an MS level and each value is the precision for that level. This way, you can set a custom precision for every MS level you like. The precision is required, for example, for transforming the m/z values and for finding peaks.
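
One plausible reading of the `KeyError: 3` in the traceback is a lookup of MS level 3 in a table that only holds entries for levels 1 and 2. The sketch below reproduces that behaviour with an ordinary dict. It is a hypothetical stand-in, not pymzML's internal code, and `precision_for` is an invented helper name.

```python
# Hypothetical per-level precision table with only MS1 and MS2 registered.
# This mimics the situation described in the thread; it is not pymzML code.
precisions = {1: 5e-6, 2: 20e-6}


def precision_for(ms_level, table):
    """Return the precision registered for ms_level (KeyError if absent)."""
    return table[ms_level]


missing_level = None
try:
    precision_for(3, precisions)
except KeyError as err:
    # The missing key is carried in the exception arguments.
    missing_level = err.args[0]
    print(f"KeyError: {missing_level}")

# Registering MS level 3 explicitly makes the same lookup succeed.
precisions[3] = 20e-6
ms3_precision = precision_for(3, precisions)
```

Passing `MS_precisions={3: 20e-6}` to the reader, as suggested earlier in the thread, plays the role of the explicit registration in the last two lines.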

Setting MSn_Precision was buggy, since it only set the MS2 precision and not the MS3 precision that your file needs. However, I'll merge my pull request so that setting MSn_Precision also sets the MS3 precision. Your first approach should then work as well; it would be nice if you could confirm that.

Best, Manuel
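
The behaviour described in the last comment, where the MSn precision also covers MS3, could be sketched as a fallback lookup like the one below. This is only an illustration under stated assumptions: `resolve_precision` and its parameters are hypothetical names, and the pull request merged into pymzML may be implemented differently.

```python
def resolve_precision(ms_level, ms1_precision, msn_precision, overrides=None):
    """Pick a precision for ms_level.

    Order of preference (an assumption made for this sketch):
    1. an explicit per-level override,
    2. the MS1 precision for level 1,
    3. the shared MSn precision for every level above 1.
    """
    overrides = overrides or {}
    if ms_level in overrides:
        return overrides[ms_level]
    if ms_level == 1:
        return ms1_precision
    return msn_precision


# MS3 falls back to the MSn precision instead of raising KeyError.
print(resolve_precision(3, ms1_precision=5e-6, msn_precision=20e-6))

# An explicit override still wins when one is supplied.
print(resolve_precision(3, 5e-6, 20e-6, overrides={3: 10e-6}))
```

With a fallback of this kind, the original call that passed only MS1_Precision and MSn_Precision would no longer depend on MS level 3 being registered separately.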