DICP-1809 / GP-Plotter

A flexible tool for the visualization of proteomics and glyco-proteomics mass spectrometry data.
Apache License 2.0

java.lang.NullPointerException #1

Open yxzwang opened 1 year ago

yxzwang commented 1 year ago

Hi, I recently tried GP-Plotter for drawing spectra, and it crashed with this error when loading data. Could you please fix this bug? Many thanks. [image]

DICP-1809 commented 1 year ago

Thanks for your feedback! I checked the source code and found that the exception was caused by missing selected ion information in the precursor section of the mzML file. I have updated the software and removed the mandatory requirement for precursor information. Please find the updated GP-Plotter on the release page, and please contact us if you have any questions.

yxzwang commented 1 year ago

Hi, thanks! I got a new error: java.lang.IllegalArgumentException: Failed to Find Scan Number in mzML Spectrum Id: index=2894. Does it mean that the mzML file must not contain spectrum ids that are absent from the identification list? [image]

Also, I tried plotting mirror images from only one spectrum with two identification results. The loading seems to succeed, but no content is displayed. The file I used is attached below. plotter.zip

DICP-1809 commented 1 year ago

I think I get what the problem is: the mzML was converted from mgf, right? [image: Snipaste_2023-05-20_21-07-52]

This is not recommended, because much of the spectrum information is lost in mgf, which caused the first and second errors above when reading the mzML (see the picture below: scan number 12517 exists only in the mgf spectrum title). The scan number is missing in the mgf, and the spectrum title is also re-assigned during the mgf-to-mzML conversion. The information is more complete when the mzML is converted from the raw file. [image: Snipaste_2023-05-20_21-07-09]

The picture below shows the information in the mzML file: [image: Snipaste_2023-05-20_21-09-56]
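
The error message above suggests that a scan number is expected inside the mzML spectrum id. mzML files converted from Thermo raw files usually carry ids of the form "controllerType=0 controllerNumber=1 scan=12517", whereas the converted-from-mgf file only has "index=2894". A minimal Python sketch of that lookup, for illustration only (GP-Plotter itself is written in Java; the function name and regular expression here are assumptions, not its actual code):

```python
import re
from typing import Optional

# Matches the "scan=<number>" token found in Thermo-style native spectrum ids.
_SCAN_RE = re.compile(r"\bscan=(\d+)")


def scan_from_spectrum_id(spectrum_id: str) -> Optional[int]:
    """Return the scan number embedded in a spectrum id, or None if absent."""
    match = _SCAN_RE.search(spectrum_id)
    return int(match.group(1)) if match else None


for spectrum_id in ("controllerType=0 controllerNumber=1 scan=12517", "index=2894"):
    print(spectrum_id, "->", scan_from_spectrum_id(spectrum_id))
```

An id that only contains "index=..." yields None here, which corresponds to the situation reported in the exception.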

For your convenience, I have updated GP-Plotter again on the release page to support reading mgf files, even though mzML is the more standard format. In the updated software, the scan number is obtained from the spectrum title, which has the form "file name.scan.scan.charge", for example "MouseLiver-Z-T-3.12517.12517.4" in plotter_mgf_test.mgf.
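
As an illustration of the title convention described above, here is a small Python sketch that splits a title of the form "file name.scan.scan.charge". It is not GP-Plotter's implementation; the function name is hypothetical, and the sketch assumes the title has exactly this four-part layout.

```python
from typing import Tuple


def parse_mgf_title(title: str) -> Tuple[str, int, int]:
    """Split 'file name.scan.scan.charge' into (file name, scan, charge)."""
    # Split from the right so that dots inside the file name are preserved.
    name, first_scan, _last_scan, charge = title.strip().rsplit(".", 3)
    return name, int(first_scan), int(charge)


name, scan, charge = parse_mgf_title("MouseLiver-Z-T-3.12517.12517.4")
print(name, scan, charge)
```

Titles that do not follow this layout (fewer than four dot-separated parts, or non-numeric scan and charge fields) raise a ValueError in this sketch.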

When reading the plotter_mgf_test file, the result was empty. The reason is simple: the spectrum file name is "MouseLiver-Z-T-3" in the pGlyco txt result, but the mgf file is named "plotter_mgf_test", so the software cannot find the spectrum. The spectrum file name is taken from the result file, because some search engines support multi-file searching. To fix this, simply rename "plotter_mgf_test.mgf" to "MouseLiver-Z-T-3.mgf". Below is the mirror image that GP-Plotter generated.
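
The name check described above can be reproduced before loading the files. A minimal Python sketch, assuming the spectrum file names have already been read from the result file into a list (how GP-Plotter parses the pGlyco txt, and whether it compares names case-sensitively, is not stated in this thread):

```python
from pathlib import Path
from typing import Iterable


def spectrum_file_matches(spectrum_path: str, result_file_names: Iterable[str]) -> bool:
    """True if the spectrum file's base name (without extension) appears in the result file."""
    return Path(spectrum_path).stem in set(result_file_names)


names_in_result = ["MouseLiver-Z-T-3"]  # value taken from the example in this thread
for path in ("plotter_mgf_test.mgf", "MouseLiver-Z-T-3.mgf"):
    print(path, "->", spectrum_file_matches(path, names_in_result))
```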

I hope the above answers can help you.

[image: MouseLiver-Z-T-3-12517_MouseLiver-Z-T-3-12517]

yxzwang commented 1 year ago

Hi, thank you! I think it's almost done, but I have one weird issue: I can't save my image. Clicking "save" opens the "save as" dialog, but no picture appears in the target folder. [image]

DICP-1809 commented 1 year ago

Does the software path or the image save path contain special characters? Please avoid Chinese characters in both paths, because Chinese and English characters are encoded differently.

yxzwang commented 1 year ago

There are no Chinese characters or other special characters in the paths.

DICP-1809 commented 1 year ago

Is Python already installed? Press the Windows key (⊞), type "cmd" and press Enter. In the cmd window, type "pip list" and press Enter. If the Python environment is OK, you will see information like the picture below: [image]
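
The same check can be done from Python itself. A minimal sketch using the standard library (Python 3.8+); the package names below are placeholders only, since the list GP-Plotter actually requires appears only in the screenshot and is not reproduced here:

```python
from importlib import metadata
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed in this interpreter."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises PackageNotFoundError if not installed
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Placeholder names for illustration; replace with the packages from the screenshot.
print(missing_packages(["numpy", "matplotlib"]))
```

Note that this inspects the interpreter that runs the script, which may differ from the one GP-Plotter calls if several Python installations exist on the machine.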

yxzwang commented 1 year ago

I have all of these packages. Some have different versions, like setuptools 63.4.1, but most are similar. I also have many more packages than these.

DICP-1809 commented 1 year ago

When you run GP-Plotter, does it stop or show an error at the "Python Package Validation" page?

yxzwang commented 1 year ago

[image] It is like this.

yxzwang commented 1 year ago

And this page closes automatically, as if it had succeeded.

DICP-1809 commented 1 year ago

If the "Python Package Validation" page closes automatically, the Python environment and the necessary packages exist on your computer. I re-tested GP-Plotter on my computer, my laptop and two lab computers, and it worked well, so this is really weird. The zip file below contains the spectrum data of "MouseLiver-12517" in json. The Python source code for image generation is in the configuration folder of GP-Plotter ("configuration\gp-plotter\1.0.0\python\py"). You can change the json path in "plot_mirror.py" and run the code. If there is any error in image generation, you will see detailed information in the cmd window. spectrum json file.zip

yxzwang commented 1 year ago

I see the problem. It raised this error:

    File "E:\tools\GP-plotter\configuration\gp-plotter\1.0.0\python\py\plot_mirror.py", line 6, in <module>
      from py import mirror_spectrum
    ImportError: cannot import name 'mirror_spectrum' from 'py'

The Python interpreter searches for modules along the search path, and the order of entries in sys.path determines the search order. I have another module named py earlier in sys.path, and it is picked up instead of GP-Plotter's py folder, which results in the error.

I renamed "py" to "py_plotter" and that fixed the bug. Maybe you could rename the py folder to something else. Otherwise it may conflict with Anaconda even when the working path is set.

Again, many thanks for your help!

DICP-1809 commented 1 year ago

Thanks for your suggestion! The py folder will be renamed soon, and an update will be released later today. Thanks a lot!

yxzwang commented 1 year ago

I found that the neutral loss (H2O, NH3) peaks for B/Y ions and peptide precursor ions are not matched.

For example, for the glycopeptide [CVVHYEJSTVPEKK 1.Car. (N(N(H(H(N(H(G))))(H(N(H(G))))))) ], HexNeuGc(-H2O) has an m/z of 452.14 but is not matched. CVVHYEJSTVPEKK (2+) (845.42 m/z) is not matched either.

All the unmatched ions are (m/z, intensity):

452.1395569   2618.0759277344
655.2207031   1388.4241943359
836.9050903   2818.5266113281
845.4172974   9855.546875
937.9440308   3029.2897949219
938.4454346   4688.0502929688
1039.984619   1399.3388671875
1672.780518   5002.8823242188
1689.816406   9887.896484375
1874.924805   1187.0699462891
1875.853394   3091.0866699219

The default settings are: [image]

Could you please fix the problem, or point me to the Python source file where I can define the ion types myself?

DICP-1809 commented 1 year ago

The software is written in Java and depends on multiple self-developed packages; Python is only used for image generation, so it is hard to share the source code with you. NH3/H2O neutral loss ions are enumerated and matched only for peptide ions, not for B/Y ions. The reasons are as follows:

  1. The common oxonium ions include some with neutral losses other than NH3/H2O, for example HexNAc_C2H4O2 (m/z=144.064). GP-Plotter uses an oxonium ion list, which you can find in the configuration folder (configuration\oxoniumIon.csv) and edit to include other oxonium ions such as HexNeuGc(-H2O).
  2. Neutral loss ions are not considered for Y ions, for two reasons: 1) the m/z of fucose-containing ions is close to the isotopic peak of the H2O/NH3 loss peak, for example [Hex(2)-NH3 +1 Da] = Hex(1)Fuc(1), and incorrect ion annotation must be avoided; 2) matching ammonia loss peaks might be inappropriate for specific glycan types, such as oligo-mannose glycans. Thus only neutral loss peaks of the N-glycan core structure are matched in the software.
  3. The ion matching issue for the precursor neutral loss and the Y0 ion [CVVHYEJSTVPEKK (2+) (845.42 m/z)] has been fixed in the updated version.

I hope the above answers can help you.

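
The near-coincidence mentioned in point 2, [Hex(2)-NH3 +1 Da] versus Hex(1)Fuc(1), can be checked with a few lines of arithmetic. A minimal Python sketch, not GP-Plotter's code: residue masses are computed from standard monoisotopic element masses, the +1 Da isotope step is assumed to be a 13C substitution, and the example m/z of 2000 with charge 1 is hypothetical.

```python
# Standard monoisotopic element masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}
C13_SHIFT = 1.0033548  # mass difference between 13C and 12C (assumed isotope step)


def mass(formula: dict) -> float:
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


HEX = mass({"C": 6, "H": 10, "O": 5})  # hexose residue
FUC = mass({"C": 6, "H": 10, "O": 4})  # fucose (deoxyhexose) residue
NH3 = mass({"N": 1, "H": 3})           # ammonia


def ppm_error(delta_mass: float, mz: float, charge: int = 1) -> float:
    """Relative m/z difference in ppm for a mass difference on an ion of given m/z and charge."""
    return (delta_mass / charge) / mz * 1e6


hex2_minus_nh3_isotope = 2 * HEX - NH3 + C13_SHIFT
hex_fuc = HEX + FUC
delta = hex_fuc - hex2_minus_nh3_isotope

print(f"Hex(2)-NH3 +1 isotope: {hex2_minus_nh3_isotope:.4f} Da")
print(f"Hex(1)Fuc(1):          {hex_fuc:.4f} Da")
print(f"difference:            {delta:.4f} Da")
print(f"ppm at m/z 2000, 1+:   {ppm_error(delta, 2000.0):.1f}")
```

The two compositions differ by roughly 0.028 Da, so whether they can be told apart depends on the m/z and charge of the Y ion and on the matching tolerance in use.
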
yxzwang commented 1 year ago

Sorry, a new bug appeared with this update. [image]

Regarding the Y ions, 1 dalton often exceeds 20 ppm. Furthermore, the Y ion carries a peptide segment and the first monosaccharide, HexNAc (which contains N, H and O). Theoretically, deamination and dehydration should occur (although not necessarily in connection with the loss peak of the fragmented monosaccharide). From experimental observations, there still appear to be many such ions.

I understand that this may increase the possibility of mismatches. If you don't want to alter the existing results, could you provide me with a separate version that includes the loss peak of the Y ion at your convenience? You can contact me privately at 20110220128@fudan.edu.cn.

Thank you.

DICP-1809 commented 1 year ago

Sorry for the late reply. The error was caused by a missing property for the precursor neutral loss peak, and it has been fixed.

  1. +1 Da is not a matter of mass tolerance. A Y ion contains an intact peptide and a fragmented glycan part, and therefore has a large mass. As a result, the monoisotopic Y peak is usually not the most abundant one in the isotopic peak cluster (see the example in the picture below) and can even have low intensity in the MS2 spectrum. The +1/+2 Da isotopic peaks are therefore also considered and matched for glycopeptide identification and ion annotation.

[image: Snipaste_2023-05-29_10-31-57]

  2. The nitrogen atom in HexNAc is at the linkage position and is not likely to be lost during fragmentation. In addition, among amino acids, only R, K, N and Q can theoretically lose ammonia, because of the -NH2 group in their side chains (this is the reference). To avoid mismatching ions, I did not consider Y neutral loss peaks in the software. [image: Snipaste_2023-05-29_10-33-29] [image: Snipaste_2023-05-29_10-44-24]

Anyway, I have also packaged a version of GP-Plotter that enumerates and matches Y ions with neutral losses. It has been sent to your personal email; please check it.