Nesvilab / PTM-Shepherd

A tool for summarizing open search results
http://ptmshepherd.nesvilab.org
Apache License 2.0

java.lang.NullPointerException and other questions about output files. #8

Closed. SimpleNumber closed this issue 3 years ago

SimpleNumber commented 4 years ago

Dear all,

I tried to use PTM-Shepherd for PTM localization and ran into some problems. I used almost default settings, except that I set output_extended = true because I need more information about localization. I took the file that you suggested here, and PTM-Shepherd crashed with the exception below:

PTM-Shepherd version 0.3.5(c) University of Michigan

Using Java 14.0.2 on 30688MB memory

Counting MS2 scans for dataset 02
        06_CPTAC_TMTS1-NCI7_P_JHUZ_20170509_LUMOS - 39517 scans
39517 MS2 scans present in dataset 02

Creating combined histogram
        Generated histogram file for dataset 02 [-504 - 505]
Created combined histogram!

Running peak picking

Picked top 500 peaks

created summary table

annotated summary table

Exception in thread "main" java.lang.NullPointerException
        at edu.umich.andykong.ptmshepherd.peakpicker.ModSummary.toFile(ModSummary.java:104)
        at edu.umich.andykong.ptmshepherd.PTMShepherd.main(PTMShepherd.java:364)

I have tried some other files and PTM-Shepherd worked with them. However, I still have several questions about its output:

  1. Can I somehow get the list of PSMs found in a specific mass shift peak? I tried to take a slice of the psm.tsv file using the PeakLower and PeakUpper boundaries for a peak, which are provided in peaksummary.annotated.tsv. However, the slice I get is usually smaller than the number of PSMs reported in peaksummary.annotated.tsv.

  2. The file global.locprofile.txt contains information about amino acid enrichment in each mass shift peak. It gives an amino acid and two numbers. Which of them is the amino acid enrichment score, and what is the other number? Could you please also comment on how the enrichment score is calculated?

  3. "Localization score" is introduced in the PTM-Shepherd article, but the Results and Discussion section mostly uses "enrichment score", which is not defined. Are these scores the same?

Thank you in advance

danielgeiszler commented 4 years ago

Can you post the global.profile.tsv from the file that’s causing an issue?

  1. The loss is small, correct? If it is, it’s because of rounding and tie-breaking operations. There isn’t much that can be done about it at the moment. If you manually widen the peaks beyond what Shepherd reports you can recapture those, but that’s the only workaround I can suggest for now.

  2. The second number is the number of weighted PSMs attributable to that amino acid. Equally scoring positions are split between residues, hence the decimals. The first score is a localization enrichment score, roughly the odds of being localized to that particular amino acid when compared to random localization. It’s outlined in the manuscript under the localization methods.

  3. I hope 2. answers this. Thank you for noting the language inconsistency. We will rectify this.
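
To make the two numbers in point 2 more concrete, here is a rough Python sketch of the idea, not the actual PTM-Shepherd code or the exact formula from the manuscript. It assumes each PSM comes with the list of positions tied for the best localization score, splits one PSM evenly across those positions, and takes "enrichment" to be the observed residue fraction divided by the residue's background frequency in the same peptides. The function names and the input layout are made up for illustration.

```python
from collections import Counter, defaultdict


def weighted_residue_counts(psms):
    """Sum PSM weight per residue; tied best positions share one PSM equally.

    psms: iterable of (peptide, best_positions), where best_positions are
    0-based indices tied for the top localization score (assumed input layout).
    """
    counts = defaultdict(float)
    for peptide, best_positions in psms:
        if not best_positions:
            continue  # unlocalized PSMs contribute nothing
        share = 1.0 / len(best_positions)
        for pos in best_positions:
            counts[peptide[pos]] += share
    return dict(counts)


def enrichment_scores(psms):
    """Observed residue fraction divided by background residue fraction.

    This ratio is an assumed stand-in for "odds compared to random
    localization"; the manuscript's definition may differ.
    """
    observed = weighted_residue_counts(psms)
    total_observed = sum(observed.values())
    background = Counter()
    for peptide, best_positions in psms:
        if best_positions:
            background.update(peptide)
    total_background = sum(background.values())
    return {
        aa: (observed[aa] / total_observed) / (background[aa] / total_background)
        for aa in observed
    }


# Toy data: the second and third PSMs each have two equally scoring positions.
toy_psms = [("AMK", [1]), ("MSK", [0, 1]), ("AAK", [0, 1])]
weights = weighted_residue_counts(toy_psms)
scores = enrichment_scores(toy_psms)
print(weights["M"], round(scores["M"], 2))
```

In the toy data, M collects 1.5 weighted PSMs (one full PSM plus half of a tie), which is where decimals like the ones in the second column come from.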


SimpleNumber commented 4 years ago

Can you post the global.profile.tsv from the file that’s causing an issue?

global.profile.tsv is an output file from PTM-Shepherd, isn't it? Since PTM-Shepherd crashed on the 06_CPTAC_TMTS1-NCI7_P_JHUZ_20170509_LUMOS file, I don't have it.

  1. Yes, it is about 10-50 PSMs.
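
For anyone who wants to try the widening workaround, this is a minimal Python sketch of the slicing I described, with an optional margin added to the PeakLower and PeakUpper boundaries. The mass shift column name "Delta Mass" and the margin value are assumptions on my side, and the sketch does not reproduce Shepherd's internal rounding or tie-breaking, so the counts may still differ.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated table into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def psms_in_peak(psm_rows, peak_lower, peak_upper, margin=0.0,
                 mass_col="Delta Mass"):
    """Return PSM rows whose mass shift lies inside the (widened) peak.

    mass_col is an assumed column name; margin widens both boundaries.
    """
    lo, hi = peak_lower - margin, peak_upper + margin
    return [row for row in psm_rows if lo <= float(row[mass_col]) <= hi]


# Toy stand-in for psm.tsv; with real data, open the file instead.
psm_text = (
    "Spectrum\tDelta Mass\n"
    "scan1\t15.9940\n"
    "scan2\t15.9969\n"
    "scan3\t15.9990\n"
    "scan4\t0.0003\n"
)
psms = read_tsv(io.StringIO(psm_text))

strict = psms_in_peak(psms, 15.9945, 15.9975)
widened = psms_in_peak(psms, 15.9945, 15.9975, margin=0.002)
print(len(strict), len(widened))  # prints "1 3"
```

With a real file, the same function can be called on `read_tsv(open("psm.tsv", newline=""))`, using the boundaries from the matching row of peaksummary.annotated.tsv.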

Thank you for the prompt response :)

danielgeiszler commented 4 years ago

Sorry, you're right. You should have a file called peaksummary.annotated.tsv or something along those lines. Shepherd produces files in a stepwise fashion where it reads from previous results. The final file is global.profile.tsv, but the quantification values should be in that intermediate file. If you have one of those, I can take a look.

Are you running them in the same directory? Are files being overwritten somehow? It's a strange spot for it to be crashing, so I might need the MS file in question if not.