dfermin / lucXor

JAVA-based implementation of LuciPHOr that can process any variable PTM.

0 PSMs for modeling #14

Closed EnriMassi closed 7 months ago

EnriMassi commented 7 months ago

Hi, when I try to use lucXor, I get a message that says I do not have enough PSMs for the modelling:

JarClassLoader: Warning: module-info.class in lib/jaxb-runtime-2.3.1.jar is hidden by lib/jaxb-api-2.3.1.jar (with different bytecode)
JarClassLoader: Warning: module-info.class in lib/txw2-2.3.1.jar is hidden by lib/jaxb-api-2.3.1.jar (with different bytecode)
JarClassLoader: Warning: module-info.class in lib/istack-commons-runtime-3.0.7.jar is hidden by lib/jaxb-api-2.3.1.jar (with different bytecode)
JarClassLoader: Warning: module-info.class in lib/stax-ex-1.8.jar is hidden by lib/jaxb-api-2.3.1.jar (with different bytecode)
JarClassLoader: Warning: module-info.class in lib/FastInfoset-1.2.15.jar is hidden by lib/jaxb-api-2.3.1.jar (with different bytecode)

luciphor2 (JAVA-based version of Luciphor)
Version: 1.2014Oct10
Original C++ version available at: http://luciphor.sf.net

Spectrum Path:           /public/conode55_pride/PRIDE_DATA/PXD003108/mzml
Spectrum Suffix:         mzml
Input file:              LuciPHOr2-PXD003108-peptides.tsv
Input type:              tsv
MS2 tolerance:           20.0 ppm
Luciphor Algorithm:      HCD
Classifying on:          Peptide Prophet Prob.
Run Mode:                Default
Num of Threads:          128
Modeling Threshold:      0.0
Scoring Threshold:       0.0
Permutation Limit:       16384.0
Max peptide length:      40
Min num PSMs for model:  50
Decoy Mass Adduct:       79.9663
Max Charge State:        4
Reduce NL:               no
Output File:             LuciPHOr2-results-PXD003108.tsv
Write matched Peaks:     no

Mods to score:
Y       79.9663
A       79.9663
T       79.9663
S       79.9663

Allowed Neutral Losses:
asty-H3PO4      -97.9769
<X>-H3PO4       -97.9769  (Decoy NL)

Reading PSM from TSV file: /public/compomics3/EnricoM/Paddy_ionbot_results/LuciPHOr2-PXD003108-peptides.tsv
Read in 21 PSMs

Reading spectra from /public/conode55_pride/PRIDE_DATA/PXD003108/mzml  (MZML format)
This can take a while so please be patient.

Running in HCD mode.

PSMs for modeling:
------------------
+2: 0 PSMs
+3: 0 PSMs
+4: 0 PSMs

You do not have enough PSMs with a score > 0.0 to accurately model the data. (Minimum number of PSMs required per charge state: 50)
Exiting now.

I tried to set the minimum score = 0, but nothing changed. I even tried to use the sample input from https://luciphor2.sourceforge.net/luciphorInfo.html

Thanks, Enrico

LuciPHOr2-PXD003108-config.txt LuciPHOr2-PXD003108-peptides.txt

dfermin commented 7 months ago

Hi, I took a look at your input file LuciPHOr2-PXD003108-peptides.txt. You specify that the scores are Peptide Prophet values, and that seems incorrect. Peptide Prophet returns probabilities in the range of 0 to 1. Your values are > 1, so the program is skipping over them since it doesn't know what to do with them.

Reset your scores to be in the range of 0-1 and try again. Also, you appear to be phosphorylating alanine???
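A quick way to spot this before running the tool is to scan the score column for values outside 0 to 1. The following is a minimal sketch, not part of lucXor: it assumes a tab-separated file with a header row, and the column name `PSMscore` and the sample rows are made up for illustration (only `srcFile` is named in this thread).

```python
import csv
import io


def out_of_range_scores(tsv_text, score_column="PSMscore"):
    """Return (line_number, value) pairs whose score lies outside [0, 1].

    The column name is an assumption; adjust it to the real header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    bad = []
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        value = float(row[score_column])
        if not 0.0 <= value <= 1.0:
            bad.append((line_no, value))
    return bad


# Hypothetical rows: one valid probability, one score that is clearly not one.
sample = (
    "srcFile\tscanNum\tcharge\tPSMscore\tpeptide\n"
    "run01.mzml\t2\t2\t0.98\tPEPTIDESK\n"
    "run01.mzml\t7\t3\t12.4\tPEPTSIDEK\n"
)
print(out_of_range_scores(sample))  # [(3, 12.4)]
```

An empty list means every score is at least plausible as a probability; anything else points at the lines to fix or at a wrong scoring-method setting.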

EnriMassi commented 7 months ago

Thanks for the quick reply.

Does the "-log(expect)" option also work for q-values?

About alanines, we wanted to use phospho-alanines for FLR control. Do you think it will conflict with luciphor's own calculations?

Enrico

dfermin commented 7 months ago

I believe that it should work with the q-values (I have not tested it, but the distribution should be the same). FLR control is built into LuciPHOr. The software adds a decoy mass to all of the non-STY amino acids. Adding phospho-alanine will mess it up.

EnriMassi commented 7 months ago

Hi, I tried q-values but it's still not working.

luciphor2 (JAVA-based version of Luciphor)
Version: 1.2014Oct10
Original C++ version available at: http://luciphor.sf.net

Spectrum Path:           /public/conode55_pride/PRIDE_DATA/PXD003108/mzml
Spectrum Suffix:         mzml
Input file:              LuciPHOr2-PXD003108-peptides.txt
Input type:              tsv
MS2 tolerance:           20.0 ppm
Luciphor Algorithm:      HCD
Classifying on:          -log(Expect Value) (X!Tandem or Comet)
Run Mode:                Default
Num of Threads:          128
Modeling Threshold:      0.01
Scoring Threshold:       0.1
Permutation Limit:       16384.0
Max peptide length:      40
Min num PSMs for model:  50
Decoy Mass Adduct:       79.9663
Max Charge State:        4
Reduce NL:               no
Output File:             LuciPHOr2-results-PXD003108.txt
Write matched Peaks:     no

Mods to score:
Y       79.9663
A       79.9663
T       79.9663
S       79.9663

Allowed Neutral Losses:
asty-H3PO4      -97.9769
<X>-H3PO4       -97.9769  (Decoy NL)

Reading PSM from TSV file: /public/compomics3/EnricoM/Paddy_ionbot_results/LuciPHOr2-PXD003108-peptides.txt
Read in 3108 PSMs

Reading spectra from /public/conode55_pride/PRIDE_DATA/PXD003108/mzml  (MZML format)
This can take a while so please be patient.

Running in HCD mode.

PSMs for modeling:
------------------
+2: 0 PSMs
+3: 0 PSMs
+4: 0 PSMs

You do not have enough PSMs with a score > 4.605170185988091 to accurately model the data. (Minimum number of PSMs required per charge state: 50)

LuciPHOr2-PXD003108-config.txt LuciPHOr2-PXD003108-peptides.txt

The data is already filtered for q-value <= 0.01, so it should read all PSMs.
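The threshold printed in the message above, 4.605170185988091, equals -ln(0.01), so the configured modeling threshold of 0.01 appears to be converted to a negative natural log before the comparison. The sketch below only reproduces that arithmetic. Whether the tool also transforms the scores read from the input file is not stated in this thread, so both cases are shown, and the q-value of 0.005 is a made-up example.

```python
import math


def neg_log(value):
    """Negative natural logarithm, matching the threshold in the log."""
    return -math.log(value)


modeling_threshold = neg_log(0.01)
print(modeling_threshold)  # roughly 4.60517

raw_q = 0.005  # hypothetical q-value below the 0.01 filter

# Case 1: raw q-values compared directly against the transformed threshold.
print(raw_q > modeling_threshold)           # False

# Case 2: q-values transformed the same way as the threshold.
print(neg_log(raw_q) > modeling_threshold)  # True
```

Note that the comparison in the message is strict (`>`), so under case 2 a PSM with a q-value of exactly 0.01 would still not count toward the model.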

dfermin commented 7 months ago

I notice your input files are mzML, but in LuciPHOr2-PXD003108-peptides.txt the srcFile fields all end in mgf.

Try editing LuciPHOr2-PXD003108-peptides.txt so that the file names match your actual spectral files. So for instance H358_SILAC_CSC_pSTY_01.mgf becomes H358_SILAC_CSC_pSTY_01.mzml.

The file names have to match exactly because that's what the tool is going to try to open in the directory path you specified.
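For a file with thousands of rows, the rename described above can be scripted. This is a minimal sketch that assumes a tab-separated file with a header row containing a `srcFile` column; the `scanNum` column in the sample is a placeholder, and the function name is made up for illustration.

```python
import csv
import io
from pathlib import PurePosixPath


def retarget_src_files(tsv_text, new_suffix=".mzml", column="srcFile"):
    """Return the TSV text with the extension of every srcFile entry replaced."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # with_suffix swaps only the final extension, e.g. .mgf -> .mzml
        row[column] = str(PurePosixPath(row[column]).with_suffix(new_suffix))
        writer.writerow(row)
    return out.getvalue()


sample = "srcFile\tscanNum\nH358_SILAC_CSC_pSTY_01.mgf\t2\n"
print(retarget_src_files(sample))
```

Because the names have to match exactly, `new_suffix` should use the same capitalization as the files on disk (for example `.mzML` rather than `.mzml` if that is how they are named).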

EnriMassi commented 7 months ago

Hi, yeah, I wanted to use mgfs at the beginning, but I got an error and switched to mzML...

May I ask what the issue with the .mgf files could be? The spectra look like this:

BEGIN IONS
TITLE=controllerType=0 controllerNumber=1 scan=2
SCANS=2
RTINSECONDS=1.5527
PEPMASS=582.317993164063
CHARGE=2+
105.7636566 143.6292572021
128.6565552 126.7798385620
165.4815521 114.5321731567
165.8494110 117.2521438599
185.1653748 704.8305664063
186.1702576 135.4578247070
191.9755859 131.4096679688
194.6279297 111.7912216187
197.3030243 120.2407455444
213.1607513 133.8255767822
218.1471100 152.7518310547
257.4613953 128.7206420898
266.3388672 107.0072479248
268.2253723 128.4214782715
299.8312073 131.9399414063
355.0673523 148.0882873535
357.1749878 219.4304504395
428.9241333 120.0787811279
429.0904541 593.6565551758
453.7145081 148.2090148926
494.5896301 183.2062683105
512.0195923 163.4001922607
595.3117065 269.1739807129
621.5895386 126.4617691040
650.7680664 126.2585296631
752.7312012 119.8431777954
803.3770752 140.3620147705
930.3519897 161.3318634033
932.4123535 133.6003875732
951.4667358 207.4639434814
952.4846191 237.8884429932
1045.5076904 133.8008575439
END IONS

And I get this error:

Reading spectra from PRIDE_DATA/PXD003108/MGF  (MGF format)
This can take a while so please be patient.
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at com.simontuffs.onejar.Boot.run(Boot.java:340)
        at com.simontuffs.onejar.Boot.main(Boot.java:166)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 48
        at java.base/java.lang.String.checkBoundsBeginEnd(String.java:4606)
        at java.base/java.lang.String.substring(String.java:2709)
        at lucxor.globals.read_mgf(globals.java:793)
        at lucxor.globals.read_in_spectra(globals.java:638)
        at lucxor.LucXor.main(LucXor.java:70)
        ... 6 more

Is it because it can't parse the spectrum title?

dfermin commented 7 months ago

I believe that it has to do with changes to the libraries I used to read in spectra. Back when this was first written, the MGF files worked. Along the way that functionality broke, but since most users feed in mzML or mzXML, we didn't worry about it.

Sorry for the inconvenience.
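For readers hitting the same exception: the `length 48` in the trace equals the length of the TITLE line in the spectrum shown earlier, and `end -1` is what Java's `String.indexOf` returns when the searched text is absent. That is consistent with the title being cut at a delimiter it does not contain, although which delimiter lucXor looks for is not shown in this thread. The sketch below is not lucXor's reader; it is a hypothetical standalone parser that takes the scan number from `SCANS=` and falls back to a `scan=` token in the title, which can help when converting or checking MGF files outside the tool.

```python
import re


def read_mgf_scans(mgf_text):
    """Return a list of (scan_number, peaks) tuples, one per spectrum."""
    spectra = []
    scan, peaks, title = None, [], ""
    for raw in mgf_text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            scan, peaks, title = None, [], ""
        elif line == "END IONS":
            if scan is None:
                # Fall back to a 'scan=N' token inside the title, if any.
                match = re.search(r"scan=(\d+)", title)
                scan = int(match.group(1)) if match else None
            spectra.append((scan, peaks))
        elif line.startswith("TITLE="):
            title = line[len("TITLE="):]
        elif line.startswith("SCANS="):
            match = re.match(r"\d+", line[len("SCANS="):])
            scan = int(match.group()) if match else None
        elif line and line[0].isdigit():
            # Peak line: m/z followed by intensity.
            mz, intensity = line.split()[:2]
            peaks.append((float(mz), float(intensity)))
    return spectra


title_line = "TITLE=controllerType=0 controllerNumber=1 scan=2"
print(len(title_line))  # 48, the length reported in the exception

# Header and first three peaks of the spectrum posted in this thread.
sample = """BEGIN IONS
TITLE=controllerType=0 controllerNumber=1 scan=2
SCANS=2
RTINSECONDS=1.5527
PEPMASS=582.317993164063
CHARGE=2+
105.7636566 143.6292572021
128.6565552 126.7798385620
165.4815521 114.5321731567
END IONS
"""
spectra = read_mgf_scans(sample)
print(spectra[0][0], len(spectra[0][1]))  # 2 3
```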