Closed Guitou1 closed 10 months ago
The output file should contain all PSMs - there is no filtering applied for FDR or q-values.
For PSMs, q-values are calculated directly by counting decoys, (1 + decoys)/targets, after sorting by discriminant score. Peptide and protein q-values are calculated by summing PEPs derived from a non-parametric estimate of the discriminant score distribution (a kernel density estimate); this is done to enable the picked-FDR approaches. Filtering therefore does not affect discriminant scores, since they are calculated before FDR estimation.
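To make the decoy-counting step concrete, here is a minimal Python sketch. It is not Sage's actual Rust code, and the function name `decoy_counting_q_values` is made up for illustration. It assumes a higher discriminant score is better and that each PSM carries a target/decoy flag. The running minimum that turns FDR estimates into monotone q-values, and the cap at 1.0, are the usual conventions and are assumptions here.

```python
def decoy_counting_q_values(scores, is_decoy):
    """Return one q-value per PSM, in the same order as the input."""
    # Rank PSMs from best to worst discriminant score (assumed: higher is better).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    fdr = [0.0] * len(scores)
    targets = decoys = 0
    for rank, i in enumerate(order):
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        # FDR estimate at this score threshold: (1 + decoys) / targets.
        fdr[rank] = (1 + decoys) / max(targets, 1)

    # q-value = smallest FDR at this threshold or any more permissive one.
    # Walk from worst to best score with a running minimum (capped at 1.0).
    q_values = [0.0] * len(scores)
    running_min = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr[rank])
        q_values[order[rank]] = running_min
    return q_values


q = decoy_counting_q_values(
    scores=[10.0, 9.0, 8.0, 7.0, 6.0],
    is_decoy=[False, False, True, False, True],
)
print(q)
```

Because nothing is filtered out before this step, low-scoring PSMs simply get q-values approaching 1.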
For longest y-ion series, the reported value is the longest chain of consecutive y-ions annotated in a spectrum (not the number of y-ions). For instance, if you had y1 y2 _ y4 y5 y6 y7, then longest_y = 4. Are you looking for an output to be added for the total number of annotated y-ions?
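The difference between the two quantities can be shown with a short Python sketch. This is an illustration only, not Sage's implementation: `y_ion_summary` is a hypothetical helper, and it assumes the annotated y-ions are available as a list of ion indices (y1 is 1, y2 is 2, and so on), ignoring charge states.

```python
def y_ion_summary(matched_y):
    """Return (longest consecutive run, total count) for annotated y-ion indices."""
    indices = sorted(set(matched_y))
    longest = current = 0
    previous = None
    for idx in indices:
        # Extend the run if this ion directly follows the previous one,
        # otherwise start a new run of length 1.
        current = current + 1 if previous is not None and idx == previous + 1 else 1
        longest = max(longest, current)
        previous = idx
    return longest, len(indices)


# y1 y2 _ y4 y5 y6 y7: y3 is missing, so the longest run is y4..y7.
longest_y, total_y = y_ion_summary([1, 2, 4, 5, 6, 7])
print(longest_y, total_y)  # 4 6
```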
I have run MS2Rescore successfully - let me see if I can figure out what I did and report back.
Thank you for the fast response !
I am indeed looking for an output reflecting the total number of annotated y-ions which I think can be pretty insightful when plotting spectra (or when comparing single-cell data with bulk data)
I would be so grateful if you could share your MS2Rescore methodology (I have also put an issue up on MS2Rescore's GitHub page and I will report back here if I obtain an answer)
Once again, many thanks for the quick response !
I am currently reviewing a PR for building spectral libraries/writing all annotated peaks to a file, so hopefully that will help with the first point 😄.
I think you need to install the beta version of ms2rescore from source - I believe the version on pip doesn't support Sage results yet
Great ! Good luck with that and thanks, I will try to do just that :)
For anyone wondering:
- git clone *GitHub code clone link*
- cd ms2rescore
- pip install .
and apply: ms2rescore -p results.sage.tsv -s data/mzML/ --psm-file-type 'sage'
Hi,
I'm currently working with single-cell data and I wanted to look at the FDR distribution of the PSMs obtained, but Sage's output mainly includes q-values of around 1%. Is there any way Sage could report PSMs at higher q-values? This shouldn't have an impact on the discriminant score, should it?
As a side question, does the longest y-ion series reported by Sage correspond to the consecutive y-ions annotated in a spectrum or simply the total number of annotated y-ions in the spectrum ? Is there a way Sage could report these annotations, and how are they calculated (couldn't find it in the code on GitHub...) ?
I'm afraid I lack skills in Rust to confidently apply changes.
On a side note, I have tried running MS2Rescore (a rescoring tool claiming to support sage.tsv outputs) from the command line without success. Has anyone run it successfully? If so, could you share the method you used?
Thank you in advance !