pFindStudio / pFind3


questions on pFind score distributions #40

Open azelter opened 4 years ago

azelter commented 4 years ago

Dear pFind team,

I have downloaded and run pFind and I have some questions. There are 3 main scores in your pFind.spectra file: 1) Raw score; 2) Final Score; 3) Q value. Which score should I use, and what threshold should I apply for 1% FDR? Your q values go well above 1, so the standard threshold of q <= 0.01 does not seem to make sense here.
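To be clear about what I mean by the standard q value, here is a minimal Python sketch of the usual target-decoy estimate. It is only my assumption of the conventional calculation, not pFind's actual method; the function name `target_decoy_q_values` and the toy scores are hypothetical and are not taken from a pFind.spectra file.

```python
def target_decoy_q_values(psms):
    """psms: list of (score, is_decoy) pairs, where a higher score is better.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    This is the conventional estimate, NOT necessarily what pFind reports.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each rank: decoys / targets among PSMs scoring at least this well.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q value: the smallest FDR at which this PSM would still be accepted,
    # i.e. a running minimum taken from the worst score upwards.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, q_values)]


# Hypothetical toy data: three targets and three decoys.
toy = [(0.9, False), (0.8, False), (0.7, True),
       (0.6, False), (0.5, True), (0.4, True)]
result = target_decoy_q_values(toy)

# Target PSMs passing a 1% FDR cut.
accepted = [r for r in result if not r[1] and r[2] <= 0.01]
```

With equal numbers of targets and decoys, the q values from this estimate top out at 1 (total decoys divided by total targets), which is why values such as 3.2, 512, and 1024 are confusing to me.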

Supplemental Fig. 2b of your publication shows a plot of target/decoy score distributions. The horizontal axis shows the score from 0 to 1, and the total number of target and decoy PSMs across all scores is similar. This looks like what I would expect.

[image: sup2b]

Page 18 of your pFind3 user guide PDF has a similar plot, but there are virtually no decoys and the horizontal axis goes to ~9.

[image: userguide-plot]

The score distribution plots that pBuild produces for my own data look like the plot in your pFind3 user guide: -log(Score) goes up to almost 5, and decoy PSMs are almost totally absent.

[image: myData-pBuild-plot]

If I plot these same data myself, I first have to remove the lines with Q values of 512 and 1024. I don't understand what these lines are. I then get a plot with a similar number of target and decoy PSMs, but the q values go up to 3.2, which I also don't understand. My own plot looks closer to the one in your supplemental Fig. 2b, but the decoys are spread more evenly throughout the range of q values, which seems strange.

[image: myData-my-plot]

In my data, q values range from 0 to 3.2643, plus some values at 512 and 1024. Raw scores range from close to 0 to 5.226377. Final scores range from close to 0 to 1.

Could you please explain this scoring?

Thank you for your help,
Alex