Matching between runs with a library set does not seem to add much? (not an issue)

jadv12 commented 3 years ago

Hi there,

I just want to say first that this is not an issue or bug. Just hoping to get some thoughts/opinions.

I have been following the MaxQuant group's method of incorporating a subset of depleted and fractionated samples as a means to counteract the high missingness in plasma proteomics using the MBR setting. My proposed experiment would include a total of 226 samples:

200 experimental samples
16 depleted fractions (library runs)
10 technical replicates

I was hoping to replicate this boost in data completeness with FragPipe (which has been amazing at identifying more total proteins, and much faster).

However, my runs have been yielding far fewer proteins and peptides per individual sample than MaxQuant (MQ). I've been searching 3 replicates with and without the library runs (8 fractions), all with min. ions set to 1 (due to the low number of samples) and an FDR of 1%. I was also curious to see what parameter optimization does, as well as the "lib" functionality you get when you put lib in the experiment name (I saw this in https://github.com/Nesvilab/FragPipe/issues/343#issuecomment-813696409).

[Image: summary of quantified proteins and peptides (by MaxLFQ) for the different parameters]

I found it interesting that adding the library runs reduced the number of quantified proteins and peptides. Even more so when using the "lib" experiment names. Maybe I'm misinterpreting the data?
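One way to make the comparison concrete is to count, per run, how many proteins have a non-zero MaxLFQ intensity. The following is only a minimal sketch: it assumes the protein table has been read into a list of dicts (for example with `csv.DictReader`), that the per-run columns end in ` MaxLFQ Intensity`, and that "quantified" means an intensity above zero. The function name `count_quantified` and the example rows are hypothetical and not taken from the actual output.

```python
def count_quantified(rows, suffix=" MaxLFQ Intensity"):
    """Count entries with a non-zero MaxLFQ intensity for each run.

    rows:   list of dicts mapping column name -> value (strings or numbers).
    suffix: assumed column-name suffix marking per-run MaxLFQ columns.
    """
    counts = {}
    for row in rows:
        for column, value in row.items():
            if not column.endswith(suffix):
                continue
            run = column[: -len(suffix)]
            counts.setdefault(run, 0)
            # Treat empty or zero values as "not quantified" (an assumption).
            if float(value or 0) > 0:
                counts[run] += 1
    return counts


# Hypothetical example rows, for illustration only.
example_rows = [
    {"Protein": "P1", "rep1 MaxLFQ Intensity": "1200.5", "rep2 MaxLFQ Intensity": "0"},
    {"Protein": "P2", "rep1 MaxLFQ Intensity": "350.0", "rep2 MaxLFQ Intensity": "410.2"},
    {"Protein": "P3", "rep1 MaxLFQ Intensity": "", "rep2 MaxLFQ Intensity": "98.7"},
]
print(count_quantified(example_rows))
```

Running the same count on the results with and without the library runs, restricted to the three replicate columns, would give a like-for-like comparison.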

jadv12 commented 3 years ago

PNGs when searching the replicates only:

[Image: 20210901_timsTOF_Evo_22min_CDS_VanJ_S1_S3-B9_1_509_model]

PNG when searching the replicates with the library runs:

[Image: 20210901_timsTOF_Evo_22min_CDS_VanJ_S1_S3-B9_1_509_model]

Maybe I need to change the parameters so that more target ions score > 0, if that is possible?

fcyu commented 3 years ago

In https://github.com/Nesvilab/FragPipe/issues/504, we just convinced you that FragPipe quantified more proteins with 100% MBR-FDR (which is equivalent to what MaxQuant has). You also agreed that with 100% MBR-FDR there are many false positives, and that you should use 1-5% MBR-FDR.

As to your question "I found it interesting that adding the library runs reduced the number of quantified proteins and peptides. Even more so when using the "lib" experiment names. Maybe I'm misinterpreting the data?"

There is no guarantee that more runs will result in more quantified proteins and peptides if you have FDR control. It is the same logic as with a bigger database, which won't always result in more identified peptides.

Best,

Fengchao
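The point about FDR control can be illustrated with a toy target-decoy calculation. This is a generic sketch, not FragPipe's or IonQuant's actual MBR-FDR procedure: the function `count_accepted_targets`, the scores, and the candidate lists are all made up for illustration. It ranks candidates by score and keeps the largest set of targets whose decoy/target ratio stays within the FDR limit.

```python
def count_accepted_targets(candidates, max_fdr=0.01):
    """Return how many targets pass a simple target-decoy FDR cutoff.

    candidates: list of (score, is_decoy) tuples; higher score is better.
    The estimated FDR at each rank is decoys / targets seen so far.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    targets = decoys = accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranking that still meets the FDR limit.
        if targets and decoys / targets <= max_fdr:
            accepted = targets
    return accepted


# Hypothetical base set: 300 targets (scores 701..1000) and 2 low-scoring decoys.
base = [(float(s), False) for s in range(701, 1001)] + [(750.5, True), (720.5, True)]

# Hypothetical extra candidates contributed by additional runs:
# 20 low-scoring targets plus 2 decoys that outscore some of the base targets.
extra = [(float(s), False) for s in range(671, 691)] + [(740.5, True), (730.5, True)]

print(count_accepted_targets(base))          # 300
print(count_accepted_targets(base + extra))  # 270
```

In this toy case the extra candidates add two decoys in the middle of the ranking, so the 1% cutoff has to move up and fewer targets pass overall, even though the pool of candidates grew.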
