compomics / moFF

A modest Feature Finder (moFF) to extract MS1 intensities from Thermo raw files
Apache License 2.0

sumIntensity or maxIntensity on the peptide summary file #44

Closed yafeng closed 5 years ago

yafeng commented 5 years ago

Hi, I have a question about the peptide summary file. The peptide TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK was identified at different charge states, and in some cases with an isotope error. In the peptide summary output, a summed intensity over all PSMs is calculated. In my case this is questionable: the first two rt_peak values are identical, so they should not be summed. I would prefer to take the maximum MS1 intensity among all PSMs to represent this peptide, regardless of charge state and isotope peak. Does this make sense? Finally, is there an option to use the maximum instead of the sum when generating the peptide summary file?

mz IsotopeError charge rt mass mod_peptide intensity rt_peak lwhm rwhm 5p_noise 10p_noise SNR log_L_R log_int
1014.026 0 4 5910.892 4052.104 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 1085156.5 5889.0442 5884.6632 5893.2268 13.5964 20.62445 98.04136887 -0.002133635 20.04947169
1014.024 0 4 5876.8381 4052.096 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 1085156.5 5889.0442 5884.6632 5893.2268 17.5945275 21.537055 95.80229536 -0.002133426 20.04947169
1014.281 1 4 5952.1358 4053.124 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 2354168.75 5902.0677 5893.2268 5902.7362 32.753998 55.194596 97.13146545 -0.002365872 21.16678631
1351.698 0 3 5889.8459 4052.094 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 85351.9375 5897.7405 5893.2268 5918.7709 5.3747655 6.116244 84.01707727 -0.006345292 16.38113628
1014.024 0 4 5955.5373 4052.096 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 1068064.25 5897.7405 5893.2268 5912.5174 15.9713205 21.95456 96.50513108 -0.004795371 20.02656701
1352.027 1 3 5919.6964 4053.081 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 191413.7031 5902.0677 5893.2268 5905.5918 3.635983 4.88871 94.42702363 -0.003075281 17.54633459
1014.276 1 4 6013.3796 4053.104 TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK 40233.30078 6016.0233 5993.6061 6029.4777 23.064722 44.635451 64.83274879 -0.008754719 15.29610248

Maux82 commented 5 years ago

Hi,

When I create the peptide summary, I first sort all the PSMs by RT and then drop all duplicates of the same peptide. This way I am sure to consider only the peak intensity of the earliest-eluting MS2 event for each peptide. I don't know (I haven't tried it) whether taking the max instead of the sum gives any advantage. There is currently no option to select the maximum instead of the sum of the intensities. If you want, I can point out where in the code you can change this to get the maximum. That way you could also write your own custom aggregation scheme for the summary file.
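A minimal sketch of the sort, drop-duplicates, aggregate procedure described above, written in plain Python rather than taken from moFF's code. The function name `summarize_peptide`, the `dedup_key` and `how` parameters, and the choice to treat PSMs as duplicates only when they share peptide, rt_peak and intensity are all assumptions made for illustration; moFF's actual duplicate criterion may differ.

```python
# Illustrative sketch only: NOT moFF's actual implementation.
PEPTIDE = "TVAAPSVFIFPPSDEQLKSGTASVVC+57.021LLNNFYPREAK"

# (rt, rt_peak, intensity) for the seven PSMs listed in this issue.
ROWS = [
    (5910.892, 5889.0442, 1085156.5),
    (5876.8381, 5889.0442, 1085156.5),
    (5952.1358, 5902.0677, 2354168.75),
    (5889.8459, 5897.7405, 85351.9375),
    (5955.5373, 5897.7405, 1068064.25),
    (5919.6964, 5902.0677, 191413.7031),
    (6013.3796, 6016.0233, 40233.30078),
]
PSMS = [
    {"mod_peptide": PEPTIDE, "rt": rt, "rt_peak": peak, "intensity": inten}
    for rt, peak, inten in ROWS
]


def summarize_peptide(psms,
                      dedup_key=("mod_peptide", "rt_peak", "intensity"),
                      how="sum"):
    """Sort PSMs by RT, keep the earliest PSM per dedup_key,
    then aggregate the kept intensities per peptide.

    dedup_key is an assumed duplicate criterion (hypothetical parameter).
    how is "sum" or "max".
    """
    if how not in ("sum", "max"):
        raise ValueError("how must be 'sum' or 'max'")
    aggregate = sum if how == "sum" else max

    seen = set()
    kept = {}
    for psm in sorted(psms, key=lambda p: p["rt"]):
        key = tuple(psm[col] for col in dedup_key)
        if key in seen:
            continue  # a PSM with the same key and an earlier RT was kept
        seen.add(key)
        kept.setdefault(psm["mod_peptide"], []).append(psm["intensity"])
    return {pep: aggregate(vals) for pep, vals in kept.items()}


if __name__ == "__main__":
    print(summarize_peptide(PSMS, how="sum"))
    print(summarize_peptide(PSMS, how="max"))
```

Passing `dedup_key=("mod_peptide", "rt_peak")` instead keeps only the earliest PSM for each MS1 peak, which gives a different total; which criterion is appropriate is exactly the question raised in this thread.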

yafeng commented 5 years ago

For the two PSMs with rt_peak = 5889.0442, it makes sense to drop one of them. For rt_peak = 5897.7405 and rt_peak = 5902.0677, there are two PSMs each, with different MS1 intensities. Do you also sum them? Does the sumIntensity in the peptide summary represent the apex intensity, or does it have a different meaning?

I looked in the peptide summary file: this peptide has a sumIntensity of 4841480.69. It appears that only the PSM with rt_peak = 5897.7405 and MS1 intensity = 1068064.25 was dropped to get the sum.
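The arithmetic behind this observation can be checked with a few lines of Python. This is only a check on the numbers quoted in this thread, not a statement about how moFF computes the value; the variable names are made up for the example.

```python
# Intensities of the seven PSMs, in the order they are listed in the table.
intensities = [
    1085156.5,
    1085156.5,
    2354168.75,
    85351.9375,
    1068064.25,
    191413.7031,
    40233.30078,
]
reported_sum = 4841480.69  # sumIntensity from the peptide summary file

total = sum(intensities)

# Which single PSM, if left out, reproduces the reported sum (to 2 decimals)?
dropped = [
    i for i, value in enumerate(intensities)
    if abs((total - value) - reported_sum) < 0.01
]

if __name__ == "__main__":
    print("sum over all PSMs:", round(total, 2))
    for i in dropped:
        print("leaving out row", i + 1, "with intensity", intensities[i],
              "gives", round(total - intensities[i], 2))
```

Only the fifth row (intensity 1068064.25) satisfies the check, which means both PSMs with rt_peak = 5889.0442 are still included in the reported sum.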