smith-chem-wisc / MetaMorpheus

Proteomics search software with integrated calibration, PTM discovery, bottom-up, top-down and LFQ capabilities
MIT License

Higher intensity experimental peaks aren't prioritized #1197

Open zrolfs opened 6 years ago

zrolfs commented 6 years ago

For two experimental peaks that are close in mass, the lower m/z peak is matched to the theoretical peak, when we should instead be prioritizing the higher-intensity peak (or perhaps the one closest in mass?). This is a significant problem in low-resolution MS2 data.

Example:
Product tolerance = 0.01 Da
Theoretical peak: 500.010 Da
Experimental peaks:
500.005 Da, intensity: 1e4
500.010 Da, intensity: 1e7

The 1e4 peak is currently being used for score calculation.
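To make the example concrete, here is a minimal Python sketch of the behavior described above. It is not MetaMorpheus code (which is written in C#); the function name `match_first_in_tolerance` and the (m/z, intensity) tuple layout are assumptions made purely for illustration.

```python
def match_first_in_tolerance(theoretical_mz, peaks, tolerance):
    """Return the first (m/z, intensity) peak within tolerance,
    scanning in ascending m/z order (the behavior reported in this issue)."""
    for mz, intensity in sorted(peaks):
        if abs(mz - theoretical_mz) <= tolerance:
            return (mz, intensity)
    return None  # no experimental peak within tolerance


# Values from the example above
peaks = [(500.005, 1e4), (500.010, 1e7)]
matched = match_first_in_tolerance(500.010, peaks, tolerance=0.01)
print(matched)
```

Because both peaks fall inside the 0.01 Da window and the scan stops at the first hit, the 500.005 Da (1e4) peak is returned even though the 500.010 Da (1e7) peak is both more intense and an exact mass match.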

trishorts commented 6 years ago

I wonder if we run into issues with coisolation. Also, any chance the order of the peaks gets scrambled?

zrolfs commented 6 years ago

I'm running into this again for spectral comparisons. It'd be nice to have an elegant solution that doesn't triple the comparison time...

zrolfs commented 6 years ago

Does anybody have an opinion on whether we should prioritize the highest intensity or the closest mass? In the previous example, if there were an additional experimental peak (500.020 Da, intensity: 2e7), which should we count? I am leaning toward the closest mass, since it should be less computationally demanding and I already have a solution for it...
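The two options can be compared side by side on the three-peak scenario. This is a hedged sketch, not the solution mentioned above: the helper names and the `EPS` slack are assumptions. The slack is there only because the 500.020 Da peak sits exactly on the tolerance edge, where floating-point rounding could otherwise include or exclude it unpredictably.

```python
EPS = 1e-9  # assumed slack for peaks lying exactly on the tolerance boundary


def in_tolerance(theoretical_mz, peaks, tolerance):
    """All (m/z, intensity) peaks within tolerance of the theoretical m/z."""
    return [p for p in peaks if abs(p[0] - theoretical_mz) <= tolerance + EPS]


def pick_closest_mass(theoretical_mz, peaks, tolerance):
    """Candidate with the smallest mass error, or None if there is no candidate."""
    candidates = in_tolerance(theoretical_mz, peaks, tolerance)
    if not candidates:
        return None
    return min(candidates, key=lambda p: abs(p[0] - theoretical_mz))


def pick_highest_intensity(theoretical_mz, peaks, tolerance):
    """Candidate with the largest intensity, or None if there is no candidate."""
    candidates = in_tolerance(theoretical_mz, peaks, tolerance)
    if not candidates:
        return None
    return max(candidates, key=lambda p: p[1])


peaks = [(500.005, 1e4), (500.010, 1e7), (500.020, 2e7)]
closest = pick_closest_mass(500.010, peaks, tolerance=0.01)
brightest = pick_highest_intensity(500.010, peaks, tolerance=0.01)
print(closest, brightest)
```

Here the two rules disagree: closest mass selects the 500.010 Da peak (1e7), while highest intensity selects the 500.020 Da peak (2e7).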

rmillikin commented 6 years ago

I think highest intensity would be best, since closest mass is more liable to pick noise. I don't feel super strongly about it though. Needs evidence either way... RT vs predicted RT, for example

rmillikin commented 6 years ago

closest mass is used now, btw, not lowest m/z that meets the mass tolerance
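For reference, closest-mass matching can be done cheaply when the experimental m/z values are already sorted. The sketch below uses a binary search from Python's standard `bisect` module. It is an illustration only and makes no claim about how MetaMorpheus implements this; `closest_peak_index` is a hypothetical name.

```python
import bisect


def closest_peak_index(sorted_mzs, theoretical_mz, tolerance):
    """Index of the experimental m/z closest to theoretical_mz,
    or None if the closest one lies outside the tolerance.
    Assumes sorted_mzs is in ascending order."""
    i = bisect.bisect_left(sorted_mzs, theoretical_mz)
    best = None
    # Only the neighbors on either side of the insertion point can be closest.
    for j in (i - 1, i):
        if 0 <= j < len(sorted_mzs):
            error = abs(sorted_mzs[j] - theoretical_mz)
            if best is None or error < abs(sorted_mzs[best] - theoretical_mz):
                best = j
    if best is None or abs(sorted_mzs[best] - theoretical_mz) > tolerance:
        return None
    return best


mzs = [500.005, 500.010, 500.020]
index = closest_peak_index(mzs, 500.010, tolerance=0.01)
print(index)
```

Each lookup costs O(log n) and inspects at most two candidate peaks, so it does not require scanning every peak inside the tolerance window.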