smith-chem-wisc / MetaMorpheus

Proteomics search software with integrated calibration, PTM discovery, bottom-up, top-down and LFQ capabilities
MIT License

MetaMorpheus/FastLFQ algorithms #1597

Closed KanshinED1 closed 5 years ago

KanshinED1 commented 5 years ago

Hello,

I have just read the original FlashLFQ publication, and it is great, but I have several questions about the algorithm:

  1. In the publication you use only peptide intensities (which makes sense), but in MetaMorpheus there is also an option to get LFQ intensities for proteins (which would be the most important for protein-level analyses). Could you explain how you get these intensities? Are they a simple sum over all identified peptides, and do you consider only unique peptides for these calculations?

  2. There is an option to match between runs, and you can set the mass tolerance for it (I assume this happens on recalibrated data). What is the default retention time tolerance for this, and is there an option to change it?

  3. In the search parameters, I saw a "lowCID" fragmentation option. If I have low-resolution MS2 data, do you calibrate it in the same way as Orbitrap MS2 (both MS1 and MS2 scans)?

  4. This one is really stupid: how do I access the "Conserve memory (might be slow)" option? I was not able to find it in the app (v. 0.0.297).

  5. Experimental design and LFQ: how do you normalize intensities across files (shifting to the same mean/median intensities, for proteins or for peptides)?

I know I am asking a lot of questions, but that is because MetaMorpheus seems to be a great tool and I'm really looking forward to using it more in our lab.

trishorts commented 5 years ago

Thanks for your interest. I'll try to give you an answer to your questions. Hopefully, one of the other students will provide additional insight when they check their work computers.

  1. Currently, MM sums the intensities of the peptides associated with each protein following the parsimony step. We are working on implementing a Diffacto-type strategy, but that is some weeks out yet.
  2. There is a tolerance. I recommend leaving the defaults determined through calibration unless you have some specific reason for changing them. I believe that the time tolerance is ~2 minutes, but the time is adjusted for additional variations between runs. So, if one run is "slower" than another, MM can deal with that. There will be an update to FlashLFQ match between runs shortly, and we'll announce it. Keep an eye out for a new release of MM, which will contain it.
  3. MM calibrates the MS1s separately from the MS2s and they should both get calibrated regardless of the resolution.
  4. The conserve memory feature has been deprecated. I don't know why, but I'll check.
  5. The normalization is based on the assumption that most proteins' abundances don't change between conditions and that only a small subset do. Protein intensity comes from summed peptide intensity across fractions, bioreps, and conditions. We minimize protein intensity variation by finding normalization factors for each run.
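
To make answers 1 and 5 concrete, here is a minimal Python sketch of the general idea. It is not the MetaMorpheus/FlashLFQ code: the function names, the `(run, protein, intensity)` input layout, and the choice of a median-ratio factor against a reference run are all assumptions made for illustration. The real normalization searches for factors that minimize protein intensity variation, which this sketch replaces with a simpler median-ratio estimate resting on the same "most proteins don't change" assumption.

```python
from collections import defaultdict
from statistics import median


def protein_intensities(peptide_rows):
    """Sum peptide intensities per (run, protein).

    peptide_rows: iterable of (run, protein, intensity) tuples
    (a hypothetical layout, not the MetaMorpheus output format).
    """
    totals = defaultdict(float)
    for run, protein, intensity in peptide_rows:
        totals[(run, protein)] += intensity
    return dict(totals)


def normalization_factors(totals, reference_run):
    """One multiplicative factor per run.

    The factor is the median, over proteins seen in both runs, of
    reference intensity / run intensity. The median ignores the small
    subset of proteins that genuinely change.
    """
    runs = {run for run, _ in totals}
    factors = {}
    for run in runs:
        ratios = [
            totals[(reference_run, protein)] / value
            for (r, protein), value in totals.items()
            if r == run
            and value > 0
            and totals.get((reference_run, protein), 0) > 0
        ]
        factors[run] = median(ratios) if ratios else 1.0
    return factors


# Toy data: run2 was loaded at twice the amount, and P3 truly doubles.
rows = [
    ("run1", "P1", 100.0), ("run1", "P1", 50.0),
    ("run1", "P2", 200.0), ("run1", "P3", 400.0),
    ("run2", "P1", 300.0), ("run2", "P2", 400.0),
    ("run2", "P3", 1600.0),
]
totals = protein_intensities(rows)
factors = normalization_factors(totals, "run1")
normalized = {key: value * factors[key[0]] for key, value in totals.items()}
```

In this toy data, the loading difference between the runs is removed for P1 and P2, while the genuine change in P3 survives normalization.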

We're happy to answer any and all questions you might have. You can also email us for support if you like: mm_support@chem.wisc.edu

KanshinED1 commented 5 years ago

Thank you, looking forward to the updated version with Diffacto)