compomics / moFF

A modest Feature Finder (moFF) to extract MS1 intensities from Thermo raw file
Apache License 2.0

Question about PTM file #47

Open jblakele opened 5 years ago

jblakele commented 5 years ago

Hi,

How important is the ptm_setting_mq.json file for the match filter after match-between-runs? I ask because the file is required, but the search algorithm I use denotes modifications differently from MaxQuant or PeptideShaker. Do I need to create a new ptm_setting file, or does it not matter? The match-between-runs result I just acquired contains matched modified peptides that passed the threshold, and I used the ptm_setting_mq.json file, which is not directly applicable to the search algorithm I'm using. Thank you for the clarification.

Best Regards, Alfredo

Maux82 commented 5 years ago

Hi, the PTM setting is really important for filtering the matched peptides. Unfortunately, there is no standard way to parse modifications, because every search engine annotates them differently. I am aware that the current implementation in moFF is not the most flexible for the user. Which search engine are you using? Are both fixed and variable modifications included in the mod_peptide sequence or not? If you give me this information, I can help you set up the PTM file.

jblakele commented 5 years ago

Hi, thank you for getting back to me. I am using Tide + Percolator within the Crux framework. I am only using carbamidomethylation of C and oxidation of M. Since the modification of C is fixed, the output does not denote it; it is just assumed. Oxidation is denoted as [15.99]. I created a new PTM_settings file like this:

{ "C": {"deltaChem":[3,2,1,1],"desc":"Carboxyamidomethylation C unimod:4"}, "[15.99]": {"deltaChem":[0,0,0,1],"desc":"oxidation oxidation unimod:35" }}

Is this correct?
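One way to sanity-check entries like these is to convert each deltaChem vector into a monoisotopic mass shift and compare it with the mass the search engine reports. The sketch below is only an illustration: it assumes the four deltaChem counts are ordered H, C, N, O, which is inferred from the two entries above rather than taken from the moFF documentation, and the helper name delta_mass is hypothetical.

```python
import json

# Monoisotopic masses in Da.
MONO_MASS = {"H": 1.007825, "C": 12.0, "N": 14.003074, "O": 15.994915}

# ASSUMPTION: deltaChem lists atom counts in the order H, C, N, O.
# This is inferred from the two entries, not confirmed by moFF itself.
ASSUMED_ORDER = ("H", "C", "N", "O")

ptm_settings = json.loads("""
{ "C": {"deltaChem":[3,2,1,1],"desc":"Carboxyamidomethylation C unimod:4"},
  "[15.99]": {"deltaChem":[0,0,0,1],"desc":"oxidation oxidation unimod:35" }}
""")


def delta_mass(delta_chem):
    """Return the mass shift (Da) implied by a deltaChem vector (hypothetical helper)."""
    return sum(count * MONO_MASS[element]
               for count, element in zip(delta_chem, ASSUMED_ORDER))


for tag, entry in ptm_settings.items():
    print(tag, round(delta_mass(entry["deltaChem"]), 4))
```

Under that assumed ordering, the first entry gives about 57.0215 Da (carbamidomethyl) and the second about 15.9949 Da, which agrees with the [15.99] notation in the Tide output.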

On another note, is there a reference for how specific the filtering step is to a specific peptide? I've read the update in the Journal of Proteome Research and it looks pretty good. The reason I ask is that I am trying to quantify metaproteome data, where the proteomes can differ considerably between samples, so matching between runs is quite nerve-racking. That being said, I have observed that much of the discrepancy between samples seems to be driven by sample complexity, which means MS2 is not always triggered for every peptide. So it is a catch-22.

Best Regards, Alfredo

Maux82 commented 5 years ago

{ "C": {"deltaChem":[3,2,1,1],"desc":"Carboxyamidomethylation C unimod:4"}, "[15.99]": {"deltaChem":[0,0,0,1],"desc":"oxidation oxidation unimod:35" }}

This should work fine, but the problem is that moFF also expects opening and closing tags, '<' and '>', around each modification. You should add them to the mod_seq information. Alternatively, I can point out the relevant part of the code and you can modify it for your needs.

On another note. Is there a reference for how specific the filtering step is to a specific peptide?

By reference, do you mean other papers? The filtering steps are mainly based on tests that I have done on complex samples (not metaproteomes). Of course, they do not cover all possible cases. For example, the number of isotopic peaks is fixed to 3; if you change it to 2 the filter is less stringent, whereas if you take all the possible isotopic peaks it becomes more stringent. So far the number of isotopic peaks is not a parameter, but we can add it in a future version.
My first thought about metaproteomics and MBR is to match only the unique peptides of each protein, so that fewer peptides are matched, but only those that are a good signature for their protein. In any case, I will be glad if you find moFF useful for quantifying metaproteomics data.
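The idea of restricting matching to unique peptides can be sketched as a pre-filtering step on the identification table before it is passed to MBR. This is a minimal illustration and not part of moFF: the (peptide, protein) pair layout and the function name unique_peptides are assumptions made for the example.

```python
from collections import defaultdict


def unique_peptides(rows):
    """Return the peptides that map to exactly one protein.

    rows: iterable of (peptide, protein) pairs (hypothetical input layout).
    """
    proteins_by_peptide = defaultdict(set)
    for peptide, protein in rows:
        proteins_by_peptide[peptide].add(protein)
    return {pep for pep, prots in proteins_by_peptide.items() if len(prots) == 1}


# Toy example: PEPTIDEK is shared between P1 and P2, so it is dropped.
rows = [
    ("PEPTIDEK", "P1"),
    ("PEPTIDEK", "P2"),
    ("SAMPLER", "P1"),
    ("ANOTHERK", "P3"),
]
print(sorted(unique_peptides(rows)))  # ['ANOTHERK', 'SAMPLER']
```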

jblakele commented 5 years ago

Hmm, I'm a little confused about the need for "<>"; the MaxQuant settings example doesn't have those. Are you saying I can use the ptm_settings described above, but I need to change my modifications to have <>? For example:

[image not preserved]

I've run a test. There are modified peptides that were matched and passed the filter without the <> symbol. Are those matches suspect?

I'm pretty pleased with the flexibility of moFF for quantification. Here's to hoping that this is my final build and a quantification citation for metaproteomics is coming your way soon.

I agree about unique peptides. I include the shared peptides in my intermediate results, but the final result is really driven by the unique peptides.

Maux82 commented 5 years ago

Hmm, I'm a little confused about the need for "<>"; the MaxQuant settings example doesn't have those. Are you saying I can use the ptm_settings described above, but I need to change my modifications to have <>? For example:

[image not preserved]

I've run a test. There are modified peptides that were matched and passed the filter without the <> symbol. Are those matches suspect?

Basically, moFF accepts by default the modification format from SearchGUI/PeptideShaker, which uses < > (e.g. O etc.; see ptm_setting_ps.json) to delimit the modification in the sequence. This is the reason why I suggest using '<' and '>': if they are not present, the modification is not taken into account. In your case, you should try something like and <[15.99]>.
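A small preprocessing step could add the delimiters to Tide-style sequences before they are given to moFF. The sketch below assumes the modification appears as a bracketed mass such as [15.99] directly in the sequence string and wraps it as <[15.99]>. The function name wrap_mods is hypothetical, and whether the key in the PTM settings file then has to be "[15.99]" or "<[15.99]>" is not stated in this thread, so that should be checked against the moFF code.

```python
import re

# Match a bracketed modification such as [15.99] that is not already
# enclosed in < >, so the function can be applied more than once safely.
BRACKET_MOD = re.compile(r"(?<!<)(\[[^\]]+\])(?!>)")


def wrap_mods(sequence):
    """Wrap each bracketed modification in '<' and '>' (hypothetical helper)."""
    return BRACKET_MOD.sub(r"<\1>", sequence)


print(wrap_mods("PEPM[15.99]TIDE"))    # PEPM<[15.99]>TIDE
print(wrap_mods("PEPM<[15.99]>TIDE"))  # unchanged: already delimited
```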

For internal use (in our proteomics facility) I also allow the use of MaxQuant output, which is why ptm_setting_mq.json exists, but in that case there is a hard-coded flag (see https://github.com/compomics/moFF/blob/4af3c2abae4209e59d7d83b4c86305dae0b86a48/moff.py#L908).