vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.

Decoy generation #664

Open Chen-micslab opened 1 year ago

Chen-micslab commented 1 year ago

Hi, I have a question about the decoy generation. The method in your paper is "By default, this is done by replacing the fragment ion m/z values of the target precursor assuming the amino acids adjacent to the peptide termini were mutated (GAVLIFMPWSCTYHKRQEND to LLLVVLLLLTSSSSLLNDQE mutation pattern is used)". I'm sorry, this is a bit difficult for me to understand.

Chen-micslab commented 1 year ago

Hello Vadim, I would like to elaborate on my question in detail. In a recent experiment, I tried some new modifications that are not included in DIA-NN. To distinguish them from conventional peptide sequences, I added the characters U and X to the peptide sequence to represent the modifications (U and X represent non-natural amino acids in DIA-NN, but we only studied the 20 natural amino acids). The precursor m/z of peptide sequences containing U and X in the spectral library was calculated from the masses of these modifications, so it does not match the theoretical masses of U and X built into DIA-NN. We expected that searching the library this way would produce an error, but the search was successful, and there were no issues with decoy generation or FDR evaluation either. To make the library search easier, I also deleted the fragment type column, because I was worried that DIA-NN would automatically calculate the theoretical masses of the fragment ions from the peptide sequences and fragment types, and these would not match the actual masses. We retained only eight columns in the spectral library: peptide sequence, precursor charge, precursor m/z, retention time, ion mobility, fragment ion charge, fragment ion m/z and fragment ion intensity. Here are my questions:

  1. When searching the library, does DIA-NN use the precursor m/z values that we provide in the spectral library, rather than calculating the precursor m/z from the peptide sequence?

  2. How does DIA-NN generate a decoy peptide with the same mass as the target peptide when the theoretical mass of the peptide sequence does not match the precursor m/z in the spectral library?

Thank you!!

vdemichev commented 1 year ago

The conventional way of adding mods is by using --fixed-mod or --var-mod. Please note that you can manipulate libraries in DIA-NN in different ways, like in silico labelling peptides in them or merging different .tsv libraries.

For the purpose of decoy generation, U and X are treated as C and M, respectively, meaning they are mutated to S and L. The delta masses are calculated using the canonical masses of U and X. But whatever you denote with these AAs, even if this makes decoy generation behave unexpectedly, it is unlikely to have any significant effect on the analysis results.

If fragments are properly annotated, DIA-NN does not infer any masses, unless Smart profiling or Full profiling modes are used. So it's always good to have them properly annotated.

Best, Vadim

Chen-micslab commented 1 year ago

Thanks for your reply. I still don't understand the decoy generation method in your paper: "By default, this is done by replacing the fragment ion m/z values of the target precursor assuming the amino acids adjacent to the peptide termini were mutated (GAVLIFMPWSCTYHKRQEND to LLLVVLLLLTSSSSLLNDQE mutation pattern is used)". The mass of GAVLIFMPWSCTYHKRQEND is not equal to that of LLLVVLLLLTSSSSLLNDQE. Shouldn't the masses of the target and the decoy be the same?

vdemichev commented 1 year ago

Well, it works well like this.

Chen-micslab commented 1 year ago

But I tried to manually add some decoy peptides to the spectrum library. If the precursor m/z of these decoy peptides is not the same as that of some target peptides, DIA-NN will report an error indicating that the decoy peptide needs to match the m/z of the target.

vdemichev commented 1 year ago

Yes, precursor m/z should be the same.

Chen-micslab commented 1 year ago

If the target peptide is mutated following the GAVLIFMPWSCTYHKRQEND to LLLVVLLLLTSSSSLLNDQE pattern, the precursor m/z of the decoy is not the same as that of the target. This is where I'm confused.

vdemichev commented 1 year ago

Only fragment masses are adjusted.
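To illustrate the point above, here is a minimal Python sketch of how a single residue substitution from the quoted mutation pattern would shift the m/z of a fragment that contains that residue, while the precursor m/z in the library is left untouched. This is not DIA-NN's actual code: the function names `mutation_delta` and `shift_fragment_mz` are hypothetical, and the residue masses are standard monoisotopic values typed in by hand.

```python
# Mutation pattern quoted from the DIA-NN paper: each residue in TARGET
# maps to the residue at the same position in DECOY.
TARGET = "GAVLIFMPWSCTYHKRQEND"
DECOY = "LLLVVLLLLTSSSSLLNDQE"
MUTATION = dict(zip(TARGET, DECOY))

# Standard monoisotopic residue masses in Da (assumed values, not read from DIA-NN).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def mutation_delta(residue: str) -> float:
    """Mass change (Da) when `residue` is replaced according to the pattern."""
    return MONO[MUTATION[residue]] - MONO[residue]


def shift_fragment_mz(fragment_mz: float, charge: int, residue: str) -> float:
    """m/z of a fragment after the residue it contains has been mutated.

    Hypothetical helper: the mass delta is divided by the fragment charge.
    The precursor m/z is deliberately not recomputed here.
    """
    return fragment_mz + mutation_delta(residue) / charge


if __name__ == "__main__":
    # E -> D removes one CH2 group, about -14.016 Da.
    print(round(mutation_delta("E"), 5))
    print(round(shift_fragment_mz(500.0, 2, "E"), 5))
```

Because the deltas are nonzero, a decoy built this way would not have the same theoretical mass as its target if the precursor were recalculated, which is consistent with the answer that only the fragment masses change.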

Chen-micslab commented 1 year ago

Thank you very much. Another question: in your paper you write "replacing the fragment ion m/z values of the target precursor assuming the amino acids adjacent to the peptide termini were mutated". Are only the AAs at the peptide termini mutated, or are all the AAs mutated?

vdemichev commented 1 year ago

"the amino acids adjacent to the peptide termini were mutated"

vdemichev commented 1 year ago

That is AAs next to peptide termini.

Chen-micslab commented 1 year ago

May I ask how many amino acids adjacent to the termini have been mutated?>_<

vdemichev commented 1 year ago

In the peptide PEPTIDE, the 2nd residue (E) and the 6th residue (D) are mutated, nothing else.
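The rule described in this reply can be sketched in a few lines of Python. This is only an illustration of the description in the thread, not DIA-NN's implementation: the function name `mutate_termini_adjacent` is hypothetical, and leaving peptides shorter than three residues, or residues outside the 20-letter pattern, unchanged is an assumption made for this sketch.

```python
# Mutation pattern quoted from the DIA-NN paper.
TARGET = "GAVLIFMPWSCTYHKRQEND"
DECOY = "LLLVVLLLLTSSSSLLNDQE"
MUTATION = dict(zip(TARGET, DECOY))


def mutate_termini_adjacent(peptide: str) -> str:
    """Mutate only the residues next to the N- and C-terminal residues.

    For a peptide of length n these are positions 2 and n-1 (1-based).
    Assumption: peptides shorter than 3 residues are returned unchanged,
    and residues missing from the pattern are left as they are.
    """
    if len(peptide) < 3:
        return peptide
    residues = list(peptide)
    # A set avoids mutating the same residue twice when n == 3.
    for i in {1, len(residues) - 2}:
        residues[i] = MUTATION.get(residues[i], residues[i])
    return "".join(residues)


if __name__ == "__main__":
    # Position 2 (E -> D) and position 6 (D -> E) change, nothing else.
    print(mutate_termini_adjacent("PEPTIDE"))  # PDPTIEE
```

Note that, per the earlier replies, this mutated sequence is only used to derive the decoy fragment masses; the decoy keeps the precursor m/z of the target.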

Chen-micslab commented 1 year ago

Thank you for your reply, it was very helpful!