Open singjc opened 2 days ago
Hi Justin. If you use the provided ProForma parsers, you will end up with a mass modification of the specified mass. Changing this mass modification to a database modification (Oxidation from Unimod) is something I have internal code for (it is also used when reading identified peptide files), so I could make that function public. Doing it this way will have only a very minor effect on the peptide, though, as the given mass is close enough to the monoisotopic mass that most functions will already work fine.
If you are interested, here is the current function. I am thinking of adding the tolerance as a parameter and then making it public.
/// Look at the provided modifications and see if they match any modification on this peptide with
/// more information and replace those. Replaces any mass modification within 0.1 Da or any precise
/// matching formula with the provided modifications.
pub(crate) fn inject_modifications(&mut self, modifications: &[SimpleModification]) {}
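To illustrate the idea behind the function above, here is a minimal standalone sketch of the mass-matching half only (it does not cover formula matching). It does not use the rustyms types: `NamedModification`, `ParsedModification`, and `inject_named` are hypothetical names made up for this example, and the tolerance is passed in as a plain absolute value in Da.

```rust
/// A named modification with its monoisotopic mass shift in Da.
/// Hypothetical stand-in for a database entry, not the rustyms type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NamedModification {
    pub name: &'static str,
    pub monoisotopic_mass: f64,
}

/// A modification as it might look after parsing: either a bare mass
/// shift or an already named database entry.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ParsedModification {
    Mass(f64),
    Named(NamedModification),
}

/// Replace every bare mass shift that lies within `tolerance_da` of a
/// candidate by the closest such candidate. Other entries are untouched.
pub fn inject_named(
    modifications: &mut [ParsedModification],
    candidates: &[NamedModification],
    tolerance_da: f64,
) {
    for modification in modifications.iter_mut() {
        if let ParsedModification::Mass(mass) = *modification {
            let closest = candidates
                .iter()
                .map(|c| (c, (c.monoisotopic_mass - mass).abs()))
                .filter(|(_, delta)| *delta <= tolerance_da)
                .min_by(|a, b| a.1.total_cmp(&b.1));
            if let Some((candidate, _)) = closest {
                *modification = ParsedModification::Named(*candidate);
            }
        }
    }
}
```

With Oxidation (15.994915 Da) and Phospho (79.966331 Da) as candidates and a tolerance of 0.1 Da, a parsed shift of 15.9949 would become the named Oxidation entry, while an unrelated shift such as 42.5 would stay a bare mass.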
Hi Douwe,
Thank you for the info, the LinearPeptide ProForma parser works great!
I think it would be great if you add a tolerance and make the inject_modifications method public, that would be useful!
I played around with the LinearPeptide parser to get the desired output I want. Not sure if the code is optimal or could be written better, but it seems to work.
/// Find a Unimod modification among the search matches whose placement rules
/// allow the modified amino acid, and format it as "name@residue".
fn get_modification_name_site(
    mod_search: ModificationSearchResult,
    modified_aa: AminoAcid,
) -> Option<String> {
    match mod_search {
        ModificationSearchResult::Mass(_, _, matches) => {
            matches
                .into_iter()
                .find_map(
                    |(ontology, _, _, modification)| match (ontology, modification) {
                        (
                            Ontology::Unimod,
                            Modification::Predefined(_, specificities, _, psims_name, _),
                        ) => specificities.iter().find_map(|specificity| {
                            if let PlacementRule::AminoAcid(amino_acids, _) = specificity {
                                if amino_acids.contains(&modified_aa) {
                                    Some(format!("{}@{}", psims_name, modified_aa.char()))
                                } else {
                                    None
                                }
                            } else {
                                None
                            }
                        }),
                        _ => None,
                    },
                )
        }
        _ => None,
    }
}
/// Build a ";"-separated list of "name@residue" entries for the peptide.
/// Only the first modification on each residue is looked up.
fn get_all_modifications(peptide: &LinearPeptide, tol_ppm: f64) -> String {
    let tol = Tolerance::new_ppm(tol_ppm);
    peptide
        .sequence
        .iter()
        .filter_map(|sequence_element| {
            sequence_element.modifications.first().and_then(|mod_mass| {
                let modified_aa = sequence_element.aminoacid;
                let mod_search = Modification::search(mod_mass, tol);
                get_modification_name_site(mod_search, modified_aa)
            })
        })
        .collect::<Vec<String>>()
        .join(";")
}
Usage:
let peptide_str = "MSFNELT[79.9663]ESNKKSLM[+15.9949]E";
let peptide = LinearPeptide::pro_forma(peptide_str).unwrap();
let result = get_all_modifications(&peptide, 1.0);
println!(
    "Found the following modifications {} for {}",
    result, peptide_str
);
Output:
Found the following modifications Phospho@T;Oxidation@M for MSFNELT[79.9663]ESNKKSLM[+15.9949]E
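As a side note on the 1.0 ppm tolerance used in this call: the Unimod monoisotopic masses are 15.994915 Da for Oxidation and 79.966331 Da for Phospho, so the four-decimal values in the peptide string are off by roughly 0.94 ppm and 0.39 ppm respectively. Both fit, but Oxidation only barely. The sketch below shows that arithmetic with two hypothetical helper functions; it assumes the deviation is taken relative to the theoretical mass, which may not be exactly how rustyms defines its ppm tolerance.

```rust
/// Relative deviation of `observed` from `theoretical`, in parts per million.
/// Assumption: the error is expressed relative to the theoretical mass.
pub fn ppm_error(observed: f64, theoretical: f64) -> f64 {
    (observed - theoretical) / theoretical * 1e6
}

/// True when `observed` lies within `tolerance_ppm` of `theoretical`.
pub fn within_ppm(observed: f64, theoretical: f64, tolerance_ppm: f64) -> bool {
    ppm_error(observed, theoretical).abs() <= tolerance_ppm
}
```

A shift written with fewer decimals, for example 15.995, would be about 5.3 ppm away from Oxidation and would need a wider tolerance to match.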
Hi,
I am interested in using the rustyms library to extract information from modified peptide strings. If I have the following modified peptide sequence (with a mass shift): "KDM[+15.9949]YGLQAEME", would it be possible to read this string and then derive the type of modification and the amino acid, "Oxidation@M"?
Best,
Justin
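For readers who only want to see what the question amounts to, here is a rough standard-library-only sketch of the first step: pulling (residue, mass shift) pairs out of such a string. The function name `mass_shifts` is made up for this example, and the parsing is deliberately simplified (it ignores terminal, labile, and named modifications), so it is not a replacement for the rustyms ProForma parser.

```rust
/// Collect `(residue, mass_shift)` pairs from a peptide string that uses
/// bracketed numeric mass shifts such as `M[+15.9949]`.
/// Simplified on purpose: brackets whose content is not a number are
/// skipped, and a bracket before any residue is ignored.
pub fn mass_shifts(peptide: &str) -> Vec<(char, f64)> {
    let mut found = Vec::new();
    let mut last_residue = None;
    let mut chars = peptide.chars();
    while let Some(c) = chars.next() {
        if c == '[' {
            // Read everything up to (and consume) the closing bracket.
            let inner: String = chars.by_ref().take_while(|&ch| ch != ']').collect();
            if let (Some(residue), Ok(mass)) = (last_residue, inner.parse::<f64>()) {
                found.push((residue, mass));
            }
        } else if c.is_ascii_uppercase() {
            // Remember the residue so a following bracket can be attached to it.
            last_residue = Some(c);
        }
    }
    found
}
```

For "KDM[+15.9949]YGLQAEME" this yields a single pair, 'M' with 15.9949; mapping that mass to a name such as Oxidation is the database lookup discussed in the replies.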