Open francisacquah466 opened 1 year ago
Hey there! Sorry for the late reply. Right now the implementation of the SMILES generator is fairly simple (https://github.com/jspaezp/Peptides/blob/b0aab3765f99a0c4c79dddfecdd12d3ff71c9a20/R/smilesStrings.R), and I think it could be extended to 3-letter amino acid codes. But since the 3-letter abbreviation is not supported in any other part of the package (that I can recall), adding it only here would feel inconsistent ...
Maybe something like this would work for you (I have not tested it, but I think it should work):
three_letter_aaSMILES <- function(seq) {
  # Lookup table: 3-letter code -> SMILES of the free amino acid
  aminoacid_smiles <- c(
    "Ala" = "N[C@@]([H])(C)C(=O)O",
    # ... All other amino-acids added here
    "Val" = "N[C@@]([H])(C(C)C)C(=O)O")
  # Split each sequence into consecutive 3-letter codes, e.g. "AlaVal" -> c("Ala", "Val")
  split_sequences <- regmatches(seq, gregexpr(".{3}", seq))
  smiles_aa_sequences <- lapply(split_sequences, function(x) aminoacid_smiles[x])
  # This trims the last O in the -OH of the carboxyl group of each amino acid
  concat_aa_smiles <- lapply(
    smiles_aa_sequences,
    function(x) paste(gsub("O$", "", x), collapse = ""))
  # Add the C-terminal O back once at the end of each peptide
  concat_aa_smiles <- lapply(concat_aa_smiles, function(x) paste0(x, "O"))
  concat_aa_smiles <- unlist(concat_aa_smiles)
  return(concat_aa_smiles)
}
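For non-natural residues, whose codes may not all be exactly three letters, it might be easier to let the caller supply the lookup table and separate residues with a delimiter. Here is a rough sketch of that idea. Everything in it is an assumption for illustration: the helper name `custom_aaSMILES`, the `residue_smiles` and `sep` arguments, the hyphen-delimited input format, and the `Aib` (2-aminoisobutyric acid) entry are not part of the package, and I have not compared the output against `aaSMILES`.

```r
# Hypothetical helper, not part of the Peptides package.
# `residue_smiles` is a named character vector: residue code -> SMILES of the free amino acid.
custom_aaSMILES <- function(seq, residue_smiles, sep = "-") {
  vapply(seq, function(one_seq) {
    # Split e.g. "Ala-Aib-Val" into c("Ala", "Aib", "Val")
    codes <- strsplit(one_seq, sep, fixed = TRUE)[[1]]
    # Fail early on codes missing from the table instead of silently producing NA
    unknown <- setdiff(codes, names(residue_smiles))
    if (length(unknown) > 0) {
      stop("Unknown residue code(s): ", paste(unknown, collapse = ", "))
    }
    # Drop the terminal O of each -C(=O)O so residues join through a peptide bond,
    # then restore the C-terminal hydroxyl once at the end
    trimmed <- sub("O$", "", residue_smiles[codes])
    paste0(paste(trimmed, collapse = ""), "O")
  }, character(1), USE.NAMES = FALSE)
}

residue_smiles <- c(
  "Ala" = "N[C@@]([H])(C)C(=O)O",
  "Val" = "N[C@@]([H])(C(C)C)C(=O)O",
  # Illustrative non-natural entry (2-aminoisobutyric acid); an assumption, not from the package
  "Aib" = "NC(C)(C)C(=O)O"
)

custom_aaSMILES(c("Ala-Aib-Val", "Val-Ala"), residue_smiles)
```

Passing the table as an argument keeps the package's own one-letter interface untouched, and the explicit check for unknown codes avoids `NA` fragments ending up inside a SMILES string.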
Hi @dosorio
Thanks for such a wonderful package.
I'm working on generating a lot of peptides, mostly with non-natural amino acids. I was wondering whether the list of amino acids could be expanded to include new amino acids and their SMILES, so that peptides containing non-natural residues can be passed to the aaSMILES function to generate SMILES for them. I envisage situations where a one-letter amino acid name would be problematic. Is there a way this could be added, maybe by using the 3-letter amino acid code rather than the 1-letter code?
It would help a lot!
Thanks!