Closed JinyuanSun closed 1 year ago
The suffixes refer to different DNA sequences encoding the same protein sequence, all included in the same experiment. I'm not sure about the duplicates of the ".pdb" (no suffix) line; they are possibly from different experiments. @KotaroTsuboyama can clarify and also point you to the DNA that goes with each suffix.
"1GYZ.pdb" represents the WT amino acid sequence. In this case, the amino acid sequence was measured in multiple libraries. Suffixes like _wtm and _wte represent the same amino acid sequence with different DNA sequences, as Gabe pointed out. For reverse translation we use a codon table optimized for E. coli (because we use an E. coli-based translation system), but to get different DNA sequences encoding the same amino acids, we intentionally used different codon tables (m is mouse, h is human, y is yeast, and e is a different version of the E. coli table). I hope that makes sense. If you have further questions, please let us know!
In the Processed_K50_dG_datasets/K50_dG_Dataset1_Dataset2.csv, the wildtype can be found with multiple measured ddG values, and has suffixes like _wtm, _wte, ... What do these suffixes mean? Are they just different runs of the same experiment? For example:

Also, if I'd like to calculate ddG = dG_mut - dG_wild, can I take the average of the different deltaG values of the same sequence?
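
For anyone landing here with the same question, below is a minimal sketch of the calculation being asked about: pool every wildtype row (the plain ".pdb" name plus the _wt-suffixed variants), average their dG, and subtract that from each mutant's dG. It is only an illustration under stated assumptions. The column names "name" and "dG", the mutant name "1GYZ.pdb_A10G", the numbers, and the suffix pattern are all hypothetical, not taken from the actual CSV, and whether averaging is the right way to combine the wildtype measurements is not confirmed in this thread.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical rows: column names and all values are made up for illustration.
rows = [
    {"name": "1GYZ.pdb", "dG": 2.0},
    {"name": "1GYZ.pdb_wtm", "dG": 2.5},
    {"name": "1GYZ.pdb_wte", "dG": 1.5},
    {"name": "1GYZ.pdb_A10G", "dG": 1.0},  # hypothetical mutant name
]

# Assumed pattern for wildtype suffixes such as _wtm, _wth, _wty, _wte.
WT_SUFFIX = re.compile(r"_wt[a-z]$")


def wt_key(name):
    """Return the base structure name if the row is wildtype, else None."""
    if name.endswith(".pdb"):
        return name
    if WT_SUFFIX.search(name):
        return WT_SUFFIX.sub("", name)
    return None


def mean_wt_dg(rows):
    """Average dG over all wildtype rows that share a base name."""
    groups = defaultdict(list)
    for row in rows:
        key = wt_key(row["name"])
        if key is not None:
            groups[key].append(row["dG"])
    return {key: mean(values) for key, values in groups.items()}


def ddg(rows):
    """Compute ddG = dG_mut - mean(dG_wild) for every non-wildtype row."""
    wt = mean_wt_dg(rows)
    result = {}
    for row in rows:
        if wt_key(row["name"]) is None:
            # Assumes mutant names look like "<base>_<mutation>".
            base = row["name"].rsplit("_", 1)[0]
            result[row["name"]] = row["dG"] - wt[base]
    return result


print(mean_wt_dg(rows))
print(ddg(rows))
```

To run this against the real file, the rows could be loaded with csv.DictReader, with the dG column converted to float and the column names adjusted to whatever the CSV actually uses.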