Open joaomcteixeira opened 1 year ago
I thought the CCD was different from PDB ID? However I read that PDB IDs will still be affected in the future: https://www.wwpdb.org/news/news?year=2021#607760112786e73a79c76f9d
Are we thinking of making updates to `libpdb`? We would need an example of a culled list of 5 char PDB IDs to test out `pdbdl` and our `Structure` class. Thoughts?
PRD_999999 1 1 THR THR N Y
PRD_999999 1 2 X2AVD VAL N Y
PRD_999999 1 3 PRO PRO N Y
PRD_999999 1 4 SAR GLY N Y
PRD_999999 1 5 MVA VAL N Y
PRD_999999 1 6 PXZ ? N Y
PRD_999999 1 7 THR THR N Y
PRD_999999 1 8 X2AVD VAL N Y
PRD_999999 1 9 PRO PRO N Y
PRD_999999 1 10 SAR GLY N Y
PRD_999999 1 11 MVA VAL N Y
Aha, I see: in this example of a 5-character CCD ID, what was `DVA` is now `X2AVD`. I was looking at our `libstructure` and `libcif` on line 333, and currently we organize our columns for mmCIF using `_atom_site.XYZ` (for models in wwPDB/extended-wwPDB-identifier-examples/tree/main), so it should still be okay?
Yes, I did some testing and we still recognise these residues and process these new mmCIF files no problem; we're just having trouble converting them to their one-letter code constituents... but then again, I was thinking that for phosphorylation PTMs at least we could use a lower-case sequence (e.g. phospho-serine could be `s` instead of `S`), since phosphorylation is a very common PTM and I need it as a one-letter code to perform sequence electrostatic potential analysis (in my new functions 😉)
Edit: the lower-case idea isn't very good; I'm considering just adding flags where the user specifies which residues are phosphorylated (and other PTMs as we move forward).
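To make the flag idea concrete, here is a minimal sketch of what user-specified phosphorylation flags could look like. The function name `flag_phosphosites`, the 1-based position convention, and the restriction to S/T/Y are all assumptions for illustration; none of this is existing IDPConformerGenerator code.

```python
# Hypothetical helper: keeps the one-letter sequence untouched and stores
# PTM information in a parallel list of flags instead of lower-case letters.
PHOSPHORYLATABLE = {"S", "T", "Y"}  # assumption: only Ser/Thr/Tyr are allowed


def flag_phosphosites(sequence, phospho_positions):
    """Return one boolean per residue, True where the user marked a phosphosite.

    `phospho_positions` are 1-based residue numbers supplied by the user.
    """
    flags = [False] * len(sequence)
    for pos in phospho_positions:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        residue = sequence[pos - 1]
        if residue not in PHOSPHORYLATABLE:
            raise ValueError(f"residue {residue}{pos} cannot be flagged here")
        flags[pos - 1] = True
    return flags


flags = flag_phosphosites("MSTAY", [2, 5])
```

Keeping the flags separate from the sequence means the sequence stays valid for everything that expects upper-case one-letter codes, and other PTMs could later get their own flag lists.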
Good to hear. Good luck coming up with 1-letter PTM codes. Happy to brainstorm if you want.... Thanks. Julie
> Yes I did some testing and we still recognise these residues and process these new mmCIF files no problem,
Yes, the parser looks for spaces.
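As a rough illustration of why whitespace splitting is indifferent to code length, here is a small sketch using rows from the example earlier in this thread. It is not the actual `libcif` parser, and the assumption that the fourth field is the residue code and the fifth is its parent is based only on how those rows look.

```python
# Rows copied from the example above; field meanings are assumed, not confirmed.
rows = """\
PRD_999999 1 1 THR THR N Y
PRD_999999 1 2 X2AVD VAL N Y
PRD_999999 1 6 PXZ ? N Y
"""


def split_row(line):
    """Split one whitespace-delimited data row into its fields."""
    return line.split()


parsed = [split_row(line) for line in rows.splitlines()]

# The residue code comes out intact whether it has 3 or 5 characters.
mon_ids = [fields[3] for fields in parsed]
parent_ids = [fields[4] for fields in parsed]
```

Note that a plain `str.split()` would not cope with quoted mmCIF values that contain spaces, so this only covers simple rows like the ones shown.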
For the new `pdb_00001abc` codes we likely need to update the code below so that IDPCG recognises these new codes in the culled list or in general.
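Independently of the existing code, one possible way to recognise both ID styles is sketched here. It assumes that a culled-list entry is an ID immediately followed by an optional chain identifier, that legacy IDs are 4 characters starting with a digit, and that extended IDs are `pdb_` plus 8 alphanumeric characters as in `pdb_00001abc`. The name `split_id_and_chain` is hypothetical.

```python
import re

# Extended form: "pdb_" + 8 alphanumerics, then an optional chain identifier.
EXTENDED = re.compile(
    r"^(?P<pdbid>pdb_[0-9A-Za-z]{8})(?P<chain>[0-9A-Za-z]*)$",
    re.IGNORECASE,
)
# Legacy form: a digit + 3 alphanumerics, then an optional chain identifier.
LEGACY = re.compile(r"^(?P<pdbid>[0-9][0-9A-Za-z]{3})(?P<chain>[0-9A-Za-z]*)$")


def split_id_and_chain(entry):
    """Split a culled-list style entry into (pdbid, chain)."""
    entry = entry.strip()
    # Try the extended pattern first: its "pdb_" prefix is unambiguous.
    match = EXTENDED.match(entry) or LEGACY.match(entry)
    if match is None:
        raise ValueError(f"unrecognised PDB ID entry: {entry!r}")
    return match.group("pdbid"), match.group("chain")


extended = split_id_and_chain("pdb_00001abcA")
legacy = split_id_and_chain("12E8H")
```

Anchoring on the `pdb_` prefix avoids guessing the ID length from the entry length, which would break once chain identifiers are appended.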
Another point of possible error is the DSSP calculation. I don't remember how the PDBIDs are handled there.
To-do list:
- `libpdb`
- `aa3to1` can be extended if needed.
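One way the `aa3to1` extension could go is to fall back on the parent residue when a code is not a standard amino acid, as in this sketch. The dictionary `AA3TO1`, the function `three_to_one`, and the `"X"` placeholder for unknowns are hypothetical stand-ins, not the current IDPCG implementation.

```python
# Standard 20 amino acids; a stand-in for whatever the real mapping holds.
AA3TO1 = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def three_to_one(comp_id, parent_id=None, unknown="X"):
    """Map a residue code of any length to a one-letter code.

    Falls back to the parent residue (e.g. VAL for X2AVD) and then to
    `unknown` when neither code is a standard amino acid.
    """
    if comp_id in AA3TO1:
        return AA3TO1[comp_id]
    if parent_id in AA3TO1:
        return AA3TO1[parent_id]
    return unknown


# (residue code, parent code) pairs taken from the example rows in this thread.
pairs = [("THR", "THR"), ("X2AVD", "VAL"), ("SAR", "GLY"), ("PXZ", "?")]
one_letter = "".join(three_to_one(comp, parent) for comp, parent in pairs)
```

Because the lookup is keyed on the whole string, 5-character codes need no special handling; the open question is only where the parent information comes from.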
This affects us:
https://www.wwpdb.org/news/news?year=2023#63ff72ccc031758bf1c30ff7