Open hsbyeon1 opened 5 months ago
The naming conventions in `residue_names.py` follow the three-letter codes defined by the PDB. The names used by Amber are nonstandard and conflict with the PDB definitions, so we use the PDB definitions in the Chemical Component Dictionary for maximum compatibility with structural definitions.
I confess I'm not familiar with what ASH looks like myself, and I cannot find it in the Chemical Component Dictionary. Is it something like ASP_LFZW?
Generally, if you can define the residue name explicitly in the topology you're providing during loading, then you may be able to load it fine, though I haven't tested this much.
I'm +1 for keeping these internal lists on standard definitions.
If there is a residue that is in the CCD but not in the internal lists (because it was obscure enough) then definitely happy to add it!
My trajectory contains ASH residues. ASH is AMBER's name for protonated ASP, albeit a non-standard one.
I guess `Topology.select('protein')` fails to parse atoms in such residues as protein, since `_AMINO_ACID_CODES` from `mdtraj/core/residue_names.py` lacks `'ASH' : 'D'`, while it does contain non-standard notations such as 'GLH', 'HIH', etc. So I suggest adding `'ASH' : 'D'` if it does not cause any problems.
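To illustrate what I mean, here is a minimal, self-contained sketch of the behaviour I suspect. It does not use MDTraj itself: the table `AMINO_ACID_CODES` and the helpers `is_protein_residue` and `select_protein` are hypothetical stand-ins, and I am assuming that the `protein` keyword is decided by a simple name lookup like this one.

```python
# Hypothetical, trimmed-down residue-code table (NOT MDTraj's actual data).
# It mimics a table that has some non-standard names (GLH, HIH) but no ASH.
AMINO_ACID_CODES = {
    "ASP": "D",
    "GLU": "E",
    "GLH": "E",
    "HIS": "H",
    "HIH": "H",
}


def is_protein_residue(name, codes=AMINO_ACID_CODES):
    """Treat a residue as protein only if its name is in the lookup table."""
    return name.upper() in codes


def select_protein(residue_names, codes=AMINO_ACID_CODES):
    """Return the indices of residues that the lookup table recognises."""
    return [i for i, name in enumerate(residue_names)
            if is_protein_residue(name, codes)]


residues = ["ASP", "ASH", "GLH", "HOH"]

# With the original table, ASH is silently skipped.
before = select_protein(residues)

# With the suggested entry added, ASH is picked up as well.
patched_codes = {**AMINO_ACID_CODES, "ASH": "D"}
after = select_protein(residues, patched_codes)

print(before)  # [0, 2]
print(after)   # [0, 1, 2]
```

If the real lookup works along these lines, the one-entry addition would be enough to make ASH atoms show up in the protein selection.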