Closed: bienchen closed this issue 2 years ago
We do something similar in python-ihm for cross-linkers, which are often not in CCD either: the ihm.ChemDescriptor class. I think inheriting from ihm.ChemComp is potentially problematic, as the ChemComp class hierarchy is already used to distinguish DNA/RNA/L-peptide/D-peptide/water/non-polymer. So you'd have to add a mixin class or assume that such custom components are always a single type, e.g. non-polymers.
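To illustrate the mixin route mentioned above, here is a minimal sketch. It does not use the real python-ihm classes: ChemComp, NonPolymerChemComp and LPeptideChemComp below are simplified stand-ins, and LocalMixin, the Local* subclasses and the "LIG1" id are hypothetical names invented for this example.

```python
# Simplified stand-ins for the python-ihm hierarchy (NOT the real classes).
class ChemComp:
    type = "other"

    def __init__(self, id, name=None):
        self.id = id
        self.name = name


class NonPolymerChemComp(ChemComp):
    type = "non-polymer"


class LPeptideChemComp(ChemComp):
    type = "L-peptide linking"


class LocalMixin:
    """Hypothetical mixin that marks a component as locally defined."""
    ma_provenance = "CCD local"


# One extra subclass is needed per component type that may be local.
class LocalNonPolymerChemComp(LocalMixin, NonPolymerChemComp):
    pass


class LocalLPeptideChemComp(LocalMixin, LPeptideChemComp):
    pass


lig = LocalNonPolymerChemComp("LIG1", name="Delamanid")  # "LIG1" is made up
print(lig.type, lig.ma_provenance)
```

The cost is visible in the sketch: every existing component type needs its own "local" subclass, which is the duplication the descriptors-based alternative avoids.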
A solution that avoids that problem would be to just have ChemComp take an extra descriptors argument, a list of Descriptor objects. python-ihm wouldn't define any (since the dictionary doesn't support that) but python-modelcif could. Then you'd just write out chem_comp.ma_provenance "CCD local" if descriptors is non-empty. (This would also allow multiple descriptors, e.g. InChI plus SMILES.) Something like:
    class ChemComp(object):
        def __init__(self, id, code, code_canonical, name=None, formula=None, descriptors=None):
            ...

    class Descriptor(object):
        pass

    class InChIKeyDescriptor(Descriptor):
        type = "InChI Key"
        def __init__(self, value, details=None, software=None):
            ...

    delamanid = ihm.NonPolymerChemComp(
        id=..., name="Delamanid",
        descriptors=[modelcif.InChIKeyDescriptor(value="XDAOLTSRNUSPPH-XMMPIXPASA-N")])
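For readers who want something they can run, the proposal can be sketched without importing ihm or modelcif at all. Everything below is a self-contained illustration under assumptions: the classes are simplified stand-ins with a reduced signature, the ma_provenance property and the "LIG1" id are hypothetical, and returning None for components without descriptors is just a placeholder for "not local, decided elsewhere".

```python
class Descriptor:
    """Base class for chemical descriptors (stand-in, not the real API)."""
    type = None

    def __init__(self, value, details=None, software=None):
        self.value = value
        self.details = details
        self.software = software


class InChIKeyDescriptor(Descriptor):
    type = "InChI Key"


class ChemComp:
    """Stand-in component with a reduced, hypothetical signature."""

    def __init__(self, id, name=None, descriptors=None):
        self.id = id
        self.name = name
        # Copy into a fresh list so callers cannot share mutable state.
        self.descriptors = list(descriptors or [])

    @property
    def ma_provenance(self):
        # Derived, never set by hand: a component is "CCD local" exactly
        # when it carries at least one descriptor.
        return "CCD local" if self.descriptors else None


delamanid = ChemComp(
    "LIG1",  # made-up id for this example
    name="Delamanid",
    descriptors=[InChIKeyDescriptor("XDAOLTSRNUSPPH-XMMPIXPASA-N")],
)
water = ChemComp("HOH", name="water")
print(delamanid.ma_provenance, water.ma_provenance)
```

Because the provenance is computed from the descriptors list rather than stored separately, a "CCD local" component without descriptors cannot be constructed in this sketch.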
Looks like a good idea to me. It simply keeps all of the unknown ligand's info nicely together. If we manage to find out what to do about PDB format CONECT records, I guess they could be handled in a similar way.
I tested the new feature and it works as expected. Compounds get marked as "local" and are annotated with their list of descriptors. Thanks a lot.
Hello,
in ModelArchive (MA) we may see novel compounds/ligands in the future. Those compounds are not necessarily stored in the chemical components dictionary (CCD) of wwPDB. Some of them may be so artificial that they cannot be considered for storage in the wwPDB CCD. To still make novel compounds available in MA, two approaches should be established: let MA have its own CCD, and let ModelCIF files define their own compounds locally, if needed.
That means in the future we should have three kinds of sources for chemical components in ModelCIF files: wwPDB CCD, MA CCD, locally defined. This is facilitated by a new item in the _chem_comp category, _chem_comp.ma_provenance. In case ma_provenance is "CCD local", a new data category, ma_chem_comp_descriptor, must be populated with data and linked back to _chem_comp via _ma_chem_comp_descriptor.chem_comp_id.
Having that scheme to introduce our own compounds to ModelCIF files is a feature we need available for ModelArchive.
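As a rough picture of what the populated category might look like in a file, the sketch below formats descriptor rows as an mmCIF loop. Only _ma_chem_comp_descriptor.chem_comp_id is named in this issue; the ordinal_id, type and value item names are assumptions here and should be checked against the ModelCIF dictionary. The function name, the "LIG1" id and the quoting helper are hypothetical, and the quoting is deliberately minimal.

```python
def format_descriptor_loop(rows):
    """Format (chem_comp_id, type, value) tuples as an mmCIF loop.

    Item names other than chem_comp_id are assumptions, not taken
    from the issue text.
    """
    def quote(text):
        # Minimal quoting: wrap values that contain whitespace in single
        # quotes. Real mmCIF quoting has more cases than this.
        return "'%s'" % text if any(c.isspace() for c in text) else text

    lines = [
        "loop_",
        "_ma_chem_comp_descriptor.ordinal_id",
        "_ma_chem_comp_descriptor.chem_comp_id",
        "_ma_chem_comp_descriptor.type",
        "_ma_chem_comp_descriptor.value",
    ]
    for ordinal, (comp_id, dtype, value) in enumerate(rows, start=1):
        lines.append(" ".join(
            [str(ordinal), quote(comp_id), quote(dtype), quote(value)]))
    return "\n".join(lines)


# "LIG1" is a made-up component id for this example.
loop_text = format_descriptor_loop(
    [("LIG1", "InChI Key", "XDAOLTSRNUSPPH-XMMPIXPASA-N")])
print(loop_text)
```

Each row's chem_comp_id is what ties the descriptor back to the matching _chem_comp entry.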
After looking into the code a bit, I think having classes that inherit from ihm.ChemComp for ma_provenance "CCD local" and "CCD MA" might be an idea. Then, for the ModelCIF file, _chem_comp.ma_provenance could be set depending on the class of the compound or the availability of a certain attribute. Adding _ma_chem_comp_descriptor does not seem complicated, but is there a way to make a "CCD local" compound enforce having _ma_chem_comp_descriptor?
Thanks,
B