ihmwg / python-ihm

Python package for handling IHM mmCIF and BinaryCIF files
MIT License

Support for chemical component dictionary #132

Open brindakv opened 6 months ago

brindakv commented 6 months ago

Consider adding support for checking chemical component IDs and their nomenclature against the wwPDB Chemical Component Dictionary (CCD).

As more depositions include ligands and branched chain entities, it would be useful for python-ihm to support the definitions in the CCD.

benmwebb commented 6 months ago

@brindakv Do you mean to add a function which, given an ID like GLY, will query the CCD API to fill in formula, weight, full name, etc.? Does Ligand Expo provide a suitable API?

brindakv commented 6 months ago

@benmwebb Yes, that would be one of the requirements.

In addition to formula weight, full name, etc. (chem_comp category), we also need a mechanism to check the atom nomenclature (if the model is at atomic scale) and make sure that it is consistent with the CCD (chem_comp_atom category). This was not an issue until recently, since the small molecules always came from an existing starting model in the PDB.

I agree that a Ligand Expo API would be very useful. I will find out whether one is available and post an update.
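As a rough illustration of the atom nomenclature check described above, the comparison itself could be as simple as a set difference between the atom names in the model and those listed for the component in chem_comp_atom. This is only a sketch: `check_atom_names` is a hypothetical helper, not an existing python-ihm function, and it assumes the CCD atom names have already been obtained somehow.

```python
def check_atom_names(model_atom_names, ccd_atom_names):
    """Compare atom names used in a model against the CCD definition.

    Hypothetical helper, not part of python-ihm.
    Returns (unknown, missing): names in the model that the CCD does not
    define, and CCD names that the model does not contain, both sorted.
    """
    model = set(model_atom_names)
    ccd = set(ccd_atom_names)
    unknown = sorted(model - ccd)
    missing = sorted(ccd - model)
    return unknown, missing


# Heavy-atom names for GLY as listed in the CCD (hydrogens omitted here)
gly_ccd = ["N", "CA", "C", "O", "OXT"]

# A model that misnames the carbonyl oxygen as O1
unknown, missing = check_atom_names(["N", "CA", "C", "O1"], gly_ccd)
print(unknown)  # names used in the model but not defined by the CCD
print(missing)  # CCD names absent from the model
```

Missing names are not necessarily errors (OXT and hydrogens are often legitimately absent), so a real check would probably only treat the unknown names as inconsistencies.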

brindakv commented 6 months ago

@benmwebb Here's an update on using RCSB.org APIs to get chemical component information.

Search API example: https://search.rcsb.org/query-editor.html?json=%7B%22query%22%3A%7B%22type%22%3A%22terminal%22%2C%22label%22%3A%22text_chem%22%2C%22service%22%3A%22text_chem%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_chem_comp_descriptor.InChIKey%22%2C%22operator%22%3A%22exact_match%22%2C%22negation%22%3Afalse%2C%22value%22%3A%22OKPCXGMPQJNPGA-HGMAEFONSA-L%22%7D%7D%2C%22return_type%22%3A%22mol_definition%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22score%22%2C%22direction%22%3A%22desc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%7D

Data API example: https://data.rcsb.org/graphql/index.html?query=%7B%0A%20%20chem_comp(comp_id%3A%20%22522%22)%20%7B%0A%20%20%20%20pdbx_chem_comp_descriptor%20%7B%0A%20%20%20%20%20%20type%0A%20%20%20%20%20%20descriptor%0A%20%20%20%20%20%20program%0A%20%20%20%20%7D%0A%20%20%20%20pdbx_chem_comp_identifier%20%7B%0A%20%20%20%20%20%20type%0A%20%20%20%20%20%20identifier%0A%20%20%20%20%20%20program%0A%20%20%20%20%7D%0A%20%20%7D%0A%7D%0A
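For reference, the query encoded in the Data API link above could be sent from Python along these lines. This is a sketch under assumptions: it assumes the GraphQL endpoint is the linked URL without the `/index.html` query editor part and that it accepts a JSON POST, and the function names are made up for illustration.

```python
import json
import urllib.request

# Assumed endpoint: the link above without the /index.html query editor
GRAPHQL_URL = "https://data.rcsb.org/graphql"


def build_chem_comp_query(comp_id):
    """Return the GraphQL query from the example link for a given comp_id."""
    # json.dumps quotes the ID so it is a valid GraphQL string literal
    return """{
  chem_comp(comp_id: %s) {
    pdbx_chem_comp_descriptor { type descriptor program }
    pdbx_chem_comp_identifier { type identifier program }
  }
}""" % json.dumps(comp_id)


def fetch_chem_comp(comp_id, timeout=30):
    """POST the query and return the decoded JSON response (needs network)."""
    body = json.dumps({"query": build_chem_comp_query(comp_id)}).encode()
    req = urllib.request.Request(
        GRAPHQL_URL, data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as fh:
        return json.load(fh)
```

Calling `fetch_chem_comp("522")` should return the same descriptors and identifiers that the query editor shows for the link above, provided the endpoint assumption holds.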

CIF file for chemical components: https://files.rcsb.org/ligands/download/522.cif

We can only get chemical components from released entries.