ihmwg / python-modelcif

Python package for handling ModelCIF mmCIF and BinaryCIF files
MIT License

_entity_poly.pdbx_strand_id for homomers #23

Closed by bienchen 2 years ago

bienchen commented 2 years ago

When writing ModelCIF files for homomers, _entity_poly.pdbx_strand_id appears to point only to the first asymmetric unit (chain). The following code creates two chains from the same modelcif.Entity:

import modelcif
import modelcif.dumper

system = modelcif.System()
# One entity (sequence) shared by both chains, i.e. a homo-2-mer
entity = modelcif.Entity("GHMKYPVEGGGNQ")
au1 = modelcif.AsymUnit(entity)
au2 = modelcif.AsymUnit(entity)
# Register both chains with the system via an assembly
assembly = modelcif.Assembly((au1, au2))
system.assemblies.append(assembly)
with open("entity_poly_pdbx_strand_id_example.cif", "w") as cif_fh:
    modelcif.dumper.write(cif_fh, [system])

_entity_poly in the file written looks like this:

_entity_poly.entity_id
_entity_poly.type
_entity_poly.nstd_linkage
_entity_poly.nstd_monomer
_entity_poly.pdbx_strand_id
_entity_poly.pdbx_seq_one_letter_code
_entity_poly.pdbx_seq_one_letter_code_can
1 polypeptide(L) no no A GHMKYPVEGGGNQ GHMKYPVEGGGNQ

So _entity_poly.pdbx_strand_id is "A". Judging by PDB entries, for a homo-2-mer _entity_poly.pdbx_strand_id should be "A,B"; entry 1SJ2, for example, does it like this.
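To make the difference concrete, here is a minimal sketch that pulls the chain IDs out of an _entity_poly data row. The helper name strand_ids is hypothetical and not part of modelcif. It assumes plain whitespace-separated values with no quoting, which holds for the row above but not for mmCIF in general. The "expected" row is just the written row with "A,B" substituted.

```python
# Column names copied from the _entity_poly loop shown in this issue.
ENTITY_POLY_KEYS = [
    "entity_id", "type", "nstd_linkage", "nstd_monomer",
    "pdbx_strand_id", "pdbx_seq_one_letter_code",
    "pdbx_seq_one_letter_code_can",
]


def strand_ids(row):
    """Return the chain IDs listed in pdbx_strand_id for one data row.

    Hypothetical helper: assumes whitespace-separated, unquoted values.
    """
    values = dict(zip(ENTITY_POLY_KEYS, row.split()))
    return values["pdbx_strand_id"].split(",")


# Row as currently written, and the row a homo-2-mer would be expected to get.
written = "1 polypeptide(L) no no A GHMKYPVEGGGNQ GHMKYPVEGGGNQ"
expected = "1 polypeptide(L) no no A,B GHMKYPVEGGGNQ GHMKYPVEGGGNQ"

print(strand_ids(written))   # ['A']
print(strand_ids(expected))  # ['A', 'B']
```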

benmwebb commented 2 years ago

Yes, python-ihm explicitly writes the first ID: https://github.com/ihmwg/python-ihm/blob/0.31/ihm/dumper.py#L552-L556

I assumed at the time that this required a single ID (it isn't called pdbx_strand_ids), but it would be easy to fix.
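As a rough illustration of the two behaviours (this is not the actual python-ihm dumper code; Entity, AsymUnit and strand_ids_by_entity below are hypothetical stand-ins), writing only the first chain ID versus joining all chain IDs of an entity could look like this:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for an entity; not modelcif.Entity."""
    sequence: str


@dataclass
class AsymUnit:
    """Hypothetical stand-in for a chain; not modelcif.AsymUnit."""
    entity: Entity
    id: str


def strand_ids_by_entity(asym_units, first_only=False):
    """Map each entity to the string to write as pdbx_strand_id."""
    chains = defaultdict(list)
    for asym in asym_units:
        chains[asym.entity].append(asym.id)
    return {
        entity: ids[0] if first_only else ",".join(ids)
        for entity, ids in chains.items()
    }


entity = Entity("GHMKYPVEGGGNQ")
asyms = [AsymUnit(entity, "A"), AsymUnit(entity, "B")]

# Behaviour reported in this issue: only the first chain ID is written.
print(strand_ids_by_entity(asyms, first_only=True)[entity])  # A
# Behaviour matching PDB entries: all chain IDs, comma-separated.
print(strand_ids_by_entity(asyms)[entity])  # A,B
```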

bienchen commented 2 years ago

Can confirm, now it works as I thought, thanks!