Closed by rimmartin 2 months ago
If you run setup_entities() after reading the file, the entities from both mmCIF and PDB should be arranged similarly (apart from different names for subchains).
>>> st = gemmi.read_structure('/data/structures/divided/pdb/r4/pdb8r4q.ent.gz')
>>> st.setup_entities()
>>> st.entities[0]
<gemmi.Entity 'A' polymer polypeptide(L) object at 0x55a73eacce80>
>>> _.subchains
['Axp', 'Cxp', 'Exp', 'Gxp', 'Ixp', 'Kxp']
Can I trust single capital first letter and ignore the xp or other postscripts such as x1 x2?
It's better to check Entity::subchains as you did.
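One way to avoid parsing the subchain strings at all is to look up which chain each subchain belongs to. The sketch below is a minimal illustration, not gemmi's own method: the helper names `subchain_to_chain` and `entity_chain_names` are hypothetical, and it assumes only that iterating a model yields chains with a `name`, that iterating a chain yields residues with a `subchain` attribute, and that an entity has a `subchains` list.

```python
def subchain_to_chain(model):
    """Map each subchain ID to the name of the chain that contains it.

    `model` is assumed to behave like a gemmi.Model: iterating it yields
    chains (with .name), and iterating a chain yields residues (with .subchain).
    """
    mapping = {}
    for chain in model:
        for residue in chain:
            # Residues without an assigned subchain have an empty string here.
            if residue.subchain:
                mapping.setdefault(residue.subchain, chain.name)
    return mapping


def entity_chain_names(model, entity):
    """Return the chain names for an entity's subchains, in order, without repeats."""
    lookup = subchain_to_chain(model)
    names = []
    for sub in entity.subchains:
        name = lookup.get(sub)
        if name is not None and name not in names:
            names.append(name)
    return names


# Intended use (assumes gemmi is installed and the file path is yours):
#   import gemmi
#   st = gemmi.read_structure('8r4q.cif')
#   st.setup_entities()
#   print(entity_chain_names(st[0], st.entities[0]))
```

Because the chain name is read from the chain that holds the residues, the result should not depend on whether the subchain is called 'Axp' (PDB input) or something else (mmCIF input).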
Hi @wojdyr ,
I've been running 8r4q.pdb and 8r4q.cif through alignment to find sequence gaps. For cif, the entity names are the _entity.id values, while for pdb they are the chain IDs. So I started using Entity::subchains to find the chain IDs.
cif gives subchain letters like
while pdb yields
Can I trust single capital first letter and ignore the xp or other postscripts such as x1 x2? Or is there a better parse of these subchain strings?