Open josemduarte opened 8 years ago
The inclusion of FOL is a bug: it is wrongly recognised as an amino acid. The logic that decides what counts as an amino acid needs to be improved to fix this. As a workaround, I added FOL to a list of group names that are not amino acids.
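For illustration only, a workaround of this kind could look roughly like the sketch below. It is not the library's actual code: the backbone-atom heuristic, the names `BACKBONE_ATOMS`, `NON_AMINO_ACID_GROUPS` and `looks_like_amino_acid` are all hypothetical, and the sketch simply assumes the false positive comes from a name-based backbone check.

```python
# Hypothetical sketch; names and heuristic are assumptions, not the library's code.

# Assumed heuristic: a group "looks like" an amino acid if it has these atom names.
BACKBONE_ATOMS = {"N", "CA", "C", "O"}

# Workaround list: group names that may pass the heuristic but are not amino acids.
NON_AMINO_ACID_GROUPS = {"FOL"}


def looks_like_amino_acid(group_name, atom_names):
    """Return True if the group passes the backbone heuristic and is not excluded."""
    if group_name.upper() in NON_AMINO_ACID_GROUPS:
        return False
    return BACKBONE_ATOMS.issubset(set(atom_names))
```

An exclusion list like this only treats the symptom; every other ligand that happens to pass the heuristic would need its own entry.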
The `polymer` selection keyword is (currently) just a shortcut for `protein or dna or rna`. I agree that `polymer` should be more specific and only select members of polymeric chains.
When available, the chemCompType is now used (78b0b3757e5d93515bd6c7194a03d035e0558d59) for determining the molecule type (e.g. protein, dna, rna). This should remove most false positives. When support for entity data lands (#61), it may also be used for determining polymers.
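As a rough sketch of the idea (not the code from the commit), the mapping from a chemical component type string to a molecule type might look like this. The function name and the returned labels are hypothetical, and the type sets are an illustrative subset of the values used in the wwPDB Chemical Component Dictionary, not a complete list.

```python
# Hypothetical sketch; the sets below are a partial, illustrative subset.
PROTEIN_TYPES = {"L-PEPTIDE LINKING", "D-PEPTIDE LINKING", "PEPTIDE LINKING"}
DNA_TYPES = {"DNA LINKING", "L-DNA LINKING"}
RNA_TYPES = {"RNA LINKING", "L-RNA LINKING"}


def molecule_type_from_chem_comp(chem_comp_type):
    """Map a chemCompType string to 'protein', 'dna', 'rna' or 'other'.

    Returns None when no chemCompType is available, so the caller can
    fall back to another detection method.
    """
    if chem_comp_type is None:
        return None
    normalized = chem_comp_type.strip().upper()
    if normalized in PROTEIN_TYPES:
        return "protein"
    if normalized in DNA_TYPES:
        return "dna"
    if normalized in RNA_TYPES:
        return "rna"
    return "other"
```

With a mapping like this, a ligand annotated as `NON-POLYMER` ends up as `other` regardless of its atom names, which is why most false positives should disappear when the annotation is present.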
When available, the entityType is now used to select polymers. Entities are currently parsed from MMTF and mmCIF files (eafd38b81599fd3addc891a56e4dc77a8c7e72e8).
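To make the behaviour concrete, here is a minimal sketch of an entity-based polymer selection. It is not the library's implementation: the `Residue` class, `select_chain_polymer` and the `fallback` parameter are hypothetical, the residue data is made up, and the entity type strings follow the mmCIF `_entity.type` values (`polymer`, `non-polymer`, `water`).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Residue:
    name: str
    chain: str
    # "polymer", "non-polymer", "water", or None if the file has no entity data
    entity_type: Optional[str]


def select_chain_polymer(
    residues: List[Residue],
    chain: str,
    fallback: Callable[[Residue], bool] = lambda r: False,
) -> List[Residue]:
    """Rough equivalent of ':<chain> and polymer'.

    Uses the entity type when present; otherwise defers to `fallback`,
    a caller-supplied predicate (hypothetical) for unannotated files.
    """
    selected = []
    for residue in residues:
        if residue.chain != chain:
            continue
        if residue.entity_type is None:
            is_polymer = fallback(residue)
        else:
            is_polymer = residue.entity_type == "polymer"
        if is_polymer:
            selected.append(residue)
    return selected


# Made-up example data: a protein residue, a ligand and a water in chain A.
example = [
    Residue("MET", "A", "polymer"),
    Residue("FOL", "A", "non-polymer"),
    Residue("HOH", "A", "water"),
    Residue("ALA", "B", "polymer"),
]
polymer_in_a = select_chain_polymer(example, "A")
```

The `fallback` hook is where a name-based or connectivity-based check would go for files that carry no entity annotation.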
With #61 done, polymers are correctly recognised for standard PDB, mmCIF and MMTF files.
What remains pending is support for recognising polymer chains in other file types (GRO, PSF, MOL2) and in non-standard PDB, mmCIF and MMTF files that lack polymer/entity annotation.
For instance, take the ligand molecule FOL in PDB 4jjk. If I use the selection string `:A and polymer`, the FOL molecule is included in the selection, when it shouldn't be, because it is not part of the protein chain.