schrodinger / pymol-open-source

Open-source foundation of the user-sponsored PyMOL molecular visualization system.
https://pymol.org/

Solvent selector issue (oxygen atom appearing as water) #326

Open ioChris opened 8 months ago

ioChris commented 8 months ago

PyMOL 2.5.0 Open-Source, 2024-01-07 (installed through conda-forge). Also tested in PyMOL 2.5.0 Open-Source (9ea504ea8d), 2021-05-28 (installed through pip).

In structure 1a65, the following selection results in 0 atoms:

fetch 1a65
select oxygens, 1a65 and resn O and not solvent

However, the oxygen is indeed not annotated as HOH in the mmCIF file:

HETATM 3872 O O . O G 5 . ? 7.445 40.703 25.162 1.00 26.99 ? 904 O A O 1

The oxygen is nevertheless represented as a water molecule in the GUI. My understanding was that the solvent operator recognizes certain residue names (HOH, H2O, etc.) and registers them as solvent, but this behavior suggests otherwise, since the residue name here is "O".
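As a possible workaround until the solvent operator is changed, water can be excluded by residue name explicitly instead of through `solvent`. The sketch below only builds the selection string; the helper name `non_water_selection` and the list of water residue names are illustrative assumptions, not PyMOL's internal list.

```python
# Residue names assumed to denote water. This list is illustrative only;
# it is not taken from PyMOL's source.
WATER_RESNS = ("HOH", "H2O", "WAT")


def non_water_selection(obj, resn, water_resns=WATER_RESNS):
    """Build a PyMOL selection string that excludes water by residue name only.

    Hypothetical helper: it avoids the `solvent` operator entirely, so a lone
    oxygen with residue name "O" is not filtered out.
    """
    waters = "+".join(water_resns)
    return f"{obj} and resn {resn} and not (resn {waters})"


expr = non_water_selection("1a65", "O")
print(expr)

# Inside a PyMOL session the string could then be used, for example:
#   from pymol import cmd
#   cmd.fetch("1a65")
#   cmd.select("oxygens", expr)
```

Whether this selects the expected atom in 1a65 has not been verified here; it only sidesteps the `solvent` classification described above.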

JarrettSJohnson commented 8 months ago

It seems that some lone oxygen atoms are considered solvent here: https://github.com/schrodinger/pymol-open-source/blob/master/layer3/Selector.cpp#L1269

I can give some thought to how to treat the oxygen atom as not water.

ioChris commented 7 months ago

I'm not sure if this is of any help, but in the mmCIF files from RCSB, the residue group identifier ('_atom_site.label_comp_id' for the PDB label and '_atom_site.auth_comp_id' for the author label) for solvent is typically HOH. There may be aliases of 'HOH', such as 'H2O', but I do not recall ever seeing them. If an oxygen atom is assigned 'O' as its residue identifier, it should mean that the author has annotated it as a separate entity (not part of a residue, larger ligand, peptide or solvent group).

Below are two lines from the mmCIF of the structure quoted in the original report (1a65), showing two different oxygen atoms, including the one mentioned earlier that is incorrectly treated as solvent. The first one is a stand-alone atom (ligand), and the second one is part of a 'HOH' group.

HETATM 3872 O  O   . O   G 5 .   ? 7.445   40.703 25.162 1.00 26.99 ? 904  O   A O   1 
HETATM 3873 O  O   . HOH H 6 .   ? 19.509  36.893 30.054 1.00 13.07 ? 905  HOH A O   1 

It should be clear from the residue group in the mmCIF (both PDB and author labels) that the first atom is not part of a solvent group, and that the second one is. Maybe this inherent description in the mmCIF can be used for distinguishing solvent and non-solvent atoms (my initial assumption was that it was used in this manner).
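To illustrate the idea, here is a minimal sketch that classifies the two atoms above by their residue group identifiers alone. It assumes the usual RCSB `_atom_site` column order (in a real file the order is defined by the `loop_` header) and an assumed set of water names; `parse_atom_site` and `is_solvent` are hypothetical helpers, not PyMOL functions.

```python
# Assumed column order of the _atom_site loop as commonly written by RCSB.
# A real parser should read the order from the file's loop_ header.
ATOM_SITE_COLUMNS = [
    "group_PDB", "id", "type_symbol", "label_atom_id", "label_alt_id",
    "label_comp_id", "label_asym_id", "label_entity_id", "label_seq_id",
    "pdbx_PDB_ins_code", "Cartn_x", "Cartn_y", "Cartn_z", "occupancy",
    "B_iso_or_equiv", "pdbx_formal_charge", "auth_seq_id", "auth_comp_id",
    "auth_asym_id", "auth_atom_id", "pdbx_PDB_model_num",
]

# Assumed water residue names; 'HOH' is the typical one in RCSB files.
WATER_COMP_IDS = {"HOH", "H2O"}


def parse_atom_site(line):
    """Split one whitespace-separated atom_site row into a dict of columns."""
    return dict(zip(ATOM_SITE_COLUMNS, line.split()))


def is_solvent(record):
    """Classify by residue name only (PDB label or author label)."""
    return (record["label_comp_id"] in WATER_COMP_IDS
            or record["auth_comp_id"] in WATER_COMP_IDS)


lines = [
    "HETATM 3872 O  O   . O   G 5 .   ? 7.445   40.703 25.162 1.00 26.99 ? 904  O   A O   1",
    "HETATM 3873 O  O   . HOH H 6 .   ? 19.509  36.893 30.054 1.00 13.07 ? 905  HOH A O   1",
]

for line in lines:
    rec = parse_atom_site(line)
    print(rec["auth_seq_id"], rec["label_comp_id"], is_solvent(rec))
# prints:
# 904 O False
# 905 HOH True
```

With a name-based rule like this, atom 904 is kept out of the solvent group while atom 905 is included, which matches the annotation in the file.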