Annotation updates

Ninjani commented 2 months ago

TODO:

Crystal contacts
- [x] add fraction of crystal contacts as part of validation criteria
- [x] add number of atoms to symmetry_mate_contacts instead of residues
- [x] add tests
Binding affinity
- [x] drop IC50
- [x] add to split as prioritization
Split criteria
- [x] add min/max number of pocket residues and interactions for test

github-actions[bot] commented 2 months ago

Coverage report

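| File | Lines missing |
|---|---|
| src/plinder/core/__init__.py | |
| src/plinder/data/__init__.py | |
| src/plinder/data/get_system_annotations.py | |
| src/plinder/data/splits.py | 256-257, 661 |
| src/plinder/data/pipeline/config.py | |
| src/plinder/data/pipeline/io.py | 154-161, 190 |
| src/plinder/data/pipeline/utils.py | |
| src/plinder/data/utils/annotations/aggregate_annotations.py | 167, 339, 349, 1212-1213 |
| src/plinder/data/utils/annotations/get_ligand_validation.py | |
| src/plinder/data/utils/annotations/interaction_utils.py | 461 |
| src/plinder/data/utils/annotations/ligand_utils.py | 330, 426, 1099, 1164, 1214 |
| src/plinder/data/utils/annotations/rdkit_utils.py | 397, 399, 414 |
| src/plinder/data/utils/annotations/save_utils.py | |
| src/plinder/eval/docking/utils.py | |
| Project Total | |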

This report was generated by python-coverage-comment-action

OleinikovasV commented 2 months ago

TODO:

Crystal contacts
- [ ] add fraction of crystal contacts as part of validation criteria
- [ ] add number of atoms to symmetry_mate_contacts instead of residues
- [ ] add tests
Binding affinity
- [ ] drop IC50
- [ ] add to split as prioritization
Split criteria
- [ ] add min/max number of pocket residues and interactions for test

- [ ] remove system.cif
- [x] add unique plip counter

Ninjani commented 2 months ago

> - [ ] remove `system.cif`
> - [ ] add unique plip counter

I tried both of these out, and neither is immediately trivial:

Removing system.cif is easy in itself, but it requires changing final_structure_qc to load the receptor and ligand separately. @yusuf1759 this would also require changing all the complex_paths.

For similarity scoring of unique PLI, what's the strategy here? Do we count each type of interaction once for each residue, or just the residues themselves? E.g.:

system_1: {res1: [hbond, hbond, saltbridge], res2: [hydrophobic, hbond]}
system_2: {res1: [hbond, saltbridge, saltbridge, hydrophobic], res2: [hbond], res3: [hydrophobic]}

Do we want similarity to be 2/2 for system_1 vs system_2, since they share res1 and res2 as interacting residues (irrespective of the actual interactions)? Or do we want to compare system_1_res1: {hbond, saltbridge} vs system_2_res1: {hbond, saltbridge, hydrophobic} (i.e. taking the set and ignoring the count)? Or both, with the former being pocket_interacting_qcov and the latter being pli_unique_qcov?
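To make the two options concrete, here is a minimal sketch on the example above. The function names are hypothetical and are not taken from the plinder codebase. It assumes that residues are already matched between the two systems (same keys) and that "qcov" means the score is normalized by the query system.

```python
# Example data copied from the comment above.
system_1 = {"res1": ["hbond", "hbond", "saltbridge"], "res2": ["hydrophobic", "hbond"]}
system_2 = {
    "res1": ["hbond", "saltbridge", "saltbridge", "hydrophobic"],
    "res2": ["hbond"],
    "res3": ["hydrophobic"],
}


def residue_level_qcov(query, target):
    """Option 1 (hypothetical): fraction of the query's interacting residues
    that are also interacting residues in the target, ignoring interaction types."""
    if not query:
        return 0.0
    shared = set(query) & set(target)
    return len(shared) / len(query)


def unique_interaction_qcov(query, target):
    """Option 2 (hypothetical): fraction of the query's unique
    (residue, interaction type) pairs that also occur in the target.
    Repeated interactions of the same type on a residue count once."""
    query_pairs = {(res, kind) for res, kinds in query.items() for kind in kinds}
    target_pairs = {(res, kind) for res, kinds in target.items() for kind in kinds}
    if not query_pairs:
        return 0.0
    return len(query_pairs & target_pairs) / len(query_pairs)


print(residue_level_qcov(system_1, system_2))       # 2 of 2 residues shared -> 1.0
print(unique_interaction_qcov(system_1, system_2))  # 3 of 4 unique pairs shared -> 0.75
```

With system_1 as the query, the residue-level option gives 2/2, while the set-based option gives 3/4, because system_1's hydrophobic interaction on res2 has no counterpart in system_2.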

OleinikovasV commented 1 month ago

> system_1: {res1: [hbond, hbond, saltbridge], res2: [hydrophobic, hbond]}
> system_2: {res1: [hbond, saltbridge, saltbridge, hydrophobic], res2: [hbond], res3: [hydrophobic]}
>
> Do we want similarity to be 2/2 for system_1 vs system_2, since they share res1 and res2 as interacting residues (irrespective of the actual interactions)? Or do we want to compare system_1_res1: {hbond, saltbridge} vs system_2_res1: {hbond, saltbridge, hydrophobic} (i.e. taking the set and ignoring the count)? Or both, with the former being pocket_interacting_qcov and the latter being pli_unique_qcov?

@Ninjani, this is a good question. I like pocket_interacting_qcov for matched residues that are interacting, but only as long as there is at least one matched interaction. For example, if one residue has a 'hydrophobic' interaction and the other a 'salt bridge', I do not think it would be reasonable to match them. Such residues would already be matched by the "neighbouring" residues metric, pocket_qcov, so matching on interaction type seems more reasonable to me.

The pli_unique_qcov would be the same as pli_qcov but only counting each unique match once.
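As a rough illustration of the distinction drawn here, the sketch below contrasts a count-aware score with a residue-level score that requires at least one shared interaction type. Both function names are hypothetical, and the multiset intersection is only an assumption about how a count-aware pli_qcov could behave, not the project's actual implementation. Residues are assumed to be already matched between systems.

```python
from collections import Counter

# Example data copied from the quoted comment.
system_1 = {"res1": ["hbond", "hbond", "saltbridge"], "res2": ["hydrophobic", "hbond"]}
system_2 = {
    "res1": ["hbond", "saltbridge", "saltbridge", "hydrophobic"],
    "res2": ["hbond"],
    "res3": ["hydrophobic"],
}


def counted_interaction_qcov(query, target):
    """Hypothetical count-aware score: every repeated interaction on a residue
    counts separately, and matches are taken as a multiset intersection."""
    q = Counter((res, kind) for res, kinds in query.items() for kind in kinds)
    t = Counter((res, kind) for res, kinds in target.items() for kind in kinds)
    total = sum(q.values())
    if total == 0:
        return 0.0
    matched = sum((q & t).values())  # Counter & Counter keeps the minimum counts
    return matched / total


def interacting_residue_qcov(query, target):
    """Hypothetical residue-level score: a query residue counts as matched only
    if it shares at least one interaction type with the same target residue."""
    if not query:
        return 0.0
    matched = sum(
        1 for res, kinds in query.items() if set(kinds) & set(target.get(res, ()))
    )
    return matched / len(query)


print(counted_interaction_qcov(system_1, system_2))  # 3 of 5 interactions matched -> 0.6
print(interacting_residue_qcov(system_1, system_2))  # both residues share hbond -> 1.0
# A 'hydrophobic' vs 'saltbridge' residue is not matched under this rule:
print(interacting_residue_qcov({"res1": ["hydrophobic"]}, {"res1": ["saltbridge"]}))  # 0.0
```

Under these assumptions, the count-aware score for system_1 vs system_2 is 3/5, because the second hbond on res1 has no counterpart. Counting each unique match once instead compares 3 matches against 4 unique query interactions.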

Ninjani commented 1 month ago

@OleinikovasV I've already implemented pli_unique_qcov. I would consider deferring pocket_interacting_qcov and the removal of system.cif to a later stage, so we can do the rerun and have the new test set as soon as possible.

plinder-org / plinder

Annotation updates #14