a-r-j / graphein

Protein Graph Library
https://graphein.ai/
MIT License

NaN coordinates when granularity: centroids and on small residue graphs #219

Closed by manonreau 1 year ago

manonreau commented 1 year ago

Describe the bug

Some graphs have NaN coordinates. This has been observed in particular for graphs that contain no aromatic residues when aromatic interactions are listed in edge_construction_functions and "granularity": "centroids" is set.

Different reasons have been identified:

To Reproduce

from functools import partial  # needed for partial(add_distance_threshold, ...) below

from graphein.protein.graphs import construct_graph
from graphein.protein.config import ProteinGraphConfig
from graphein.protein.edges.distance import (add_peptide_bonds,
                                             add_hydrogen_bond_interactions,
                                             add_disulfide_interactions,
                                             add_ionic_interactions,
                                             add_aromatic_interactions,
                                             add_aromatic_sulphur_interactions,
                                             add_cation_pi_interactions,
                                             add_distance_threshold
                                            )

new_edge_funcs = {"edge_construction_functions": [add_peptide_bonds,
                                                  add_aromatic_interactions,
                                                  add_hydrogen_bond_interactions,
                                                  add_disulfide_interactions,
                                                  add_ionic_interactions,
                                                  add_aromatic_sulphur_interactions,
                                                  add_cation_pi_interactions,
                                                  partial(add_distance_threshold, long_interaction_threshold=2, threshold=20.)]}

params_to_change = {"granularity": "centroids"}
config = ProteinGraphConfig(**new_edge_funcs, **params_to_change)

pocket_name = 'xxxx.pdb'
g = construct_graph(config=config, pdb_path=pocket_name)

Expected behavior

Checks should be added to ensure that centroid coordinates are computed correctly when no atoms or residues are missing.
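In the meantime, affected nodes can be spotted after graph construction. The following is a minimal sketch, not part of Graphein: the helper name find_nan_nodes is hypothetical, and it assumes each node stores its position under a "coords" attribute, so adjust coord_key if your graphs use a different key. It takes any iterable of (node_id, attribute_dict) pairs, such as g.nodes(data=True) on the graph built above.

```python
import math


def find_nan_nodes(node_data, coord_key="coords"):
    """Return ids of nodes whose coordinates are missing or contain NaN.

    node_data: iterable of (node_id, attribute_dict) pairs,
               e.g. g.nodes(data=True) on a NetworkX graph.
    coord_key: attribute name holding the coordinates (assumed "coords").
    """
    bad = []
    for node_id, attrs in node_data:
        coords = attrs.get(coord_key)
        # Flag nodes with no coordinates at all, or with any NaN component.
        if coords is None or any(math.isnan(float(c)) for c in coords):
            bad.append(node_id)
    return bad


# Stand-in data with made-up node ids, used only to illustrate the check.
example_nodes = [
    ("A:PHE:1", {"coords": [1.0, 2.0, 3.0]}),
    ("A:GLY:2", {"coords": [float("nan"), 0.0, 0.0]}),
    ("A:ALA:3", {}),
]
print(find_nan_nodes(example_nodes))
```

With the graph from the reproduction script, the call would be find_nan_nodes(g.nodes(data=True)); an empty list means no node has missing or NaN coordinates.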