DeepRank / Deeprank-GNN

Graph Network for protein-protein interface
Apache License 2.0

louvain clustering is not available to network model #58

Closed: cbaakman closed this issue 3 years ago

cbaakman commented 3 years ago

I ran the following script, using Deeprank-GNN:

#!/usr/bin/env python

from deeprank_gnn.GraphGen import GraphHDF5
from deeprank_gnn.NeuralNet import NeuralNet
from deeprank_gnn.ginet import GINet
from deeprank_gnn import CustomizeGraph

preprocessed_path = "./preprocessed.hdf5"

GraphHDF5(pdb_path="./pdb", graph_type="residue", outfile=preprocessed_path)
CustomizeGraph.add_target(".", "bin_class", "targets.txt", sep=" ")

NN = NeuralNet(preprocessed_path, GINet,
               node_feature=['type', 'polarity', 'bsa', 'charge'],
               edge_feature=['dist'],
               target='bin_class',
               batch_size=1,
               cluster_nodes='louvain')

NN.train(nepoch=250, validate=False)

The input PDB files were 1jyn.pdb, 2y69.pdb and 5ju6.pdb.

The output was:

Loading clusters
  0%|          | 0/3 [00:00<?, ?it/s]
WARNING: no cluster detected
Deleting previous data for mol 1jyn method louvain
/usr/local/lib/python3.8/dist-packages/torch_sparse/storage.py:387: UserWarning: This overload of nonzero is deprecated:
    nonzero()
Consider using one of the following signatures instead:
    nonzero(*, bool as_tuple) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.)
  ptr = mask.nonzero().flatten()
WARNING: no cluster detected
Deleting previous data for mol 2y69 method louvain
WARNING: no cluster detected
Deleting previous data for mol 5ju6 method louvain
Training set loaded
No independent validation set loaded
device set to : cpu
WARNING: no cluster detected
WARNING: no cluster detected
Traceback (most recent call last):
  File "./run.py", line 26, in <module>
    NN.train(nepoch=250, validate=False)
  File "/home/cbaakman/projects/deeprank-gnn/deeprank_gnn/NeuralNet.py", line 294, in train
    _out, _y, _loss, self.data['train'] = self._epoch(epoch)
  File "/home/cbaakman/projects/deeprank-gnn/deeprank_gnn/NeuralNet.py", line 483, in _epoch
    pred = self.model(data_batch)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/cbaakman/projects/deeprank-gnn/deeprank_gnn/ginet.py", line 103, in forward
    cluster = get_preloaded_cluster(data.cluster0, data.batch)
AttributeError: 'Batch' object has no attribute 'cluster0'

Switching to mcl clustering makes it work.

Is Louvain clustering available?

manonreau commented 3 years ago

Thanks for raising this issue.

Identified issue: when running Deeprank-GNN with the louvain node clustering algorithm, the data loader (HDF5DataSet) still expected mcl cluster data, because the clustering algorithm was not specified when the data was loaded. No louvain clusters were therefore attached to the batches, which explains the missing cluster0 attribute.

Solution: pass the clustering method in every call to HDF5DataSet:

HDF5DataSet(root='./', database=database,
            node_feature=self.node_feature, edge_feature=self.edge_feature,
            target=self.target, clustering_method=self.cluster_nodes)

This has been fixed in PR #59.
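
For readers who want to see the mechanism in isolation, here is a minimal sketch of the mismatch described above. It does not use the Deeprank-GNN code: `STORED`, `load_entry`, and the `clustering` / `depth_0` / `depth_1` layout are hypothetical stand-ins, and the `"mcl"` default is an assumption based on the diagnosis in this thread. It only shows how a loader that is not told the clustering method can skip clusters that were stored under another method name.

```python
from types import SimpleNamespace

# Toy stand-in for one entry of the preprocessed file. Clusters are stored
# per method name; this layout is an illustrative assumption, not the real one.
STORED = {
    "1jyn": {
        "clustering": {
            "louvain": {"depth_0": [0, 0, 1, 1], "depth_1": [0, 1]},
        }
    }
}


def load_entry(store, mol, clustering_method="mcl"):
    """Hypothetical loader: attach cluster0/cluster1 only when clusters
    for `clustering_method` exist for this molecule."""
    data = SimpleNamespace(mol=mol)
    clusters = store[mol].get("clustering", {}).get(clustering_method)
    if clusters is None:
        # Nothing stored under this method name: no cluster attributes are set.
        print("WARNING: no cluster detected")
        return data
    data.cluster0 = clusters["depth_0"]
    data.cluster1 = clusters["depth_1"]
    return data


# The method is not forwarded: the loader looks for "mcl", finds nothing,
# and the returned object has no cluster0 attribute.
without_method = load_entry(STORED, "1jyn")

# The method is forwarded: the louvain clusters are found and attached.
with_method = load_entry(STORED, "1jyn", clustering_method="louvain")

print(hasattr(without_method, "cluster0"))
print(hasattr(with_method, "cluster0"))
```

In this toy version, any later code that reads `cluster0` fails with an AttributeError on the first object and works on the second, which mirrors the behaviour reported with louvain versus mcl.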