IndexError when running NeuralNet.train()

svenvanderburg commented 2 years ago

When running NeuralNet.train(), @DTRademaker gets:

the threshold for accuracy computation is set to 1
   Checking dataset Integrity
   Processing data set
   Train dataset         : 100%|███████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.02it/s, mol=dataset.hdf5]
Loading clusters
  0%|                                                                                                      | 0/400 [00:00<?, ?it/s]no clustering group found
  0%|                                                                                                      | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "first_net.py", line 11, in <module>
    nn = NeuralNet(database, GINet,
  File "/home/daniel/binding_project/deeprank-gnn-2/deeprank_gnn/NeuralNet.py", line 116, in __init__
    self.load_model(database, Net, database_eval)
  File "/home/daniel/binding_project/deeprank-gnn-2/deeprank_gnn/NeuralNet.py", line 186, in load_model
    PreCluster(dataset, method=self.cluster_nodes)
  File "/home/daniel/binding_project/deeprank-gnn-2/deeprank_gnn/DataSet.py", line 89, in PreCluster
    data = community_pooling(cluster, data)
  File "/home/daniel/binding_project/deeprank-gnn-2/deeprank_gnn/community_pooling.py", line 206, in community_pooling
    internal_edge_index, internal_edge_attr = pool_edge(
  File "/home/daniel/.local/lib/python3.8/site-packages/torch_geometric/nn/pool/pool.py", line 13, in pool_edge
    edge_index, edge_attr = remove_self_loops(edge_index, edge_attr)
  File "/home/daniel/.local/lib/python3.8/site-packages/torch_geometric/utils/loop.py", line 42, in remove_self_loops
    return edge_index, edge_attr[mask]
IndexError: The shape of the mask [0] at index 0 does not match the shape of the indexed tensor [2378, 0] at index 0

The code to reproduce this:

from deeprank_gnn.NeuralNet import NeuralNet
from ginet import GINet
from deeprank_gnn.models.metrics import OutputExporter, ScatterPlotExporter

database = '/home/daniel/binding_project/outputAll/dataset.hdf5'

metrics_output_directory = "./metrics"
metrics_exporters = [OutputExporter(metrics_output_directory),
                     ScatterPlotExporter(metrics_output_directory, 5)]

nn = NeuralNet(database, GINet,
               node_feature=['hb_acceptors',
                            'hb_donors',
                            'polarity',
                            'pos',
                            'size',
                            'type'],
               edge_feature=[],
               target='labels',
               task = 'class',
               index=range(400),
               batch_size=64,
               percent=[0.8, 0.2],
               metrics_exporters=metrics_exporters)

nn.train(nepoch=50, validate=False)

(I am still working on getting the actual dataset)

cbaakman commented 2 years ago

This seems to be happening during the clustering step.

gcroci2 commented 2 years ago

Maybe this is not the reason for the error, but edge_feature should be None rather than an empty list (see here).

svenvanderburg commented 2 years ago

For some reason the resulting torch_geometric dataset has empty tensors for:

data.internal_edge_index tensor([], size=(2, 0), dtype=torch.int64)
data.internal_edge_attr tensor([], size=(2360, 0))

Then pool_edge fails with the error message above: https://github.com/DeepRank/deeprank-gnn-2/blob/d393f5411cfd68a7bd1f6082efc5344f041456c7/deeprank_gnn/community_pooling.py#L206

I'm guessing something is wrong with the dataset that we don't catch properly. We do check whether we have internal edges here: https://github.com/DeepRank/deeprank-gnn-2/blob/d393f5411cfd68a7bd1f6082efc5344f041456c7/deeprank_gnn/community_pooling.py#L190 but we don't check whether the tensor is empty. I have no clue in which cases we should expect an empty tensor.
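To show the shape mismatch in isolation, here is a minimal sketch in plain PyTorch using the two shapes reported above. The mask line is an assumption about how remove_self_loops builds its mask (one boolean per edge); only the final indexing step appears in the traceback.

```python
import torch

# Shapes taken from the tensors reported above.
edge_index = torch.empty((2, 0), dtype=torch.int64)
edge_attr = torch.empty((2360, 0))

# Assumed self-loop mask: one boolean per column (edge) of edge_index.
mask = edge_index[0] != edge_index[1]

message = None
try:
    # Zero mask entries against 2360 attribute rows: the lengths disagree.
    edge_attr[mask]
except IndexError as err:
    message = str(err)

print(tuple(mask.shape), message)
```

The mask has no entries because there are no edges, while edge_attr still has 2360 rows, so the boolean indexing cannot line up.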

@NicoRenaud @DTRademaker @cbaakman any clue?

NicoRenaud commented 2 years ago

I guess if we have only one cluster left, there will be no edges anymore. I would check the number of clusters left.
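A rough way to check this, assuming the cluster assignment is a 1-D tensor holding one cluster id per node. The tensors below are made up for illustration, and the relabel-then-drop-self-loops step only mimics edge pooling in plain PyTorch; it is not the library code itself.

```python
import torch

# Hypothetical cluster assignment: every node ended up in cluster 0.
cluster = torch.tensor([0, 0, 0, 0])
# Hypothetical edges between the four nodes.
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])

# Number of distinct clusters that are left.
n_clusters = torch.unique(cluster).numel()

# Relabel each edge endpoint with its cluster id, then drop self-loops.
pooled = cluster[edge_index]
keep = pooled[0] != pooled[1]
pooled = pooled[:, keep]

print(n_clusters, tuple(pooled.shape))
```

With a single cluster, every edge becomes a self-loop and is dropped, so the pooled edge index ends up with zero columns.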

svenvanderburg commented 2 years ago

I created a test that replicates this in #92.

The problem is that the dataset doesn't have internal edges defined, so we add empty tensors here (and in the lines below): https://github.com/DeepRank/deeprank-gnn-2/blob/f78f01fefaa9f30e72c119b174e09f56fb2c4bee/deeprank_gnn/DataSet.py#L331

We check whether the internal edge dataset/key exists here: https://github.com/DeepRank/deeprank-gnn-2/blob/d393f5411cfd68a7bd1f6082efc5344f041456c7/deeprank_gnn/community_pooling.py#L190 but we don't check whether the tensor is empty, and apparently this chokes pool_edge.

I tried checking whether the tensor is not empty:

has_internal_edges = hasattr(data, "internal_edge_index") and data['internal_edge_index'].shape[1] > 0

But then I get this later on:

tests/test_nn.py:37: in _model_base_test
    nn = NeuralNet(
deeprank_gnn/NeuralNet.py:117: in __init__
    self.load_model(dataset, Net, dataset_eval)
deeprank_gnn/NeuralNet.py:168: in load_model
    PreCluster(dataset, method=self.cluster_nodes)
deeprank_gnn/DataSet.py:92: in PreCluster
    data.internal_edge_index, data.num_nodes, method=method
../../miniforge3/envs/3dvac3.9/lib/python3.9/site-packages/torch_geometric/data/data.py:362: in __getattr__
    return getattr(self._store, key)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = {'x': tensor([[12.0000,  0.0000,  3.1585,  ..., -4.0000, -3.0000,  6.0000],
        [15.0000,  0.0000, 63.0627,  ..., ...e+00, -8.4491e+00],
        [ 8.7118e+00,  5.7578e+00, -1.3459e+01],
        [-4.9571e+00,  1.0465e+01, -1.1262e+01]])}
key = 'internal_edge_index'

    def __getattr__(self, key: str) -> Any:
        if key == '_mapping':
            self._mapping = {}
            return self._mapping
        try:
            return self[key]
        except KeyError:
>           raise AttributeError(
                f"'{self.__class__.__name__}' object has no attribute '{key}'")
E           AttributeError: 'GlobalStorage' object has no attribute 'internal_edge_index'

../../miniforge3/envs/3dvac3.9/lib/python3.9/site-packages/torch_geometric/data/storage.py:52: AttributeError

It seems like the support for graphs without internal edges is just completely broken, since we add empty tensors in case they don't exist, but then the code chokes on this.
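For illustration, a guard that treats "missing" and "empty" the same way could look like the sketch below. The helper name has_usable_internal_edges is hypothetical, and SimpleNamespace objects stand in for the torch_geometric data objects so the example stays self-contained.

```python
from types import SimpleNamespace

import torch


def has_usable_internal_edges(data) -> bool:
    """Return True only if internal_edge_index exists and holds at least one edge."""
    index = getattr(data, "internal_edge_index", None)
    return index is not None and index.dim() == 2 and index.shape[1] > 0


# Stand-in objects (hypothetical): attribute missing, empty tensor, real edges.
missing = SimpleNamespace()
empty = SimpleNamespace(
    internal_edge_index=torch.empty((2, 0), dtype=torch.int64)
)
populated = SimpleNamespace(internal_edge_index=torch.tensor([[0, 1], [1, 2]]))

results = [has_usable_internal_edges(d) for d in (missing, empty, populated)]
print(results)
```

Every place that reads internal_edge_index, including PreCluster as the traceback above shows, would need the same guard for this to help.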

@cbaakman you seem to have changed some code related to this recently, any ideas on fixing this?

cbaakman commented 2 years ago

In the last class diagram, we merged the "edges" and "internal edges" collections into "edges". We could let the clustering software use "edges" instead.

svenvanderburg commented 2 years ago

OK, so the whole training code still needs to be updated for the fact that the internal edges are now just merged with edges? Did you already think about how to change that?

cbaakman commented 2 years ago

> OK, so the whole training code still needs to be updated for the fact that the internal edges are now just merged with edges? Did you already think about how to change that?

I suppose we change this function to use edge_index and edge_attr instead: https://github.com/DeepRank/deeprank-gnn-2/blob/f78f01fefaa9f30e72c119b174e09f56fb2c4bee/deeprank_gnn/community_pooling.py#L159

svenvanderburg commented 2 years ago

OK, and that's it? Other than that, the internal edges are not needed? And will it still make sense from a deep learning perspective?

cbaakman commented 2 years ago

> OK, and that's it? Other than that, the internal edges are not needed? And will it still make sense from a deep learning perspective?

You'd be clustering based on proximity between nodes. That would make less sense than clustering only on internal edges, which usually represent covalent bonds. If clustering is this important, we may need to use only the edges with the covalent feature set to True.
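As a sketch of that last option, selecting only the covalent edges before clustering could look roughly like this. The tensors are made up, and both the column position of the covalent flag in edge_attr (COVALENT_COLUMN) and its 0/1 encoding are assumptions, not the actual feature layout.

```python
import torch

# Hypothetical merged edges: four edges between four nodes.
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
# Hypothetical edge features: column 0 is a 0/1 covalent flag, column 1 a distance.
edge_attr = torch.tensor([[1.0, 1.5], [0.0, 5.1], [1.0, 1.3], [0.0, 7.8]])

COVALENT_COLUMN = 0  # assumed position of the covalent flag

# Keep only the edges whose covalent flag is set.
covalent = edge_attr[:, COVALENT_COLUMN] > 0.5
covalent_edge_index = edge_index[:, covalent]
covalent_edge_attr = edge_attr[covalent]

print(covalent_edge_index.tolist(), tuple(covalent_edge_attr.shape))
```

The clustering step would then receive covalent_edge_index in place of the old internal_edge_index.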

DeepRank / deeprank2

IndexError when running NeuralNet.train() #89