cszhangzhen / HGP-SL

Hierarchical Graph Pooling with Structure Learning

About the acc from this repo and the paper #8

Open lygztq opened 3 years ago

lygztq commented 3 years ago

Hi, your work is excellent. However, I find a gap between the results obtained with your code and the results reported in your paper. Specifically:

|  | Mutagenicity | NCI109 | NCI1 | DD |
| --- | --- | --- | --- | --- |
| Code | 79.68(1.68) | 73.86(1.72) | 76.29(2.14) | 75.46(3.86) |
| Paper | 82.15(0.58) | 80.67(1.16) | 78.45(0.77) | 80.96(1.26) |

I followed your hyper-parameter settings and did not change any part of your code, so why does this gap appear? I'm using Python 3.7 with the latest versions of PyTorch and PyTorch Geometric. Is it possible that a mismatched version of PyG causes such a gap?
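For completeness, the exact environment can be printed with a couple of lines (each of these packages exposes __version__):

import torch
import torch_geometric
import torch_sparse

print(torch.__version__, torch_geometric.__version__, torch_sparse.__version__)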

cszhangzhen commented 3 years ago

Hi,

This is very strange. Let me check it.

ScottLiao920 commented 3 years ago

Same for the PROTEINS dataset: I'm only able to achieve 72% accuracy (same code, same version, same config) compared to the 80%+ reported in the paper. Are there any other hyperparameters tuned to get 80% accuracy?

cszhangzhen commented 3 years ago

Hi,

Sorry for the late reply.

I found that this is caused by an update to torch-sparse. Since the release of torch-sparse 0.4.4 (reference here), spspmm no longer supports autograd. However, our model needs to learn the sparse adjacency matrix, so gradients must flow through this operation. I suggest using an older version of torch-sparse.
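A minimal sketch to check which behavior your installed torch-sparse has (the toy 2x2 sparse matrix below is arbitrary; the functional signature is spspmm(indexA, valueA, indexB, valueB, m, k, n)):

import torch
from torch_sparse import spspmm

index = torch.tensor([[0, 1], [1, 0]])
value = torch.ones(2, requires_grad=True)
out_index, out_value = spspmm(index, value, index, value, 2, 2, 2)
# On torch-sparse <= 0.4.3 this prints True; on later releases gradients are
# not tracked through spspmm, so it prints False (or the call errors out).
print(out_value.requires_grad)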

lygztq commented 3 years ago

Actually, I also tried your 'full-graph' version (which has no spspmm operation) and got similar results...

|  | Mutagenicity | NCI109 | NCI1 | DD |
| --- | --- | --- | --- | --- |
| Reported in Paper | 82.15(0.58) | 80.67(1.16) | 78.45(0.77) | 80.96(1.26) |
| Author's Code (full graph) | 78.44(2.10) | 74.44(2.05) | 77.37(2.09) | OOM |
| Author's Code (sample) | 79.68(1.68) | 73.86(1.72) | 76.29(2.14) | 75.46(3.86) |

Also, it is worth pointing out that the three options --sample_neighbor, --sparse_attention and --structure_learning have no effect on the command line, since argparse with type=bool treats every non-empty string (including 'False') as True.
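A minimal, self-contained demonstration of the pitfall (standard library only):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--sample_neighbor', type=bool, default=True)

# argparse applies bool() to the raw string, and any non-empty string,
# including 'False', is truthy.
args = parser.parse_args(['--sample_neighbor', 'False'])
print(args.sample_neighbor)  # True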

cszhangzhen commented 3 years ago

These three options --sample_neighbor, --sparse_attention and --structure_learning are defined with type=bool, so they should be initialized as True or False.

I also find that performance drops when updating pytorch_geometric to the latest version, and I'm not sure whether this is caused solely by spspmm. I will check this in detail.

lygztq commented 3 years ago

Thanks for the reply :)

For bool arguments in Python's argparse, you can find an explanation here.
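One common workaround from that explanation is an explicit string-to-bool converter; a sketch:

import argparse

def str2bool(v):
    # Accept the usual spellings instead of relying on bool(), which is
    # True for every non-empty string.
    if isinstance(v, bool):
        return v
    if v.lower() in ('yes', 'true', 't', '1'):
        return True
    if v.lower() in ('no', 'false', 'f', '0'):
        return False
    raise argparse.ArgumentTypeError('Boolean value expected.')

parser = argparse.ArgumentParser()
parser.add_argument('--sample_neighbor', type=str2bool, default=True)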

cszhangzhen commented 3 years ago

Hi,

I have tested the code by running:

python main.py --sample_neighbor False

with the following line inserted into main.py:

print(args.sample_neighbor)

It outputs False.

Anwar-Said commented 3 years ago

Is it possible to reproduce the results on the PROTEINS dataset with the latest torch_sparse version?

cszhangzhen commented 3 years ago

Hi, I'm afraid not, since spspmm does not support autograd.

Anwar-Said commented 3 years ago

ok, thanks for your response.

Anwar-Said commented 3 years ago

Hi, I would be glad if someone could share the best parameter settings for reproducing the results on the graph classification datasets.

GageDeZoort commented 1 year ago

@cszhangzhen @lygztq @Anwar-Said It may now be possible for the gradients to propagate through the TwoHopNeighborhood call, using something like this:


import torch
from torch_sparse import coalesce  # old torch_sparse functional API: coalesce(index, value, m, n, op)


class TwoHopNeighborhood(object):
    def __call__(self, data):
        edge_index, edge_attr = data.edge_index, data.edge_attr
        n = data.num_nodes

        # A large fill value marks two-hop entries so they can be zeroed later.
        fill = 1e16
        value = edge_index.new_full((edge_index.size(1),), fill, dtype=torch.float, requires_grad=True)

        # Square the adjacency with torch.sparse.mm, which supports autograd,
        # instead of torch_sparse.spspmm, which no longer does.
        index = torch.sparse_coo_tensor(edge_index, value, (n, n)).coalesce()
        c = torch.sparse.mm(index, index).coalesce()
        row, col = c.indices()[0], c.indices()[1]
        index = torch.stack([row, col], dim=0)
        value = c.values()

        # Append the two-hop edges to the original edge list.
        edge_index = torch.cat([edge_index, index], dim=1)
        if edge_attr is None:
            data.edge_index, _ = coalesce(edge_index, None, n, n)
        else:
            # Broadcast the two-hop values to the shape of edge_attr.
            value = value.view(-1, *[1 for _ in range(edge_attr.dim() - 1)])
            value = value.expand(-1, *list(edge_attr.size())[1:])
            edge_attr = torch.cat([edge_attr, value], dim=0)

            # op='min' keeps the smaller (original) attribute on duplicated
            # edges; entries still holding the fill value are pure two-hop
            # edges, so reset them to 0.
            data.edge_index, edge_attr = coalesce(edge_index, edge_attr, n, n, op='min')
            edge_attr[edge_attr >= fill] = 0
            data.edge_attr = edge_attr

        return data

    def __repr__(self):
        return '{}()'.format(self.__class__.__name__)
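For what it's worth, here is a quick sanity check that gradients reach the edge attributes (the toy 3-node path graph is made up for illustration):

import torch
from torch_geometric.data import Data

edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
edge_attr = torch.rand(4, requires_grad=True)
data = Data(edge_index=edge_index, edge_attr=edge_attr, num_nodes=3)

data = TwoHopNeighborhood()(data)
# If autograd flows through the transform, backward() populates edge_attr.grad.
data.edge_attr.sum().backward()
print(edge_attr.grad is not None)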

I'm still not seeing the exact results quoted in the paper, but I get values close to them, provided I use the hyperparameters given in the default main.py.

cszhangzhen commented 1 year ago

@GageDeZoort Thanks for your code. I will try it. Thanks.

GageDeZoort commented 1 year ago

Awesome, let me know if it helps - I'm very interested in training the structure learning component and reproducing your excellent results!

mperozek11 commented 1 year ago

@GageDeZoort Thanks so much for posting! Your fix seems to be working for me. However, training is extremely slow relative to a simple GCN, even with a similar number of parameters. Is this expected? @cszhangzhen