Shen-Lab / GraphCL

[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
MIT License

Would the data augmentation be label-preserving? #15

Closed · ha-lins closed this issue 3 years ago

ha-lins commented 3 years ago

Hi @Yuning You,

As the title shows, I have a question about the data augmentation in the unsupervised_graph_TU experiments. If we drop an edge or a node and it happens to lie in a structural motif, that can drastically change the attributes/label of the molecule. Could you please give some explanation? Thanks!
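
For concreteness, here is a minimal sketch of what I mean by uniform edge dropping (my own toy illustration, not the repo's actual aug code):

```python
import numpy as np

def drop_edges(edge_index, drop_ratio=0.1, seed=None):
    """Uniformly drop a fraction of edges from a graph.

    edge_index: (2, E) array of source/target node indices.
    Because dropping is uniform, an edge inside a structural motif
    (e.g. an aromatic ring) can be removed just as easily as any
    other edge, which may change the semantic label of the molecule.
    """
    rng = np.random.default_rng(seed)
    keep = rng.random(edge_index.shape[1]) >= drop_ratio
    return edge_index[:, keep]

# toy ring 0-1-2-3-0: dropping any one edge breaks the cycle
edges = np.array([[0, 1, 2, 3],
                  [1, 2, 3, 0]])
print(drop_edges(edges, drop_ratio=0.25, seed=0))
```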

yyou1996 commented 3 years ago

Hi @ha-lins,

Thanks for your comments. An augmentation encodes our prior belief about the data, i.e. which kinds of perturbation are rational with respect to the data distribution (see Table 1 in the paper). Thus, an improper augmentation (prior) will lead to a deterioration in performance, as you imagine, which is illustrated in the figures in the paper.
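
To make the mechanism concrete: training pulls two augmented views of the same graph together while pushing apart views of different graphs, so if an augmentation changes a graph's semantics, the loss still pulls the mismatched pair together. Below is a simplified NT-Xent-style sketch of that two-view objective (schematic only, not the exact training code):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Schematic NT-Xent between two batches of view embeddings (B, d).

    Row i of z1 and row i of z2 come from two augmentations of the same
    graph (the positive pair); the other rows in the batch act as
    negatives. An improper augmentation makes the positive pair
    semantically mismatched, which is how a bad prior hurts.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                 # (B, B) cosine similarities
    targets = torch.arange(z1.size(0))      # positives sit on the diagonal
    return F.cross_entropy(sim, targets)

# usage: embeddings of two augmented views from a shared GNN encoder
loss = nt_xent(torch.randn(8, 64), torch.randn(8, 64))
```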

ha-lins commented 3 years ago

Thanks for your response. I agree that improper augmentations may not be label-preserving and can lead to performance degradation.

Besides, have you ever tried augmentation ratios other than 10% in the unsupervised_TU experiments? I think the ratio could be an important hyper-parameter, and it directly affects how label-preserving the augmentation is.
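
For instance, a quick Monte-Carlo sketch (toy numbers of my own, not from the repo) shows how the ratio governs how often a fixed motif gets broken:

```python
import numpy as np

# Toy molecule: 12 edges total, 6 of which form a ring-like motif.
# Estimate how often uniform edge dropping removes at least one motif
# edge, as a function of the augmentation ratio.
rng = np.random.default_rng(0)
ring_edges, total_edges, trials = 6, 12, 20_000

for ratio in (0.05, 0.10, 0.20, 0.30):
    n_drop = max(1, round(ratio * total_edges))
    broken = 0
    for _ in range(trials):
        dropped = rng.choice(total_edges, size=n_drop, replace=False)
        broken += (dropped < ring_edges).any()
    print(f"ratio={ratio:.2f}: motif broken in {broken / trials:.1%} of samples")
```

In this toy setup the motif occupies half the edges, so even small ratios break it often; real datasets differ, but the ratio clearly mediates how label-preserving the augmentation is.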

I also find that GraphCL does not always significantly outperform the InfoGraph method across all graph classification datasets. I suggest that more experiments on larger benchmarks such as OGB would be more convincing.

yyou1996 commented 3 years ago

We have recently obtained some OGB results for GraphCL in my current project. I will update them gradually. I really appreciate your follow-up discussion.

ha-lins commented 3 years ago

Thanks a lot!