williamleif / GraphSAGE

Representation learning on large graphs using stochastic graph convolutions.

Can I adapt the code to multi-label classification? #57

Open zhiqiangzhongddu opened 5 years ago

zhiqiangzhongddu commented 5 years ago

Hi,

I have a multi-label classification problem, where one node can have multiple labels. Do I need to change the multi-class classification code? Thanks.

RexYing commented 5 years ago

You only need to change the loss function. Rather than a single softmax followed by cross-entropy (CE) or max-margin loss, you use several such heads, one per label, and sum their losses.

I think it makes sense to use the same GraphSAGE embedding to predict multiple labels, unless you have clear prior knowledge that each label depends on very different sources of information and you want to learn each one separately.
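As a rough illustration of the loss change described above, here is a minimal sketch in plain Python. It assumes every label is binary, so each head reduces to a sigmoid (equivalent to a two-way softmax) with cross-entropy, and the per-label losses are summed for each node. The names `sigmoid_cross_entropy` and `multilabel_loss` are hypothetical and are not taken from the GraphSAGE code.

```python
import math


def sigmoid_cross_entropy(logit, target):
    """Numerically stable binary cross-entropy for one logit and a 0/1 target."""
    # Equivalent to -[z*log(s) + (1-z)*log(1-s)] with s = sigmoid(logit).
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def multilabel_loss(logits, targets):
    """Sum the per-label losses for each node, then average over the batch."""
    per_node = [
        sum(sigmoid_cross_entropy(x, z) for x, z in zip(node_logits, node_targets))
        for node_logits, node_targets in zip(logits, targets)
    ]
    return sum(per_node) / len(per_node)


# Two nodes, three labels; a node may carry several labels at once.
# In practice the logits would come from the shared node embedding.
logits = [[2.0, -1.0, 0.5], [-3.0, 4.0, 1.5]]
targets = [[1, 0, 1], [0, 1, 1]]
loss = multilabel_loss(logits, targets)
```

All heads here share the same input, which matches the suggestion to predict every label from one embedding; only the final loss computation differs from the single-softmax case.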

zhiqiangzhongddu commented 5 years ago

Thanks, that makes sense. Have you tried it on any multi-label dataset? In my results and my friend's, SAGE does not work as well as expected on multi-label classification tasks.

RexYing commented 5 years ago

What baseline are you comparing GraphSAGE against? A multi-label task can be harder if an instance counts as correctly classified only when all of its labels are predicted correctly.

Did you compare with a baseline that is also a multi-label classification algorithm?

I do not yet see why multi-label tasks are less suitable for GraphSAGE.
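To make the point about evaluation concrete, the sketch below contrasts exact-match (subset) accuracy, where a node counts only if every label is right, with micro-averaged F1, which gives credit per label. The toy labels and the function names `subset_accuracy` and `micro_f1` are illustrative assumptions, not results or code from this thread.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of nodes whose whole label vector is predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed over all (node, label) pairs pooled together."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: three nodes, three binary labels each.
y_true = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
y_pred = [[1, 0, 1], [0, 1, 0], [1, 0, 0]]

exact = subset_accuracy(y_true, y_pred)  # only the first node matches fully
micro = micro_f1(y_true, y_pred)         # most individual labels are correct
```

On this toy data the exact-match score is much lower than the micro-averaged one, so the two methods being compared should be scored with the same metric.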

zhiqiangzhongddu commented 5 years ago

Sorry for the late reply. I use doc2vec + logistic regression as the baseline: I apply doc2vec to the node attributes (words) and use the output as the node embedding. For SAGE, I use the doc2vec embeddings as the node attributes. But the node embeddings produced by SAGE perform worse than the raw doc2vec embeddings with the same classifier.

This is what confuses me. I tried tuning the model (number of layers, batch size, embedding size, etc.), but it did not help. There might be two reasons: either my network is not suitable for SAGE, or SAGE does not work well for multi-label classification.