Shen-Lab / GraphCL

[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
MIT License

What is the meaning of "training from scratch" mentioned in the paper? #19

Closed. susheels closed this issue 3 years ago.

susheels commented 3 years ago

Is it training the downstream SVM model with 1) random embeddings, 2) embeddings from an untrained GNN, or 3) embeddings from a fully supervised trained GNN?

I want to know if it is any of the above or something different.

yyou1996 commented 3 years ago

Hi @susheels,

It is fine-tuning the GNN without pre-training.
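
In other words, the baseline runs the same fine-tuning pipeline but starts the encoder from random initialization instead of loading a GraphCL checkpoint. A minimal sketch, assuming PyTorch Geometric; the `Encoder` class, dimensions, and checkpoint path below are illustrative, not the repo's actual code:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GINConv, global_add_pool

# Illustrative GIN encoder; the repo's actual encoder class differs.
class Encoder(nn.Module):
    def __init__(self, in_dim, hid_dim, num_layers=3):
        super().__init__()
        self.convs = nn.ModuleList()
        dims = [in_dim] + [hid_dim] * num_layers
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            mlp = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU(),
                                nn.Linear(d_out, d_out))
            self.convs.append(GINConv(mlp))

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = torch.relu(conv(x, edge_index))
        return global_add_pool(x, batch)  # graph-level embedding

encoder = Encoder(in_dim=7, hid_dim=32)  # random init: this is "from scratch"
# Pre-trained fine-tuning would additionally load a checkpoint here, e.g.:
# encoder.load_state_dict(torch.load("graphcl_pretrained.pt"))  # hypothetical path
head = nn.Linear(32, 2)  # prediction head for a 2-class downstream task
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
# ...then train encoder + head end-to-end on the labeled downstream graphs.
```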

susheels commented 3 years ago

Hi @yyou1996, thanks for the reply. Just a follow-up.

Q1) Is the fine-tuning done with all training labels, or with 1% or 10% of them?

Q2) If it is fine-tuning, is a linear or non-linear prediction layer used after the GNN?

Q3) Since GraphCL uses an SVM as the prediction model, is the training-from-scratch method you use, i.e. fine-tuning a GNN, a valid baseline? It might just reflect the difference in prediction-layer capacity rather than the contrastive learning itself: an SVM prediction layer is more powerful than a simple linear layer trained with SGD, and in my experiments an SVM gives better numbers than a linear model (see the sketch after this comment). I think GraphCL should use a linear model for the final evaluation. What are your thoughts?

Thanks
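
To make the concern concrete: score the same frozen embeddings with two different read-outs and compare. A minimal sketch assuming scikit-learn, where `Z` and `y` are random stand-ins rather than real GraphCL embeddings:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Score the SAME frozen embeddings with two read-out classifiers; any gap
# between the two printed accuracies reflects read-out capacity, not
# embedding quality.
rng = np.random.default_rng(0)
Z = rng.normal(size=(188, 32))    # placeholder graph embeddings
y = rng.integers(0, 2, size=188)  # placeholder binary labels

for name, clf in [("linear (logistic regression)", LogisticRegression(max_iter=1000)),
                  ("SVM (RBF kernel)", SVC(C=10.0))]:
    acc = cross_val_score(clf, Z, y, cv=10).mean()
    print(f"{name}: {acc:.3f}")
```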

yyou1996 commented 3 years ago

A1) Yes. For details, please see the descriptions of the different settings in the paper.

A2) It depends on the previous SOTA pipelines we followed. Again, we refer you to the experiment-setting descriptions, which state which SOTA setting each result is based on.

A3) The SVM is only used in the unsupervised-learning setting, following InfoGraph. I fully agree that it is a stronger classifier than a linear one, but for a fair comparison and competitive performance we keep the evaluation the same as in the previous SOTA.
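
For reference, the InfoGraph-style protocol A3 refers to scores frozen graph embeddings with a grid-searched linear SVM under cross-validation. A minimal sketch assuming scikit-learn; the C grid and fold counts below are common choices, not necessarily the paper's exact ones:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import GridSearchCV, cross_val_score

def svm_evaluate(Z, y):
    # Inner grid search picks C; outer 10-fold CV reports mean accuracy.
    params = {"C": [1e-3, 1e-2, 1e-1, 1, 10, 100, 1000]}
    svm = GridSearchCV(LinearSVC(max_iter=10000), params, cv=5, n_jobs=-1)
    return cross_val_score(svm, Z, y, cv=10).mean()

rng = np.random.default_rng(0)
Z = rng.normal(size=(188, 32))    # placeholder frozen embeddings
y = rng.integers(0, 2, size=188)  # placeholder labels
print(f"10-fold SVM accuracy: {svm_evaluate(Z, y):.3f}")
```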

Hope my answers help, thanks!