snap-stanford / GraphGym

Platform for designing and evaluating Graph Neural Networks (GNN)

Just the validation performance reported? #28

Closed Theheavens closed 3 years ago

Theheavens commented 3 years ago

I see in Section 7.2 (Experimental Setup) of the paper:

For all the experiments in Sections 7.3 and 7.4, we use a consistent setup, where results on three random 80%/20% train/val splits are averaged, and the validation performance in the final epoch is reported.

In Sections 7.3 and 7.4, the performance used in the ranking analysis is always the validation performance. First, "the validation performance in the final epoch" means training runs through all epochs and the last epoch's validation result is reported, am I right? Second, I am wondering why we don't use early stopping and report the test performance mentioned below, i.e., the test performance at the best validation epoch.

how to report the performance (e.g., final epoch or the best validation epoch) in Section 7.1.

JiaxuanYou commented 3 years ago

In Sections 7.3 and 7.4, the performance used in the ranking analysis is always the validation performance. First, "the validation performance in the final epoch" means training runs through all epochs and the last epoch's validation result is reported, am I right?

Yes, that is how the experiments were done in our "Design Space for GNN" paper. The main reason is that we wanted to inspect the effect of the number of training epochs as well. If early stopping is used, the training-epoch parameter does not make a difference.

Second, I am wondering why we don't use early stopping and report the test performance mentioned below, i.e., the test performance at the best validation epoch.

In practice, we should focus on the epoch where the validation performance is best. GraphGym conveniently provides this in "test_best.csv".
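To make the distinction concrete, here is a minimal sketch (not GraphGym's actual implementation) of the two reporting schemes: taking the final-epoch metric versus selecting the epoch with the best validation performance and reporting the test metric at that epoch. The function name and the per-epoch accuracy lists are hypothetical illustrations.

```python
# Hypothetical per-epoch metrics; in GraphGym these would come from the
# logged training statistics, not from hand-written lists.
def best_val_epoch_test(val_acc, test_acc):
    """Return (best validation epoch, test accuracy at that epoch)."""
    best_epoch = max(range(len(val_acc)), key=lambda e: val_acc[e])
    return best_epoch, test_acc[best_epoch]

val = [0.60, 0.72, 0.75, 0.74]   # validation accuracy per epoch (made up)
test = [0.58, 0.70, 0.73, 0.76]  # test accuracy per epoch (made up)

# Final-epoch reporting (what the paper's ranking analysis uses, on val):
final_val = val[-1]                       # -> 0.74

# Best-validation-epoch reporting (what "test_best.csv" corresponds to):
epoch, acc = best_val_epoch_test(val, test)  # -> (2, 0.73)
```

Note that the best-validation epoch (2 here) need not be the epoch with the best test accuracy; selecting on validation avoids peeking at the test set.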