Closed: hxu105 closed this issue 1 year ago
Hi! Thanks for raising this issue!
By default, we use 4 A100 GPUs for pretraining and finetuning our model. The batch size per GPU on the downstream tasks is set to 2, which is equivalent to batch_size = 2 × 4 = 8 on a single GPU. This hyperparameter may have a large influence on your final results.
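To make the arithmetic above explicit: under data parallelism each GPU processes its own mini-batch per step, so one optimizer update covers the per-GPU batch size times the number of GPUs. A minimal sketch:

```python
# Effective batch size under multi-GPU data parallelism: each GPU
# processes its own mini-batch per step, so a single optimizer update
# covers batch_size_per_gpu * num_gpus samples.
def effective_batch_size(batch_size_per_gpu: int, num_gpus: int) -> int:
    return batch_size_per_gpu * num_gpus

# The default setting described here: batch size 2 per GPU on 4 GPUs.
print(effective_batch_size(2, 4))  # → 8
```

So matching the paper on a single GPU means matching this effective value of 8, not the per-GPU value of 2.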
I suggest following the default setting and using 4 GPUs. If you do, the F1 max should reach 0.8 on EC within 20 epochs. If you want to run the model on a single GPU, you can set the batch size to 8 and lower the hidden dimension. Though I haven't tried this setup, I believe it can still achieve good performance.
Thanks for the answer; I will try to reproduce the experiments again with more GPUs.
Hi! I recently found that the scheduler is important for the performance. I've added it back into the codebase in 437333f and updated a config file for a single GPU on EC, which can reproduce the results in the paper.
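For anyone reading along, here is an illustrative sketch of the kind of step-decay learning-rate schedule meant here; the `step_size` and `gamma` values are made up for demonstration and may differ from the actual scheduler restored in 437333f:

```python
# Illustrative StepLR-style schedule (not the repo's actual scheduler):
# multiply the base learning rate by `gamma` every `step_size` epochs.
# step_size=5 and gamma=0.5 are assumed values for demonstration only.
def step_lr(base_lr: float, epoch: int, step_size: int = 5, gamma: float = 0.5) -> float:
    """Return the learning rate at a given epoch under step decay."""
    return base_lr * (gamma ** (epoch // step_size))

# Learning rate over the first 15 epochs, starting from base_lr=8.0.
lrs = [step_lr(8.0, e) for e in range(15)]
```

Without such decay, training tends to plateau at a worse optimum, which is consistent with the gap reported below.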
I re-ran the code after fixing the scheduler issue and attach the log file here for EC with a single GPU for your reference. gearnet_edge_ec_1gpu.txt
Hi, I am trying to reproduce the experiments, but the reproduced results show large gaps from the paper results.

Reproduced:

GearNet:
- EC: 0.514 (200 epochs)
- GO-BP: 0.176 (146 epochs)
- GO-CC: 0.145 (84 epochs)

GearNet-Edge:
- EC: 0.404 (163 epochs)
- GO-BP: 0.255 (100 epochs)
- GO-CC: 0.163 (107 epochs)
I use the same configuration and hyperparameters as provided in the repo. Training runs on a single GPU, and some of the experiments are still training.
Many thanks