divelab / GOOD

GOOD: A Graph Out-of-Distribution Benchmark [NeurIPS 2022 Datasets and Benchmarks]
https://good.readthedocs.io/
GNU General Public License v3.0

How to obtain results of multiple runs #5

Closed GentleZhu closed 1 year ago

GentleZhu commented 2 years ago

Hi GOOD Team,

Thanks for the great library. I have successfully run goodtg, but I found that it only runs once and reports the best epoch on validation. What's the best way to reproduce the paper's results over 10 random runs?

CM-BF commented 2 years ago

Hi, GentleZhu,

You can get the 10 random runs by setting the exp_round argument; the 10 runs reported in the paper use exp_round values from 1 to 10.
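For example, keeping the config path as an XXX placeholder, one round is launched as:

goodtg --config_file XXX --exp_round 1

Repeating this with exp_round set to 1 through 10 on the same config gives the 10 random runs.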

Please let me know if you have any questions.

GentleZhu commented 2 years ago

I found that even when I set this parameter and config.exp_round > 1, goodtg (the load_task function) still only runs one round.

GentleZhu commented 2 years ago

Your code seems to store each round's results in a separate folder under storage.

LFhase commented 2 years ago

Hi GentleZhu and GOOD team, I have a similar question. It seems we have to manually check each folder from the different rounds. Is there any convenient way to aggregate the results?

CM-BF commented 2 years ago

Hi!

> I found that even when I set this parameter and config.exp_round > 1, goodtg (the load_task function) still only runs one round.

That's true, because we generally run goodtg in parallel: we use a simple script to launch all rounds simultaneously on different GPUs.

For example, you may generate the following commands, and pack them into a list cmd_args.

goodtg --exp_round 1 --gpu_idx 0 --config_file XXX
goodtg --exp_round 2 --gpu_idx 1 --config_file XXX
...
goodtg --exp_round 10 --gpu_idx 9 --config_file XXX

After that, you may find the subprocess package helpful for launching them:

import shlex, subprocess

cmd_args = [XXX, ..., XXX]  # one goodtg command string per round
for cmd in cmd_args:
    subprocess.Popen(shlex.split(cmd), close_fds=True, stdout=open('debug_out.log', 'a'), stderr=open('debug_error.log', 'a'), start_new_session=False)
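For completeness, here is a minimal sketch of one way to build such a cmd_args list programmatically (the config path stays a placeholder, and assigning round i to GPU i-1 assumes 10 GPUs are available):

config_file = 'XXX'  # placeholder for your config file, as above
cmd_args = [
    f'goodtg --exp_round {r} --gpu_idx {r - 1} --config_file {config_file}'
    for r in range(1, 11)
]

If you have fewer GPUs, you can simply map several rounds to the same gpu_idx.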

I believe the best way to launch your programs on GPUs also depends on your experiment environment (if you share computation resources with others, you cannot launch your jobs too aggressively).

Because all results are fully stored, you can aggregate them once the runs have finished.

BTW, if you only need to run experiments sequentially, you may find reproduce_round1 useful.

> It seems we have to manually check each folder from the different rounds. Is there any convenient way to aggregate the results?

Since the log file paths (and their directory structure) are determined entirely by your config parameters and log settings, you don't need to check each folder manually. After the experiments are completed, a separate script can read all the results. Note that, to make the results easy to read programmatically, each log file ends with a special line summarizing them.

You can set --log_file param1_param2_param3 to store results for different hyperparameter settings.
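For instance (the hyperparameter values in the log file name here are purely illustrative), a run could be launched as:

goodtg --config_file XXX --exp_round 1 --gpu_idx 0 --log_file lr0.001_layers3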

(A less common alternative is to aggregate results by reading the information stored in the model checkpoints.)
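As a rough sketch of such a result-collecting script (not part of the library: the storage/log root, the .log extension, and relying on the final summary line are assumptions that depend on your own log settings):

import pathlib

# assumed log root; adjust to wherever your logs are stored under the storage folder
log_root = pathlib.Path('storage/log')

# collect the final summary line of every finished log file
results = {}
for log_file in sorted(log_root.rglob('*.log')):
    lines = log_file.read_text().splitlines()
    if lines:
        results[str(log_file)] = lines[-1]  # the special result line at the end of each log

for path, summary in results.items():
    print(path, summary, sep='\t')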

We will share some convenient scripts of this kind after we reorganize them for more general use.

Please let me know if you have any questions. :smile:

CM-BF commented 1 year ago

Hi GentleZhu,

We have updated this project to version 1. You can now launch multiple jobs and collect their results easily. Please refer to the new README.

Please let me know if you have any questions.