jianhao2016 / AllSet

This is the GitHub repository for our ICLR22 paper: "You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks"

Seed reproducibility issue #11

Closed levtelyatnikov closed 1 year ago

levtelyatnikov commented 1 year ago

I have tried to make the experiment reproducible across multiple runs; in particular, I have fixed the splits and added a seed before model initialization, see the code:

    ### Training loop ###
    runtime_list = []
    for run in tqdm(range(args.runs)):
        start_time = time.time()
        split_idx = split_idx_lst[run]
        train_idx = split_idx['train'].to(device)

        # Seed initialization
        torch.manual_seed(seed=0)
        pl.seed_everything(seed=0)
        np.random.seed(seed=0)
        model.reset_parameters()

However, running the same CLI command multiple times with AllDeepSets and AllSetTransformer gives different results, while all the other models produce identical results. Can you please give me a clue what the issue is?

Thank you for your answers and help.

elichienxD commented 1 year ago

Hi @levtelyatnikov ,

My guess is that this is an issue with the older PyG version, which unfortunately is what we built on. To my understanding, the scatter function running on the GPU in the older PyG version has some intrinsic randomness in it. I forget the details, but I have also encountered this issue before (i.e., some RNG seed on the C++ side may also need to be fixed). I don't know whether the PyG authors have fixed this issue in later versions. At the moment, if you want to ensure the "exact" same result, you may run on the CPU, and that should resolve your problem.
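
As a rough illustration (a toy sketch, not code from this repository, and assuming a CUDA device is available), this is the kind of behaviour a scatter-style reduction can show on the GPU:

    import torch

    # Toy sketch: scatter-style aggregation with duplicate indices on the GPU.
    # Floating-point atomic adds may accumulate in a different order on every
    # call, so two identical calls are not guaranteed to be bit-identical.
    src = torch.randn(1_000_000, device='cuda')
    index = torch.randint(0, 16, (1_000_000,), device='cuda')

    out1 = torch.zeros(16, device='cuda').scatter_add_(0, index, src)
    out2 = torch.zeros(16, device='cuda').scatter_add_(0, index, src)

    print(torch.equal(out1, out2))  # may print False on the GPU, True on the CPU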

Still, I believe that you should get very similar results to our reported numbers. Please let me know if there is a huge discrepancy.

Best, Eli

levtelyatnikov commented 1 year ago

Your hint was correct: running on the CPU makes the results deterministic, but it slows down the development process. I am experimenting with some extensions to the AllDeepSets and AllSetTransformer models, so it would be great to solve this issue somehow.
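
One option that might be worth trying (a sketch assuming a reasonably recent PyTorch; note that kernels from separate extensions such as torch_scatter may not respect this flag) is to ask PyTorch for deterministic algorithms globally, so that ops with only a non-deterministic CUDA kernel fail loudly instead of silently varying:

    import os
    import torch

    # Sketch: request deterministic kernels globally (available since PyTorch 1.8).
    # Some cuBLAS routines additionally require this workspace setting.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    torch.use_deterministic_algorithms(True)
    # Ops that only have a non-deterministic CUDA implementation (e.g. some
    # scatter/index_add variants) will now raise a RuntimeError instead of
    # silently returning slightly different results.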

elichienxD commented 1 year ago

Hi @levtelyatnikov ,

You may want to check this PyG issue and this PyG document.

It seems that if you manage to turn every scatter operation into a sparse matrix product, then you can resolve the non-determinism issue. This is simple for a standard GNN, but I'm not sure whether it is easy for our AllSet framework.
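
For a concrete picture of that idea (a toy sketch with made-up shapes, not the actual AllSet code), the node-to-hyperedge aggregation that a scatter performs can equivalently be written as multiplication by a sparse incidence matrix, which avoids the atomic-add scatter kernel:

    import torch

    num_nodes, num_edges, dim = 5, 3, 4
    x = torch.randn(num_nodes, dim)

    # hyperedge_index[0] = node ids, hyperedge_index[1] = hyperedge ids
    hyperedge_index = torch.tensor([[0, 1, 2, 2, 3, 4],
                                    [0, 0, 1, 2, 2, 2]])

    # Scatter-style aggregation of node features into hyperedges
    # (non-deterministic on the GPU with the older kernels).
    out_scatter = torch.zeros(num_edges, dim).index_add_(
        0, hyperedge_index[1], x[hyperedge_index[0]])

    # Equivalent formulation as a sparse incidence matrix product.
    values = torch.ones(hyperedge_index.size(1))
    H = torch.sparse_coo_tensor(hyperedge_index.flip(0), values,
                                (num_edges, num_nodes))
    out_spmm = torch.sparse.mm(H, x)

    print(torch.allclose(out_scatter, out_spmm))  # True up to float rounding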

Best, Eli