Ojda22 closed 3 years ago
Could you post more information on what you tried and what errors were returned? Thanks.
I ran main.py --cfg configs/example_cpu.yaml --repeat 3 after adding fixed_split=True here:
https://github.com/snap-stanford/GraphGym/blob/d207269ae0fbb3493fdb2f1029a96cf8b17a4849/graphgym/loader.py#L74
in order to keep the fixed split of the Cora dataset.
Here is the stacktrace:
  File "~/GraphGym/run/main.py", line 42, in <module>
    datasets = create_dataset()
  File "~/GraphGym/graphgym/loader.py", line 226, in create_dataset
    datasets = dataset.split(
  File "~/anaconda3/envs/graph/lib/python3.8/site-packages/deepsnap/dataset.py", line 1079, in split
    self._split_transductive(
  File "~/anaconda3/envs/graph/lib/python3.8/site-packages/deepsnap/dataset.py", line 735, in _split_transductive
    split_graph = graph.split(
  File "~/anaconda3/envs/graph/lib/python3.8/site-packages/deepsnap/graph.py", line 1182, in split
    return self._split_node(split_ratio, shuffle=shuffle)
  File "~/anaconda3/envs/graph/lib/python3.8/site-packages/deepsnap/graph.py", line 1258, in _split_node
    graph_new.node_label = self.node_label[nodes_split_i]
IndexError: index 273 is out of bounds for dimension 0 with size 140
I'm not sure whether this is the right way to go or whether I'm missing something.
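One possible reading of the IndexError above, offered as a guess rather than a confirmed diagnosis: Cora has 2708 nodes, and its fixed training mask selects 140 of them, which matches the "size 140" in the message. If the labels have already been reduced to the training nodes while a later random split still draws positions from the full node range, an index such as 273 cannot be looked up. The sketch below reproduces only that mismatch in plain Python. The helper out_of_range and both "Assumption" steps are hypothetical and are not DeepSNAP code.

```python
# Hypothetical reproduction of the size mismatch; this is not DeepSNAP code.
import random

NUM_NODES = 2708        # number of nodes in Cora
NUM_TRAIN_LABELS = 140  # nodes selected by Cora's fixed training mask


def out_of_range(indices, num_labels):
    """Return the indices that cannot address a label list of length num_labels."""
    return [i for i in indices if i >= num_labels]


# Assumption: the fixed split already reduced the labels to the training nodes.
labels = [0] * NUM_TRAIN_LABELS

# Assumption: a later random split still permutes all node positions.
rng = random.Random(0)
permutation = list(range(NUM_NODES))
rng.shuffle(permutation)
split_indices = permutation[:int(0.8 * NUM_NODES)]

bad = out_of_range(split_indices, len(labels))
print(f"{len(bad)} of {len(split_indices)} indices exceed size {len(labels)}")

try:
    selected = [labels[i] for i in split_indices]
except IndexError as err:
    # Plain lists report "list index out of range"; the wording differs from
    # the tensor error in the stack trace, but the cause is the same mismatch.
    print("IndexError:", err)
```

If this reading is right, the fixed split and the random split are both being applied to the same graph, which would explain why the default setting runs without error.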
Hi, if you set fixed_split=False, do you encounter the same error?
Doing so can help me find the bug, thanks!
Hello,
With fixed_split=False (the default argument) it works.
However, it doesn't work as expected. Let me try to be more precise about the problem. When running GraphGym/run/main.py, there is a fixed_split : bool argument that regulates the split, but when it is switched to True, it throws the error above ☝️
However, when running the Cora dataset with GraphGym/run/main_pyg.py, it seems that the split is performed according to the masks (for those who want to keep fixed-size splits).
Hello, what you describe is right; thanks for the summary.
By default, GraphGym has used the DeepSNAP backend (example in main.py), which automatically assumes random splitting of datasets. When PyG datasets are loaded through the DeepSNAP backend, their fixed splits are discarded. That is the default behavior of DeepSNAP.
I recently created a PyG backend (example in main_pyg.py), which adopts the fixed splits in the PyG datasets. That backend is PyG native and does not perform any conversion to the DeepSNAP format.
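To make the difference between the two behaviors concrete, here is a toy sketch in plain Python. It is only an illustration under stated assumptions: split_by_masks and split_randomly are hypothetical helpers, not GraphGym, DeepSNAP, or PyG functions, and the 10-node masks are made-up data. The first helper keeps whatever the masks say; the second ignores the masks and cuts a shuffled node order by ratio.

```python
import random


def split_by_masks(masks):
    """Turn boolean masks into lists of node indices, one list per split."""
    return {name: [i for i, keep in enumerate(mask) if keep]
            for name, mask in masks.items()}


def split_randomly(num_nodes, ratios, seed=0):
    """Ignore any masks and cut a shuffled node order by the given ratios."""
    order = list(range(num_nodes))
    random.Random(seed).shuffle(order)
    splits, start = [], 0
    for ratio in ratios[:-1]:
        stop = start + int(ratio * num_nodes)
        splits.append(order[start:stop])
        start = stop
    splits.append(order[start:])  # the last split takes the remainder
    return splits


# Toy graph with 10 nodes and hand-written masks (hypothetical data).
masks = {
    "train": [True, True, False, False, False, False, False, False, False, False],
    "val":   [False, False, True, True, False, False, False, False, False, False],
    "test":  [False, False, False, False, True, True, True, False, False, False],
}

print(split_by_masks(masks))               # splits follow the masks exactly
print(split_randomly(10, [0.8, 0.1, 0.1]))  # masks play no role here
```

With masks, nodes that belong to no mask stay out of every split and the split sizes are whatever the dataset author chose. With a random split, every node lands in exactly one split and the sizes follow the ratios.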
Hope this can further clarify your questions. Please let me know if you need further help.
Hello,
I am trying to load a dataset and keep its existing split, since the masks already exist.
I realized there exists an argument that controls this:
However, when I run it, it always fails with errors. I'm not sure whether this feature is fully implemented or is yet to be done.
I would appreciate your help and instructions on how I can accomplish this. Best,