Linnore opened this issue 1 week ago
I am organising the data preprocessing code of this repo into a PyG `Dataset` object. In the data splitting part of `data_loading.py`, `test_inds` is never used and the test set is assigned as the entire dataset (lines 100-106):
```python
te_edge_index, te_edge_attr, te_y, te_edge_times = edge_index, edge_attr, y, timestamps
...
te_data = GraphData(x=te_x, y=te_y, edge_index=te_edge_index, edge_attr=te_edge_attr, timestamps=te_edge_times)
```
Is this an intentional design of the data splitting? We should split the dataset 0.6/0.2/0.2, right?
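To make the question concrete, here is a minimal sketch of the 0.6/0.2/0.2 split I would have expected, where `test_inds` is actually consumed. The dummy data and the temporal ordering are my assumptions, not the repo's code; only the variable names are taken from the snippet above:

```python
import numpy as np

# Dummy stand-ins for the repo's variables (names follow the snippet above).
rng = np.random.default_rng(0)
n_edges = 100
edge_index = rng.integers(0, 20, size=(2, n_edges))
edge_attr = rng.random((n_edges, 4))
y = rng.integers(0, 2, size=n_edges)
timestamps = rng.integers(0, 1_000, size=n_edges)

# Hypothetical 0.6/0.2/0.2 split in chronological order.
order = np.argsort(timestamps)
tr_end, val_end = int(0.6 * n_edges), int(0.8 * n_edges)
tr_inds = order[:tr_end]        # first 60% of edges
val_inds = order[tr_end:val_end]  # next 20%
test_inds = order[val_end:]     # last 20%, actually used

# The test tensors would then be indexed with test_inds
# instead of aliasing the full arrays.
te_edge_index = edge_index[:, test_inds]
te_edge_attr = edge_attr[test_inds]
te_y = y[test_inds]
te_edge_times = timestamps[test_inds]
```

With a split like this, `te_data` would only contain the held-out 20% of edges rather than the whole dataset.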
Also note that the edges used for validation include both `tr_inds` and `val_inds`:
```python
e_val = np.concatenate([tr_inds, val_inds])
...
```
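If the split is meant to be temporal, maybe this part is intentional: the validation graph keeps the earlier train edges for message passing and only scores the validation edges, and the test graph analogously containing everything would be the same idea applied to the test step. A hypothetical sketch of that pattern, which is my guess at the intent and not necessarily what the repo does:

```python
import numpy as np

# Hypothetical index arrays standing in for the repo's split.
tr_inds = np.arange(0, 60)
val_inds = np.arange(60, 80)

# Edges visible during validation: past (train) edges are kept in the
# graph so message passing has context.
e_val = np.concatenate([tr_inds, val_inds])

# Boolean mask selecting only the val edges inside the validation graph,
# so the loss/metrics ignore the train edges that are just there for context.
val_mask = np.zeros(e_val.shape[0], dtype=bool)
val_mask[len(tr_inds):] = True
```

Is this the intended reading, or is the test set really supposed to be the full dataset?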