Batching Graphs in PygPCQM4MDataset

Hi, thanks for preparing the processing code!

I was wondering whether splitting the SMILES graphs across separate torch files would be a feasible way to reduce the memory requirement. I notice that in the process() function of class PygPCQM4MDataset(InMemoryDataset), the graphs obtained from the SMILES strings are all combined into a single dataset and then torch.save'd into one file (only to be split again later across different dataloaders during training and testing?).

Since the graphs are all independent of each other, would it be possible to save them into several torch files instead, each holding a batch of graphs, so that only one batch has to sit in RAM at a time?
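To make the idea concrete, here is a rough sketch of the kind of sharding I have in mind. It is not the existing OGB code: `save_in_shards`, `iter_shards`, the `shard_size` parameter and the `shard_00000.pt` file naming are all hypothetical names made up for illustration. To keep it runnable without any dependencies, it uses `pickle.dump` / `pickle.load` as stand-ins and plain dicts as stand-ins for graph objects; I am assuming `torch.save` / `torch.load` could be passed in their place since they take the same (object, file) arguments, but I have not tested that.

```python
import os
import pickle
import tempfile
from typing import Any, Callable, Iterable, Iterator, List


def save_in_shards(graphs: Iterable[Any], out_dir: str, shard_size: int,
                   save_fn: Callable[[Any, Any], None] = pickle.dump) -> List[str]:
    """Write graphs to numbered shard files, buffering at most shard_size at once.

    save_fn is called as save_fn(list_of_graphs, open_binary_file).
    pickle.dump is only a stand-in here (assumption, see above).
    """
    if shard_size < 1:
        raise ValueError("shard_size must be at least 1")
    os.makedirs(out_dir, exist_ok=True)
    paths: List[str] = []
    buffer: List[Any] = []

    def flush() -> None:
        # Hypothetical naming scheme: shard_00000.pt, shard_00001.pt, ...
        path = os.path.join(out_dir, f"shard_{len(paths):05d}.pt")
        with open(path, "wb") as f:
            save_fn(buffer, f)
        paths.append(path)
        buffer.clear()  # release the graphs that were just written

    for graph in graphs:
        buffer.append(graph)
        if len(buffer) == shard_size:
            flush()
    if buffer:  # write the final, possibly smaller, shard
        flush()
    return paths


def iter_shards(paths: Iterable[str],
                load_fn: Callable[[Any], Any] = pickle.load) -> Iterator[Any]:
    """Yield graphs one by one, loading a single shard into memory at a time."""
    for path in paths:
        with open(path, "rb") as f:
            shard = load_fn(f)
        yield from shard


if __name__ == "__main__":
    # Toy stand-ins for graphs; a generator, so the full list never exists in memory.
    toy_graphs = ({"graph_id": i, "num_nodes": i + 1} for i in range(10))
    with tempfile.TemporaryDirectory() as tmp:
        shard_paths = save_in_shards(toy_graphs, tmp, shard_size=4)
        print(len(shard_paths), "shards written")
        print(sum(1 for _ in iter_shards(shard_paths)), "graphs read back")
```

Because the input is consumed as an iterator, peak memory during processing would be bounded by roughly one shard rather than the whole dataset. The trade-off is that random access across shards gets more awkward, so shuffling would probably need to happen per shard or via an index.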

Thanks!

snap-stanford / ogb, issue #148