The main purpose of this PR is to adapt the repo so that experiments can be run on Google Colab. These adaptations will likely be removed from the final version, as many people do not have access to Colab.
The secondary purpose is to add the scripts for the experiments that we reported in the paper.
This PR also adds model checkpointing, which is crucial for longer training runs.
Finally, this PR changes the preprocessed data storage format from a directory of .json files to a single .jsonl file. We made this change because QM9 used to produce 130K .json files, which are inefficient to iterate over and use a lot of resources on the cluster.
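For readers unfamiliar with the .jsonl layout, here is a minimal sketch of the idea: one JSON object per line, written once and read back lazily. The helper names (`write_jsonl`, `read_jsonl`), the file name, and the record fields are hypothetical and are not taken from this repo's preprocessing code.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line to a single file."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Lazily yield one record per non-empty line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


# Hypothetical records standing in for preprocessed molecules.
records = [{"id": 0, "smiles": "C"}, {"id": 1, "smiles": "N"}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "preprocessed.jsonl"
    write_jsonl(records, path)
    loaded = list(read_jsonl(path))
```

Because the reader is a generator, iterating over the dataset opens a single file instead of one file per molecule.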
List of commits
torch.compile: ad228f4
README with instructions to run experiments: d140c88
.jsonl: 52e0e3d
Notes
We tried replacing torch_scatter with a custom scatter_add to remove the dependency, but ultimately brought back the torch_scatter version.
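For context, a dependency-free scatter_add can be sketched in plain PyTorch roughly as follows. This is an illustrative assumption about what such a replacement might look like, not the implementation that was tried in this PR; the signature and the toy data are hypothetical, and it only handles scattering along dimension 0.

```python
import torch


def scatter_add(src: torch.Tensor, index: torch.Tensor, dim_size: int) -> torch.Tensor:
    """Sum the rows of `src` that share the same entry in `index` (along dim 0)."""
    out = torch.zeros((dim_size, *src.shape[1:]), dtype=src.dtype, device=src.device)
    # index_add_ accumulates src[i] into out[index[i]].
    return out.index_add_(0, index, src)


# Hypothetical toy data: three node features pooled into two graphs.
src = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
index = torch.tensor([0, 1, 0])
pooled = scatter_add(src, index, dim_size=2)
```

Rows 0 and 2 are both assigned to graph 0, so they are summed together, while row 1 ends up alone in graph 1.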