Bug Fix: Preprocessing and Training with the OGB PCQM4Mv2 Dataset

We have identified several issues with the original implementation in examples/ogb/train_gap.py that require modifications:

Incompatibility with the Current OGB PCQM4Mv2 Dataset: The current version of the OGB PCQM4Mv2 dataset includes atoms that are not listed in ogb_node_types, as well as entries with empty labels. The preprocessing code now skips these incompatible entries.
Failed Instantiation of AdiosDataset: The code instantiated AdiosDataset with incompatible parameters. The opt dictionary must be unpacked before being passed, so that its entries arrive as individual arguments.
Broadcasting Over 2GB of Data with MPI: The Adios_writer class occasionally attempts to broadcast more than 2GB of data in a single message, which exceeds the MPI message count limit. We have implemented a chunk-based broadcasting function to address this issue.
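
As a rough illustration of the preprocessing fix, the filter below drops entries that contain an unknown atom type or have no usable label. This is a minimal sketch under stated assumptions: `KNOWN_NODE_TYPES`, `is_usable`, and the sample records are hypothetical names and values, the real ogb_node_types mapping may look different, and treating `None` or NaN as an "empty" label is an assumption.

```python
import math

# Hypothetical subset of element symbols; the real ogb_node_types mapping
# lives in examples/ogb/train_gap.py and may differ.
KNOWN_NODE_TYPES = {"H": 0, "C": 1, "N": 2, "O": 3}


def is_usable(symbols, label, known_types=KNOWN_NODE_TYPES):
    """Return True if every atom is known and the label holds a real value."""
    if label is None:
        return False
    if isinstance(label, float) and math.isnan(label):
        return False
    return all(sym in known_types for sym in symbols)


# Illustrative records as (atom symbols, label); not taken from the dataset.
records = [
    (["C", "H", "H", "H", "H"], 4.2),
    (["C", "Xe"], 3.1),               # atom type missing from the mapping
    (["O", "H", "H"], float("nan")),  # empty label (assumed to appear as NaN)
    (["N", "H", "H", "H"], None),     # empty label (assumed to appear as None)
]

# Only entries that pass both checks are kept for preprocessing.
kept = [rec for rec in records if is_usable(*rec)]
```

Skipping an entry, rather than mapping unknown atoms to a fallback type, keeps the node feature encoding unchanged for the entries that remain.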
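
The AdiosDataset fix comes down to how Python binds a dictionary passed as an argument. The sketch below uses `ExampleDataset`, a hypothetical stand-in whose parameters are invented for illustration; it does not reproduce the real AdiosDataset signature.

```python
class ExampleDataset:
    """Hypothetical stand-in; not the real AdiosDataset signature."""

    def __init__(self, filename, label, preload=False, shmem=False):
        self.filename = filename
        self.label = label
        self.preload = preload
        self.shmem = shmem


# Hypothetical option names, chosen only for this example.
opt = {"preload": True, "shmem": False}

# Passing the dictionary itself binds the whole dict to `preload`.
wrong = ExampleDataset("data.bp", "trainset", opt)

# Unpacking forwards each entry as its own keyword argument.
right = ExampleDataset("data.bp", "trainset", **opt)
```

In the first call the dictionary silently lands in a single parameter, or raises a TypeError if the constructor accepts no further positional arguments, which is why the unpacked form is needed.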
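
The chunk-based broadcast can be sketched as follows. This is not the function added in this PR: `bcast_in_chunks`, its `chunk_bytes` parameter, and the `LocalComm` stand-in are hypothetical names. The sketch assumes a communicator that exposes `Get_rank()` and a pickle-based `bcast(obj, root)`, as mpi4py communicators do, and it serializes the object once and sends it in pieces that each stay well below 2GB.

```python
import pickle


def bcast_in_chunks(comm, obj, root=0, chunk_bytes=2**30):
    """Broadcast a picklable object from `root` in pieces of at most chunk_bytes."""
    rank = comm.Get_rank()
    payload = None
    if rank == root:
        payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

    # Every rank needs the total size so that all ranks run the same number of rounds.
    nbytes = comm.bcast(len(payload) if rank == root else None, root=root)

    pieces = []
    for start in range(0, nbytes, chunk_bytes):
        piece = payload[start:start + chunk_bytes] if rank == root else None
        pieces.append(comm.bcast(piece, root=root))

    # Reassemble the byte stream and rebuild the object on every rank.
    return pickle.loads(b"".join(pieces))


class LocalComm:
    """Hypothetical single-process stand-in for dry runs without MPI."""

    def __init__(self):
        self.calls = 0

    def Get_rank(self):
        return 0

    def bcast(self, obj, root=0):
        self.calls += 1
        return obj


# Dry run with a tiny chunk size to force several rounds.
data = {"graphs": list(range(1000))}
comm = LocalComm()
result = bcast_in_chunks(comm, data, chunk_bytes=256)
```

In a real run, `comm` would be an mpi4py communicator such as `MPI.COMM_WORLD`, and every rank would call `bcast_in_chunks` with the same `root` and `chunk_bytes`; non-root ranks can pass `None` for `obj`.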

These bug fixes are essential for later integrating our DeepSpeed and pipeline-parallelism implementations, which use the OGB PCQM4Mv2 dataset as an example.

ORNL / HydraGNN, pull request #262