Closed gxglxy closed 7 months ago
Hi,
We will be adding a notebook soon explaining this and data preparation. But for now, here is a set of steps you can follow.
The `parquet` and `npz` files must be put in the `data/PCQM` directory. You can also download them by running the following command: `bash download_data.sh`
Download the `models` directory from the Hugging Face repository. The raw weights are contained in the `model_state.pt` files in the checkpoint directories. Then run: `python make_predictions.py configs/pcqm/tgt_at_200m/pcqm_dist_pred/tgt_at_100m_rdkit.yaml`
This will create a predictions directory (e.g. `bins50`) in the model directory, containing the predictions for the training and validation sets. To reduce the number of distance samples (and thus save time and disk space), add the argument `prediction_samples: 10` (we used 50 samples; you can increase it during the final inference to get better results).
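For orientation, the override would sit as a top-level key in the prediction config file. This is only a hypothetical sketch of the fragment; the surrounding file contents are assumptions, and only `prediction_samples` itself comes from the instructions above:

```yaml
# Fragment of configs/pcqm/tgt_at_200m/pcqm_dist_pred/tgt_at_100m_rdkit.yaml
# (hypothetical placement; other keys in the file are not shown)
prediction_samples: 10   # authors used 50; increase for final inference
```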
Then run the evaluation: `python do_evaluations.py configs/pcqm/tgt_at_200m/pcqm_gap_pred/tgt_at_100m_rdkit.yaml`
The results will be printed to the console and also saved in the predictions directory.
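For reference, the MAE figure the evaluation reports is a plain mean absolute error over predicted and true HOMO-LUMO gaps. The sketch below is a generic illustration with made-up numbers, not the project's actual evaluation code:

```python
import numpy as np

def mean_absolute_error(pred, target):
    """Mean absolute error between predictions and targets (here, gaps in eV)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(pred - target)))

# Toy example with made-up gap values (eV)
pred = [3.1, 4.0, 2.7]
target = [3.0, 4.2, 2.5]
print(mean_absolute_error(pred, target))  # mean of |0.1|, |-0.2|, |0.2|
```

A leaderboard entry such as 0.0671 eV MAE corresponds to this metric computed over the full validation split.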
We have now added instructions for data preparation and inference. Please refer to the README for details.
Hi authors,
Thank you for open-sourcing this cool work! I want to try your models with the provided checkpoints. Could you also provide complete instructions for reproducing the results on the OGB-LSC leaderboard (e.g., an MAE of 0.0671 eV on the validation set) with the available checkpoints?
Looking forward to your reply!
Best