microsoft / Graphormer

Graphormer is a general-purpose deep learning backbone for molecular modeling.
MIT License
2.08k stars, 334 forks

3D graphormer model on PCQM4M-v2 #135

Open Sangyeup opened 2 years ago

Sangyeup commented 2 years ago

Hi.

It seems that the 3D Graphormer code currently supports only the oc20 dataset. Am I right?

If so, do you plan to upload 3D Graphormer model code for the PCQM4M-v2 dataset?

Thank you.

mavisguan commented 2 years ago

Hi! As for 3D pretrained models, we have only released a model pretrained on the oc20 dataset. But the Graphormer 3D model we provide (in graphormer/models/graphormer_3d.py) can be trained on any 3D molecular dataset, including PCQM4M-v2. You can train it using a training script similar to examples/oc20/oc20.sh.

Sangyeup commented 2 years ago

I see. Then where can I find preprocessing code that converts the PCQM4M-v2 dataset into a format that fits graphormer_3d.py?

    def forward(self, atoms: Tensor, tags: Tensor, pos: Tensor, real_mask: Tensor):
        padding_mask = atoms.eq(0)

The code above suggests that the dataset needs features such as atoms / tags / pos. Right?

I could only find preprocessing code for the oc20 dataset, in Graphormer/graphormer/tasks/is2re.py.

mavisguan commented 2 years ago

Yes, if you want to train a Graphormer 3D model, you'll definitely need 3D position information. Atom or tag information is optional, but including atom types is recommended if you expect good results. You can use preprocessing code like that in graphormer/tasks/is2re.py.
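
For readers who want a starting point, here is a minimal sketch of how variable-size molecules might be padded into the atoms / tags / pos / real_mask fields named in the forward signature quoted above. It is not the preprocessing in graphormer/tasks/is2re.py. The function name collate_molecules, the constant tag value, and the meaning given to real_mask (True for every non-padding atom) are all assumptions made for illustration. The only detail taken from the thread is that index 0 in atoms marks padding, as implied by `padding_mask = atoms.eq(0)`. Raw atomic numbers are used as atom indices purely for readability; the model's embedding table may expect a remapped vocabulary.

```python
from typing import Dict, List, Sequence, Tuple

# One molecule: atomic numbers plus one (x, y, z) coordinate per atom.
Molecule = Tuple[Sequence[int], Sequence[Tuple[float, float, float]]]


def collate_molecules(molecules: List[Molecule]) -> Dict[str, list]:
    """Pad molecules to a common size (hypothetical helper, not from the repo).

    Index 0 in "atoms" is reserved for padding, matching
    `padding_mask = atoms.eq(0)` in the quoted forward method.
    """
    max_atoms = max(len(numbers) for numbers, _ in molecules)
    batch: Dict[str, list] = {"atoms": [], "tags": [], "pos": [], "real_mask": []}
    for atomic_numbers, coords in molecules:
        if len(atomic_numbers) != len(coords):
            raise ValueError("need exactly one 3D coordinate per atom")
        n_pad = max_atoms - len(atomic_numbers)
        batch["atoms"].append(list(atomic_numbers) + [0] * n_pad)
        # Assumption: molecules have no OC20-style tags, so use a constant 0.
        batch["tags"].append([0] * max_atoms)
        padded_pos = [list(xyz) for xyz in coords]
        padded_pos += [[0.0, 0.0, 0.0] for _ in range(n_pad)]
        batch["pos"].append(padded_pos)
        # Assumption: every non-padding atom counts as "real".
        batch["real_mask"].append([True] * len(atomic_numbers) + [False] * n_pad)
    return batch


# Toy inputs with approximate geometries, for illustration only.
water = ([8, 1, 1], [(0.0, 0.0, 0.117), (0.0, 0.757, -0.469), (0.0, -0.757, -0.469)])
h2 = ([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)])
batch = collate_molecules([water, h2])
print(batch["atoms"])  # [[8, 1, 1], [1, 1, 0]]
```

Each field can then be wrapped with torch.tensor(...) before being passed to the model; whether the resulting dtypes and index ranges match what graphormer_3d.py expects should be checked against the repository code.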