awslabs / dgl-lifesci

Python package for graph neural networks in chemistry and biology
Apache License 2.0

Question about pretraining dataset #188

Closed kexinhuang12345 closed 2 years ago

kexinhuang12345 commented 2 years ago

Hi! Thanks for this great package.

Were the "gin_supervised_contextpred" and "gin_supervised_masking" pre-trained models first pre-trained on the ChEMBL dataset in a supervised manner? If so, where can we find the list of bioassays that were used for pre-training? Thank you!

mufeili commented 2 years ago

Hi Kexin,

The ChEMBL dataset can be accessed by downloading "chem data" here. I'm not sure whether it includes the information you need. If not, you may want to contact the authors of the paper directly.

kexinhuang12345 commented 2 years ago

Great, thank you Mufei!