Training details of basic GearNet on Fold prediction task

DeepGraphLearning / GearNet

GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)

MIT License


Training details of basic GearNet on Fold prediction task #31

Status: Closed (YanjingLiLi closed this issue 1 year ago)

YanjingLiLi commented 1 year ago

Hi, could you share the training details for the basic GearNet (without IEConv) on the Fold prediction task? The results in your paper are on page 8: 28.4, 42.6, and 95.3 for test_fold, test_superfamily, and test_family, respectively.

Is this the right config: https://github.com/DeepGraphLearning/GearNet/blob/main/config/downstream/Fold3D/gearnet.yaml? I couldn't find the basic GearNet model architecture in this repo.

Thanks!

Oxer11 commented 1 year ago

Hi, you're right. This is the config for running GearNet on Fold3D. In this config, we turn off the use_ieconv option to use a vanilla GearNet.

YanjingLiLi commented 1 year ago

Thanks! Did you end up using an additional linear layer to map the graph representation output by the model to the 1195 classes (num_classes)?

Oxer11 commented 1 year ago

Yes, I use a 3-layer MLP for prediction as defined here. You can find the implementation in torchdrug.tasks.PropertyPrediction.
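
As a rough illustration of the prediction head discussed above, the sketch below lists the linear-layer shapes of a 3-layer MLP that maps a graph representation to 1195 classes. It is not taken from the repository: the helper names (mlp_head_shapes, count_parameters) are hypothetical, the input size 3072 is a made-up example value, and the choice to keep every hidden layer as wide as the input is an assumption about how the head is built, so check torchdrug.tasks.PropertyPrediction for the actual definition.

```python
def mlp_head_shapes(input_dim, num_classes, num_mlp_layer=3):
    """Return (in_features, out_features) for each linear layer of the head."""
    # Assumption: every hidden layer keeps the width of the graph representation.
    dims = [input_dim] * num_mlp_layer + [num_classes]
    return list(zip(dims[:-1], dims[1:]))


def count_parameters(shapes):
    """Count weights plus biases for a stack of linear layers."""
    return sum(i * o + o for i, o in shapes)


if __name__ == "__main__":
    # 3072 is a hypothetical representation size, used only for illustration.
    shapes = mlp_head_shapes(input_dim=3072, num_classes=1195)
    print(shapes)
    print(count_parameters(shapes))
```

Only the last layer depends on num_classes, so under these assumptions the same head layout could be reused for a task with a different number of classes by changing that one argument.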

YanjingLiLi commented 1 year ago

I want to reproduce your work by running python script/downstream.py -c config/downstream/Fold3D/gearnet.yaml --gpus [0], but it fails with this error:

  File "script/downstream.py", line 17, in <module>
    from gearnet import dataset, model
  File "/lustre07/scratch/liusheng/GearNet/GearNet_test/GearNet/gearnet/dataset.py", line 17, in <module>
    class Fold3D(data.ProteinDataset):
AttributeError: module 'torchdrug.data' has no attribute 'ProteinDataset'

I believe I have torchdrug installed.

Oxer11 commented 1 year ago

Hi, could you tell me which version of torchdrug you have installed? For Fold3D, you need to install torchdrug from source.
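
One way to answer the version question and confirm the symptom from the traceback is a small check like the one below. It is a minimal sketch using only the Python standard library; describe_install is a hypothetical helper name, and the package, module, and attribute names are the ones that appear in the traceback above.

```python
import importlib
from importlib import metadata


def describe_install(package, submodule, attribute):
    """Report the installed version of `package` and whether `submodule` exposes `attribute`."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # No installed distribution metadata was found under this name.
        version = None
    try:
        module = importlib.import_module(submodule)
    except ImportError:
        return {"version": version, "importable": False, "has_attribute": False}
    return {
        "version": version,
        "importable": True,
        "has_attribute": hasattr(module, attribute),
    }


if __name__ == "__main__":
    # Names taken from the traceback in this thread.
    print(describe_install("torchdrug", "torchdrug.data", "ProteinDataset"))
```

If has_attribute comes back False while importable is True, the installed torchdrug most likely predates the protein datasets, which matches the advice to install from source.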

YanjingLiLi commented 1 year ago

Got it. I cloned the torchdrug repo with git clone.

By the way, how did you train GearNet_IEConv with hidden sizes [512, 512, 512, 512, 512, 512], as described in your config? I always run into a "CUDA out of memory" error.

Oxer11 commented 1 year ago

I use a 40GB A100 for training. If you don't have enough GPU memory, consider a smaller batch size or a smaller model.
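
The "smaller batch size" suggestion can be automated by halving the batch size until one training step fits. The sketch below is only an illustration: find_fitting_batch_size and fake_step are hypothetical names, fake_step simulates a GPU that overflows above three graphs per batch, and the starting value 8 is arbitrary rather than the value from the config. It relies on the fact that PyTorch reports CUDA out-of-memory failures as a RuntimeError whose message contains "out of memory".

```python
def find_fitting_batch_size(run_step, start_batch_size, min_batch_size=1):
    """Halve the batch size until `run_step(batch_size)` no longer runs out of memory."""
    batch_size = start_batch_size
    while batch_size >= min_batch_size:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Re-raise anything that is not an out-of-memory failure.
            if "out of memory" not in str(err).lower():
                raise
            batch_size //= 2
    raise RuntimeError("no batch size >= %d fits in memory" % min_batch_size)


def fake_step(batch_size, capacity=3):
    """Stand-in for one training step: batches above `capacity` simulate an OOM."""
    if batch_size > capacity:
        raise RuntimeError("CUDA out of memory (simulated)")


if __name__ == "__main__":
    print(find_fitting_batch_size(fake_step, start_batch_size=8))  # prints 2
```

In a real run, run_step would execute one forward and backward pass on a batch of that size, and it is usually worth freeing cached GPU memory between attempts. A smaller batch can also change the results, so the learning rate may need retuning.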