DeepGraphLearning / GearNet

GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)
MIT License

ValueError: Unknown value `CHI_SQUAREPLANAR`. Available vocabulary is `range(0, 4)` #5

Closed yyou1996 closed 1 year ago

yyou1996 commented 1 year ago
15:52:05   Config file: ./config/downstream/GO-BP/gearnet_yy.yaml
15:52:05   {'dataset': {'branch': 'BP',
             'class': 'GeneOntology',
             'path': '/scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/',
             'test_cutoff': 0.95,
             'transform': {'class': 'ProteinView', 'view': 'residue'}},
 'engine': {'batch_size': 2, 'gpus': [0], 'log_interval': 1000},
 'metric': 'f1_max',
 'optimizer': {'class': 'AdamW', 'lr': 0.0001, 'weight_decay': 0},
 'output_dir': '/scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein_output/downstream/GO-BP',
 'task': {'class': 'MultipleBinaryClassification',
          'criterion': 'bce',
          'graph_construction_model': {'class': 'GraphConstruction',
                                       'edge_feature': 'gearnet',
                                       'edge_layers': [{'class': 'SequentialEdge',
                                                        'max_distance': 2},
                                                       {'class': 'SpatialEdge',
                                                        'min_distance': 5,
                                                        'radius': 10.0},
                                                       {'class': 'KNNEdge',
                                                        'k': 10,
                                                        'min_distance': 5}],
                                       'node_layers': [{'class': 'AlphaCarbonNode'}]},
          'metric': ['auprc@micro', 'f1_max'],
          'model': {'batch_norm': True,
                    'class': 'GearNet',
                    'concat_hidden': True,
                    'hidden_dims': [512, 512, 512, 512, 512, 512],
                    'input_dim': 21,
                    'num_relation': 7,
                    'readout': 'sum',
                    'short_cut': True},
          'num_mlp_layer': 3},
 'train': {'num_epoch': 200}}
15:52:05   Downloading https://zenodo.org/record/6622158/files/GeneOntology.zip to /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology.zip
15:53:38   Extracting /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology.zip to /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO
15:53:41   Extracting /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology/train.zip to /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology
15:56:21   Extracting /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology/valid.zip to /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology
15:56:37   Extracting /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology/test.zip to /scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/protein-datasets/downstream/GO/GeneOntology

Constructing proteins from pdbs:   0%|          | 0/36635 [00:00<?, ?it/s]/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py:213: UserWarning: Unknown residue `HOH`. Treat as glycine
  warnings.warn("Unknown residue `%s`. Treat as glycine" % type)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `HOH`
  warnings.warn("Unknown value `%s`" % x)
[15:56:55] Explicit valence for atom # 6 O, 3, is greater than permitted
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py:213: UserWarning: Unknown residue `BIS`. Treat as glycine
  warnings.warn("Unknown residue `%s`. Treat as glycine" % type)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `BIS`
  warnings.warn("Unknown value `%s`" % x)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py:213: UserWarning: Unknown residue `EPE`. Treat as glycine
  warnings.warn("Unknown residue `%s`. Treat as glycine" % type)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `EPE`
  warnings.warn("Unknown value `%s`" % x)

Constructing proteins from pdbs:   0%|          | 3/36635 [00:00<54:08, 11.28it/s]/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py:213: UserWarning: Unknown residue `SO4`. Treat as glycine
  warnings.warn("Unknown residue `%s`. Treat as glycine" % type)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `SO4`
  warnings.warn("Unknown value `%s`" % x)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py:213: UserWarning: Unknown residue `PO4`. Treat as glycine
  warnings.warn("Unknown residue `%s`. Treat as glycine" % type)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `PO4`
  warnings.warn("Unknown value `%s`" % x)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py:213: UserWarning: Unknown residue `BME`. Treat as glycine
  warnings.warn("Unknown residue `%s`. Treat as glycine" % type)
/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `BME`
  warnings.warn("Unknown value `%s`" % x)

Constructing proteins from pdbs:   0%|          | 5/36635 [00:00<1:06:38,  9.16it/s]/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py:42: UserWarning: Unknown value `Fe`
  warnings.warn("Unknown value `%s`" % x)

Constructing proteins from pdbs:   0%|          | 5/36635 [00:00<1:10:20,  8.68it/s]
Traceback (most recent call last):
  File "/scratch/user/yuning.you/project/protein_cross_modal_pretraining/ProteinRepresentation/GearNet/script/downstream.py", line 56, in <module>
    dataset = core.Configurable.load_config_dict(cfg.dataset)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/core/core.py", line 269, in load_config_dict
    return cls(**new_config)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/core/core.py", line 288, in wrapper
    return init(self, *args, **kwargs)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/datasets/gene_ontology.py", line 72, in __init__
    self.load_pdbs(pdb_files, verbose=verbose, **kwargs)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/dataset.py", line 750, in load_pdbs
    protein = data.Protein.from_molecule(mol, **kwargs)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/utils/decorator.py", line 192, in wrapper
    return obj(*args, **kwargs)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/protein.py", line 185, in from_molecule
    protein = Molecule.from_molecule(mol, atom_feature=atom_feature, bond_feature=bond_feature,
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/utils/decorator.py", line 192, in wrapper
    return obj(*args, **kwargs)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/molecule.py", line 189, in from_molecule
    feature += func(atom)
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py", line 77, in atom_default
    onehot(atom.GetChiralTag(), chiral_tag_vocab) + \
  File "/scratch/user/yuning.you/.conda/envs/protein/lib/python3.9/site-packages/torchdrug/data/feature.py", line 47, in onehot
    raise ValueError("Unknown value `%s`. Available vocabulary is `%s`" % (x, vocab))
ValueError: Unknown value `CHI_SQUAREPLANAR`. Available vocabulary is `range(0, 4)`

Dear developers,

Thanks for your great work. When I try a quick fine-tuning run via python script/downstream.py -c ./config/downstream/EC/gearnet.yaml --gpus [0], the error above is raised during dataset loading, before model training starts (for both EC and GO-BP). I would appreciate your help in resolving it.

Oxer11 commented 1 year ago

Hi! Thanks for reporting this bug and sorry for the inconvenience.

The bug is triggered while processing the atom features of proteins in the dataset. These features are not used by the model and should be disabled. I've fixed this problem in 09a4b92.
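For anyone who cannot pull the fix right away, here is a rough sketch of what "disabling" could look like on the config side. It assumes that extra dataset keys are forwarded through load_pdbs(**kwargs) to Protein.from_molecule (the traceback shows atom_feature and bond_feature as keyword arguments there) and that None turns featurization off. Whether 09a4b92 does exactly this is not stated in this thread, and disable_atom_bond_features is a hypothetical helper, not part of TorchDrug or GearNet.

```python
# Dataset section of the config, mirroring the dict printed in the log above.
dataset_cfg = {
    "class": "GeneOntology",
    "path": "protein-datasets/downstream/GO/",  # hypothetical shortened path
    "branch": "BP",
    "test_cutoff": 0.95,
    "transform": {"class": "ProteinView", "view": "residue"},
}


def disable_atom_bond_features(cfg):
    """Return a copy of cfg that asks the loader to skip atom/bond features.

    Assumption: the dataset class forwards these keys to
    Protein.from_molecule(atom_feature=..., bond_feature=...) and treats
    None as "compute no features".
    """
    patched = dict(cfg)  # shallow copy, leaves the original config untouched
    patched["atom_feature"] = None
    patched["bond_feature"] = None
    return patched


patched_cfg = disable_atom_bond_features(dataset_cfg)
print(patched_cfg["atom_feature"], patched_cfg["bond_feature"])
```

In a YAML config the same idea would presumably be two extra keys under dataset set to null.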

In essence, the problem is that TorchDrug's chiral tag vocabulary (range(0, 4) in the traceback) does not include the chiral tag CHI_SQUAREPLANAR.
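To make the failure mode concrete, here is a minimal, self-contained sketch of a strict one-hot lookup against a four-entry vocabulary. It is a simplified re-implementation for illustration, not TorchDrug's actual onehot. ChiralTag is a stand-in for RDKit's chiral tag enum, and the numeric value given to CHI_SQUAREPLANAR is an assumption; all that matters is that it falls outside range(0, 4).

```python
from enum import IntEnum


class ChiralTag(IntEnum):
    """Stand-in for RDKit's chiral tag enum (values are assumptions)."""
    CHI_UNSPECIFIED = 0
    CHI_TETRAHEDRAL_CW = 1
    CHI_TETRAHEDRAL_CCW = 2
    CHI_OTHER = 3
    CHI_SQUAREPLANAR = 6  # assumed value; only needs to be >= 4 here


# Vocabulary as reported in the error message.
chiral_tag_vocab = range(0, 4)


def onehot(x, vocab):
    """Strict one-hot encoding: raise if x is not in the vocabulary."""
    value = int(x)
    if value not in vocab:
        raise ValueError(
            "Unknown value `%s`. Available vocabulary is `%s`" % (x.name, vocab)
        )
    feature = [0] * len(vocab)
    feature[vocab.index(value)] = 1
    return feature


# A tag inside the vocabulary encodes normally.
print(onehot(ChiralTag.CHI_TETRAHEDRAL_CW, chiral_tag_vocab))

# A tag outside the vocabulary reproduces the error from the traceback.
try:
    onehot(ChiralTag.CHI_SQUAREPLANAR, chiral_tag_vocab)
except ValueError as err:
    print(err)
```

Since the lookup is strict, a single atom with an unlisted tag aborts dataset construction, which matches the progress bar stopping at 5/36635 in the log.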