IDEA-XL / InstructMol

InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery
https://idea-xl.github.io/InstructMol/
Apache License 2.0

Issue with OGB Version and AtomEncoder in MoleculeSTM checkpoint Compatibility #3

Closed Syzseisus closed 1 week ago

Syzseisus commented 3 months ago

Dear authors,

Thanks for the exciting work. We have identified a potential inconsistency between the OGB version specified in requirements.txt and the provided pre-trained weights.

Details:

  1. The requirements.txt file specifies ogb==1.3.6.
  2. You used ogb.graphproppred.mol_encoder.AtomEncoder, which calls the ogb.utils.features.get_atom_feature_dims function to build full_atom_feature_dims, which in turn determines the sizes of the embeddings in AtomEncoder's atom_embedding_list.
  3. The get_atom_feature_dims function relies on the ogb.utils.features.allowable_features variable.
  4. In ogb==1.3.5, allowable_features["possible_chirality_list"] includes the following four values:
    • "CHI_UNSPECIFIED"
    • "CHI_TETRAHEDRAL_CW"
    • "CHI_TETRAHEDRAL_CCW"
    • "CHI_OTHER"
  5. In ogb==1.3.6, an additional value, "misc", is included, making it five values.
  6. The shape of the weights provided in your README is [4, 300], suggesting that they were generated using ogb==1.3.5.
  7. But when AtomEncoder is defined under ogb==1.3.6, as requirements.txt specifies, the shape of the weights becomes [5, 300].
  8. Therefore, there seems to be a discrepancy where either the requirements.txt file is incorrect, or the checkpoint provided is based on an older version of OGB. (please refer to the attached picture below.)

image
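
To make the arithmetic in the list above concrete, here is a minimal sketch of how the number of chirality values determines the number of rows in the corresponding embedding table. The two lists are copied from this issue rather than read from an installed ogb, and `chirality_embedding_shape` is a hypothetical helper name used only for illustration.

```python
# Chirality vocabularies as quoted in this issue (not read from an installed ogb).
CHIRALITY_OGB_1_3_5 = [
    "CHI_UNSPECIFIED",
    "CHI_TETRAHEDRAL_CW",
    "CHI_TETRAHEDRAL_CCW",
    "CHI_OTHER",
]
CHIRALITY_OGB_1_3_6 = CHIRALITY_OGB_1_3_5 + ["misc"]

EMB_DIM = 300  # embedding width observed in the checkpoint weights


def chirality_embedding_shape(chirality_list, emb_dim=EMB_DIM):
    """Shape of an embedding table with one row per allowed chirality value."""
    return (len(chirality_list), emb_dim)


print(chirality_embedding_shape(CHIRALITY_OGB_1_3_5))  # (4, 300)
print(chirality_embedding_shape(CHIRALITY_OGB_1_3_6))  # (5, 300)
```

A checkpoint saved with the four-value list therefore cannot be loaded into an encoder built from the five-value list without a shape mismatch on that one table.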

Request:

Could you please confirm and provide the correct information regarding the OGB version and the corresponding checkpoint? This clarification would greatly assist in ensuring the proper functioning of the model with the appropriate OGB version.

Thank you for your attention to this matter. Looking forward to your response.

Best regards,

Syzseisus

P.S. I'm really looking forward to issue #1.

CiaoHe commented 1 week ago

Sorry for the confusion. We do indeed use ogb==1.3.6; the discrepancy between the ogb versions is annoying (the original MoleculeSTM repo uses a very old version, 1.2.0). So, one way to solve this:

  1. Install ogb==1.3.6.
  2. Go to xxxx/envs/ENV-NAME/lib/python3.9/site-packages/ogb/utils/features.py and manually edit 'possible_chirality_list' so that the 'misc' entry is commented out:
    'possible_chirality_list' : [
        'CHI_UNSPECIFIED',
        'CHI_TETRAHEDRAL_CW',
        'CHI_TETRAHEDRAL_CCW',
        'CHI_OTHER',
        # 'misc'  # just comment this line out
    ], 
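
As an alternative to editing the installed package by hand, the same change could probably be applied at runtime. The sketch below is an assumption and not something confirmed in this thread: `drop_misc_chirality` and `patch_ogb` are hypothetical helper names, and it assumes that `ogb.graphproppred.mol_encoder` computes its feature dimensions when it is first imported, so the patch would have to run before that import.

```python
def drop_misc_chirality(allowable_features):
    """Remove 'misc' from possible_chirality_list in place.

    Returns the resulting number of chirality values.
    """
    chirality = allowable_features["possible_chirality_list"]
    if "misc" in chirality:
        chirality.remove("misc")
    return len(chirality)


def patch_ogb():
    """Apply the edit to an installed ogb (assumed layout: ogb.utils.features).

    Hypothetical helper: call it before importing
    ogb.graphproppred.mol_encoder, otherwise the encoder may already have
    been sized with the five-value list.
    """
    from ogb.utils import features  # imported lazily so this file loads without ogb

    return drop_misc_chirality(features.allowable_features)
```

If this works as assumed, calling `patch_ogb()` at the top of the entry script would leave four chirality values, matching the [4, 300] weights. The helper is idempotent, so calling it twice is harmless.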

Syzseisus commented 1 week ago

Thanks :)