igashov / DiffLinker

DiffLinker: Equivariant 3D-Conditional Diffusion Model for Molecular Linker Design
MIT License

cannot repeat the process of fragment generation #7

Closed MachineGUN001 closed 6 months ago

MachineGUN001 commented 1 year ago

Hi,

Thank you for providing such an interesting and powerful tool for generating linkers.

While trying to reproduce your results, I ran into an error.

Below is the command line I used to run the model geom_difflinker_given_anchors.ckpt:

!python -W ignore generate_with_protein.py --fragments fragments/frags_3hz1.sdf \
--protein proteins/pro_3hz1_protein.pdb --model models/geom_difflinker_given_anchors.ckpt \
--linker_size 3 --anchors 14,11

The anchors were set as you described in other issues (screenshot: Snipaste_2023-08-02_15-32-24), but I get this error:

Will generate linkers with 3 atoms
[15:28:10] Can't kekulize mol.  Unkekulized atoms: 0 1 2 3 4 5 6 7 8

Could you please suggest how to fix this and make it work? Many thanks,

Sh-Y

FeilongWuHaa commented 1 year ago

It seems to be a problem with your SDF file (frags_3hz1.sdf). You should not export the SDF of small molecules directly from PyMOL, because SDF files exported by PyMOL often contain errors. Please use Open Babel (obabel) or Schrödinger to export the SDF of small molecules.

MachineGUN001 commented 1 year ago

Hi FeilongWuHaa,

Thank you so much for explaining the issue. It works after processing the molecule with Schrödinger Maestro.

However, I tried other molecules, following the same process and command to generate the fragment.sdf. When running the generate_with_pocket.py script, the following error occurred:

Will generate linkers with 10 atoms
Traceback (most recent call last):
  File "generate_with_pocket.py", line 289, in <module>
    main(
  File "generate_with_pocket.py", line 197, in main
    pocket_one_hot.append(get_one_hot(atom_type, const.GEOM_ATOM2IDX))
  File "d:\Cheminfo_Workshop\4_Fragment_Scaffold_Evolution\DiffLinker\src\datasets.py", line 24, in get_one_hot
    one_hot[atoms_dict[atom]] = 1
KeyError: 'H'

The command I used is below:

!python -W ignore generate_with_pocket.py \
    --fragments ./examples/fragments_test.sdf \
    --pocket ./examples/pocket_protein_a.pdb \
    --model models/pockets_difflinker_full.ckpt \
    --output ./examples/With_protein_pocket_1 \
    --linker_size 10 \
    --anchors 14,9 \
    --n_samples 20

I also tried other anchors for different attachment points, but the same error occurred: KeyError: 'H'.

Could you please take a look and suggest how to fix it? Many thanks,

Sh-Y

MachineGUN001 commented 1 year ago

@FeilongWuHaa, I really appreciate your help.

"It seems to be a problem with your sdf file(frags_3hz1.sdf), you should not export the sdf of small molecules directly from pymol, because the sdf exported by pymol often has errors, please use obabel or scorderinger to export the sdf of small molecules"

I tried different molecules and used both Schrödinger and Open Babel to process them, but it doesn't work every time. Some of the molecules, after being cut into fragments and saved as fragments.sdf, produce errors similar to the following when running the script:

Will generate linkers with 10 atoms
Traceback (most recent call last):
  File "generate_with_pocket.py", line 289, in <module>
    main(
  File "generate_with_pocket.py", line 197, in main
    pocket_one_hot.append(get_one_hot(atom_type, const.GEOM_ATOM2IDX))
  File "d:\Cheminfo_Workshop\4_Fragment_Scaffold_Evolution\DiffLinker\src\datasets.py", line 24, in get_one_hot
    one_hot[atoms_dict[atom]] = 1
KeyError: 'H'

@igashov @HannesStark Also, do you have any good suggestions for extracting ligands from proteins and for cleaving them into fragments? Many thanks,

Sh-Y

caiyingchun commented 10 months ago

I hit the same error:

Traceback (most recent call last):
  File "generate_with_pocket.py", line 301, in <module>
    max_batch_size=args.max_batch_size,
  File "generate_with_pocket.py", line 199, in main
    pocket_one_hot.append(get_one_hot(atom_type, const.GEOM_ATOM2IDX))
  File "/home/data/yingchun/mol_gen/DiffLinker/src/datasets.py", line 24, in get_one_hot
    one_hot[atoms_dict[atom]] = 1
KeyError: 'H'

I checked the code and found that this happens because 'H' is not defined in GEOM_ATOM2IDX at line 29 of src/const.py: GEOM_ATOM2IDX = {'C': 0, 'O': 1, 'N': 2, 'F': 3, 'S': 4, 'Cl': 5, 'Br': 6, 'I': 7, 'P': 8}. But I don't know how to fix it.

caiyingchun commented 10 months ago

Just adding 'H' to the dict may not fix it.
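
A minimal sketch of why the lookup fails and why extending the dict is risky. The dictionary is the one quoted above from src/const.py. The get_one_hot below is a stand-in: only its last line appears in the traceback, and sizing the vector by the dictionary length is an assumption here, not confirmed DiffLinker code. Under that assumption, adding 'H' widens every feature vector from 9 to 10 entries, which a checkpoint trained on 9 atom types would presumably not accept.

```python
# Atom vocabulary as quoted in this thread from src/const.py.
GEOM_ATOM2IDX = {'C': 0, 'O': 1, 'N': 2, 'F': 3, 'S': 4,
                 'Cl': 5, 'Br': 6, 'I': 7, 'P': 8}


def get_one_hot(atom, atoms_dict):
    """Stand-in encoder; vector length = len(atoms_dict) is an assumption."""
    one_hot = [0] * len(atoms_dict)
    one_hot[atoms_dict[atom]] = 1  # the line shown in the traceback
    return one_hot


# Hydrogen is missing from the vocabulary, so the lookup raises KeyError.
try:
    get_one_hot('H', GEOM_ATOM2IDX)
    raised = False
except KeyError:
    raised = True

# Hypothetical patch: append 'H' as a tenth atom type.
patched = dict(GEOM_ATOM2IDX, H=len(GEOM_ATOM2IDX))

original_width = len(get_one_hot('C', GEOM_ATOM2IDX))
patched_width = len(get_one_hot('C', patched))
print(raised, original_width, patched_width)  # True 9 10
```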

caiyingchun commented 10 months ago

I added the option --backbone_atoms_only and it worked. But without --backbone_atoms_only, it still didn't work.
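
One possible workaround for running without --backbone_atoms_only, sketched under assumptions and not tested against DiffLinker: remove hydrogen records from the pocket PDB before passing it to the script, so that no 'H' ever reaches the one-hot lookup. The helper names (element_of, strip_hydrogens, atom_line) are hypothetical, and the sketch relies on the element symbol being present in PDB columns 77-78. The --backbone_atoms_only flag presumably avoids the error for a similar reason, since backbone atoms (N, CA, C, O) include no hydrogens.

```python
def element_of(line):
    """Element symbol of an ATOM/HETATM record (PDB columns 77-78)."""
    return line[76:78].strip().capitalize()


def strip_hydrogens(pdb_text):
    """Return the PDB text without hydrogen (or deuterium) atom records.

    Records that lack the element column are kept unchanged.
    """
    kept = []
    for line in pdb_text.splitlines():
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and element_of(line) in ("H", "D"):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


def atom_line(serial, name, element):
    """Build a fixed-width ATOM record with dummy coordinates (demo only)."""
    return ("ATOM  %5d %-4s ALA A   1    " % (serial, name)
            + "%8.3f%8.3f%8.3f" % (0.0, 0.0, 0.0)
            + "  1.00  0.00" + " " * 10 + "%2s" % element)


atoms = [("N", "N"), ("CA", "C"), ("C", "C"), ("O", "O"), ("H", "H")]
demo = "\n".join(atom_line(i + 1, n, e) for i, (n, e) in enumerate(atoms))
demo += "\nEND\n"

cleaned = strip_hydrogens(demo)
remaining = [element_of(l) for l in cleaned.splitlines()
             if l.startswith("ATOM")]
print(remaining)  # ['N', 'C', 'C', 'O']
```

To use it on a real file, read the pocket PDB as text, write the result of strip_hydrogens to a new file, and pass that file to --pocket.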