arneschneuing / DiffSBDD

A Euclidean diffusion model for structure-based drug design.
MIT License
339 stars 74 forks source link

Error occurs at Line 146 of get_edges() #28

Open mathfirst opened 1 year ago

mathfirst commented 1 year ago

When I use your full_atom.ckpt to sample for the protein pockets in the test set of CrossDocked dataset, the following error occured.

File "DiffSBDD/equivariant_diffusion/dynamics.py", line 146, in get_edges
    edges = torch.stack(torch.where(adj), dim=0)
RuntimeError: nonzero is not supported for tensors with more than INT_MAX elements,   file a support request

Could you please fix it? Is the checkpoint for CrossDocked dataset available? Hopefully, it is trained on its training set.

mathfirst commented 1 year ago

I tried to solve this problem with radius_graph which is much faster than your original get_edges, but the subsequent sampling result showed that radius graph may not be consistent with the result obtained by your original code.

from torch_cluster import radius_graph
arneschneuing commented 1 year ago

Thanks for this suggestion! We initially tried to minimize dependencies but if radius_graph is much faster, we should consider using it in the future. Note that we don't remove self-loops in get_edges. If you set loop=True in radius_graph, the behavior should be more similar to the original function.

mathfirst commented 1 year ago

Yes, I was aware of loop=True. The problem is when I used radius_graph, another error occured as follows.

ERROR: Element '0 F' not found
ERROR: moving to the beginning of the next molecule
File: DiffSBDD/analysis/molecule_builder.py", line 168, in process_molecule
mol = Chem.Mol(rdmol)
Boost.Python.ArgumentError: Python argument types in
    Mol.__init__(Mol, NoneType)
did not match C++ signature:
    __init__(_object*, RDKit::ROMol mol, bool quickCopy=False, int confId=-1)
    __init__(_object*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > pklString, unsigned int propertyFlags)
    __init__(_object*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > pklString)
    __init__(_object*)

So I had to give up radius_graph. Finally, I opt for np.where as a bypass.

adj_np = np.where(adj.cpu())
edges = torch.stack((torch.from_numpy(adj_np[0]).to(adj.device), torch.from_numpy(adj_np[1]).to(adj.device)), dim=0)

The downside is that np.where is very slow for a large problem.

arneschneuing commented 1 year ago

The new error appears to be unrelated to the old one. Are you sure it was caused by radius_graph?

mathfirst commented 1 year ago

Honestly, your code is very hard to run, because it needs a lot of memory. Even though I found a great GPU with large memory and trained your code for about 6 days, the sampling results are much worse than your first draft. So, not mention to the second manuscript. I am sorry to say that your reported results are not that convincing. Could you please make your code easy to run such that other researchers can test your results. By the way, I ran your code on CrossDocked dataset with full condition.