igashov / DiffLinker

DiffLinker: Equivariant 3D-Conditional Diffusion Model for Molecular Linker Design
MIT License

Larger segments as input #13

Open ivandon15 opened 1 month ago

ivandon15 commented 1 month ago

Hi DiffLinker Team,

Thank you for your great work on this. I was wondering whether the model can sample linkers for larger molecules (for example, connecting two 10-mer peptides). I tried the existing model (I know it isn't appropriate; I just wanted to check whether it would run without error), and it raised a NanError during sample_p_zs_given_zt_only_linker. Is this because the input contains too many atoms?

Thank you for your patience and help!

ivandon15 commented 1 month ago

I might have found the solution. The DDPM class calls the GCL class, which uses the unsorted_segment_sum function to aggregate node features. This function takes a normalization_factor, whose default value is 100, likely because the original model was applied to small molecules (is that right?). When I use peptides as input, however, the large number of atoms makes the values explode in the second diffusion step. When I set normalization_factor to a larger integer, the model no longer produced NaN errors. Is this the right fix?
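
To illustrate the scaling argument above, here is a minimal pure-Python sketch, not the repository's code. It assumes that the aggregation sums incoming messages per node and then divides by a fixed normalization_factor, and that the graph is fully connected; both are assumptions made for illustration. The names segment_sum and aggregated_magnitude and the atom counts 30 and 300 are hypothetical.

```python
def segment_sum(messages, segment_ids, num_segments, normalization_factor=100.0):
    """Sum scalar messages per target node, then divide by a fixed constant.

    Assumed behaviour for illustration only; the real function works on tensors.
    """
    totals = [0.0] * num_segments
    for value, node in zip(messages, segment_ids):
        totals[node] += value
    return [total / normalization_factor for total in totals]


def aggregated_magnitude(num_atoms, normalization_factor=100.0):
    """Aggregate unit-sized messages on a fully connected graph (no self-loops).

    Returns the aggregated value at node 0; every node gets the same value.
    """
    # One message per ordered pair (i, j) with i != j, delivered to node i.
    targets = [i for i in range(num_atoms) for j in range(num_atoms) if i != j]
    messages = [1.0] * len(targets)
    return segment_sum(messages, targets, num_atoms, normalization_factor)[0]


small = aggregated_magnitude(30)    # small-molecule scale (hypothetical size)
large = aggregated_magnitude(300)   # peptide scale (hypothetical size)
rescaled = aggregated_magnitude(300, normalization_factor=1000.0)

print(small, large, rescaled)
```

With a fixed divisor, the aggregated value grows linearly with the number of neighbours, so a tenfold larger input gives a roughly tenfold larger update in every layer. Raising normalization_factor in proportion brings the magnitude back to the small-molecule range, which is consistent with the NaN errors disappearing. Whether a model trained with the default factor still produces sensible linkers after this change is a separate question that this sketch does not address.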