Open miquelduranfrigola opened 5 months ago
Hi Miquel,
The fastest way to work around the seed issue is to slightly modify your molecule to include a suitable seed. The "extra" atoms could then be manually deleted after generating analogues.
To illustrate this, the following molecules are slight modifications of your molecule that would enable a seed to be found:
CC1(C(C2=CC3=C(C=CN=C3)C=C2)=C)CC=C4C=C5C(O)C(O)C(N(CCC)C)CC56CCC4(O6)C1
CC12CC=C3C=C4C(OCC)C(O)C(N(C)C)CC45CCC3(O5)C1CC=C2C6=CC7=C(C=CN=C7)C=C6
CC12CC=C3C=C4C(O)C(O)C(N(C)C)CC45CCC3(O5)C1CC=C2C6=CC7=C(C(CCC)=CN=C7)C=C6
However, there is a bigger issue with your molecule: the large 19-membered ring structure is not in SQUID's fragment library. Because the encoder concatenates the fragment embeddings to the atom embeddings, SQUID won't be able to encode this molecule directly. You could try to work around this by encoding only the shape point cloud and sampling the atom embeddings from the variational prior (e.g., lambda = 1.0), so that the model loses all information about the excluded fragment. Note that I have never attempted this idea, so it would be a bit experimental.
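To make the lambda idea concrete, here is a minimal sketch of what "sampling from the prior with lambda = 1.0" means. It assumes a simple linear interpolation between a posterior sample and a standard-normal prior sample. The function name `interpolate_to_prior` and the plain-list latent vectors are hypothetical and purely illustrative; this is not SQUID's actual code or API, and the repository may implement the interpolation differently.

```python
import random


def interpolate_to_prior(z_posterior, lam, rng):
    """Blend a posterior latent sample toward a standard-normal prior sample.

    Hypothetical helper for illustration only (not part of SQUID).
    lam = 0.0 returns the posterior sample unchanged;
    lam = 1.0 returns a pure prior sample, so nothing about the
    encoded atoms/fragments survives in the result.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    # Assumption: the prior is an isotropic standard normal.
    z_prior = [rng.gauss(0.0, 1.0) for _ in z_posterior]
    z = [(1.0 - lam) * zp + lam * zr for zp, zr in zip(z_posterior, z_prior)]
    return z, z_prior


if __name__ == "__main__":
    rng = random.Random(0)
    z_post = [5.0, -3.0, 2.0]  # stand-in for encoded atom embeddings
    z, z_prior = interpolate_to_prior(z_post, lam=1.0, rng=rng)
    # With lam = 1.0 the output is exactly the prior sample.
    print(z == z_prior)
```

With lambda = 1.0 the posterior term is multiplied by zero, so the decoder would see only the shape encoding plus prior noise. That is why this setting could, in principle, sidestep the missing fragment.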
Hello @keiradams, this is extremely useful. Thank you so much. I'll give it a try. Congrats again on a great tool.
Hi,
Thanks for a great repository.
While trying to run the function `get_starting_seeds` on the molecule
CC12CC=C3C=C4C(C(C(CC45CCC3(C1CC=C2C6=CC7=C(C=C6)C=CN=C7)O5)N(C)C)O)O
I am unable to get any seeds. Is there any workaround? I would really like to apply SQUID to my query molecule, but I can't because no seeds are found.
Thanks a lot in advance!