Constrained_embed_dualmol function discarded all generated conformers

YunpengCai commented 2 years ago

Thank very much for sharing your great work. I've got a question when reading the codes and hope you can explain the logic behind.

When embed it to the reference pose, constrained_embed_dualmol() function was used to generate an embedding of a dual-topology pseudomolecule to target using MCS. You commented 'Prepare a temporary copy of molecules A and B, but remove original conformers to make sure we don't get one of them in the rms_dict' .

My question is: Why remove all the conformers of molecule A and B and replace them with newly generated conformers?

In my case, the input target is often molecule A itself and has already be aligned in align_ligands function during superimpose step. Therefore, keep the superimposed poses_mol_data would be a better option for me, especially num_conformers=50 used here is less than num_conformers=200 used in align_ligands step, which result in worse alignment with higher volume function values sometimes.

luancarvalhomartins commented 2 years ago

Thank you very much for your message and your interest in PyAutoFEP.

IIRC, I originally did that because I observed that sometimes the input conformation of the A molecule was such that the RDKit conformer generation was not able to obtain matching conformations for B molecule when calling find_mcs_3d at line 502. But later I moved the conformer removal to find_mcs_3d itself. I think it should be safe and a great idea to keep conformations. Furthermore, constrained_embed_shapeselect should test for the equality of the MCS to further speed up the use case you described. I am adding this to a future commit (unfortunately, I am quite busy this week, so I am not sure if I will be able to to that before, say Aug 1st).

Something that related to this situation is when the user supplies molecules which are already core-aligned to each other, so that the input conformations could be used as is. I understand that this is not your case. Anyway, it would be a great addition.

[ ] Keep conformations at merge_topologies.constrained_embed_dualmol:460 and merge_topologies.constrained_embed_dualmol:465.
[ ] Test for MCS equals to whole molecules at merge_topologies.constrained_embed_shapeselect:225, if so, copy coordinates (make sure the handle the hydrogens).
[ ] Add a user option to constrained_embed_dualmol volume_function so that the input conformations are directly copied to the dual molecule.

YunpengCai commented 2 years ago

Thanks very much for your answer. It is much clearer to me now why conformers were removed.

I believe the enhanced new features you plan to add would be beneficial to users who have aligned their poses carefully with their own methods.

BTW: Do you have any plan to improve the alignment method in future? Because some of my test cases still cannot get a good shape match after searching conformers by 200 iterations using constrained_embed_shapeselect. To be more specific, the best_solution = rms_list.index(min(rms_list)) still have a large distance between the ligand and target.

luancarvalhomartins commented 2 years ago

I have also experienced problems with the constrained_embed_shapeselect in a very similar fashion. It seems that sometimes the conformations more similar to the target are just not generated by RDKit conformer generation algorithm. Unfortunately, in the current core version there is not much to do in these cases: even asking for num_conformers won't help.

I do plan to write a new conformer generation, probably a exhaustive one, at some point, but honestly it is something I will not be able to work on in the near future. In case you or someone in your lab is interested in implementing this code, I would love helping. Otherwise, unfortunately, it will take some time.

caiyingchun commented 1 year ago

Thank you very much for your great work. I have some questions for the code.

constrained_embed_dualmol() function will remove the input conformations and generate new conformations for input ligands. We may not need to development a new method to do conformation generation and alignment. If we just want to keep the input conformations, how could we do to comment some code and make this function disfunction.
How much influence will different initial conformation have on the FEP results? We usually use docking conformations as the initial conformations. I noticed that our input docking conformations are much close to the new genertated and aligned conformations. So I wonder how different they will be to the FEP results.

luancarvalhomartins commented 1 year ago

Dear @caiyingchun, Thank you very much for your interest in PyAutoFEP.

Yes, this is totally doable and very useful. It was in my todo list at some point, but I never implemented it. Hope I can add it in the next few days.
I would expect that the effect will depend a lot on the specific ligand, system, and conformations. Most likely the larger difference will be observed when the docking and superposed poses are less similar, perhaps trapped in different local minima. Also, larger perturbations should amplify this effect. I would say that for small perturbations and somewhat rigid molecules the effect are likely to be very small. I am not aware of papers directly comparing wrong and correct binding poses, but take a look at https://pubs.acs.org/doi/10.1021/acsomega.8b00123 and https://www.nature.com/articles/s41586-022-05258-z.

luancarvalhomartins / PyAutoFEP

Constrained_embed_dualmol function discarded all generated conformers #58