Closed josan82 closed 4 years ago
Are there any other kwargs to consider in the preparation routines? Maybe we can use this PR to add some keywords exposed to the user. Can you link to the ADT docs so I can check? Thanks!
Otherwise, this looks good to me. I have to fix the Travis builds, and once we can test that, we are good to go.
[AD4ReceptorPreparation](http://mgltools.scripps.edu/api/AutoDockTools/AutoDockTools.MoleculePreparation.AD4ReceptorPreparation-class.html) [AD4LigandPreparation](http://mgltools.scripps.edu/api/AutoDockTools/AutoDockTools.Docking-pysrc.html/AutoDockTools.MoleculePreparation.AD4LigandPreparation-class.html)
I think that with these kwargs the molecules will always keep the same structure as the pdb/mol2 files supplied by the user (no repairs and no clean-ups).
Letting the user change these parameters could lead to unwanted effects when combined with prepare_each=False. If prepare_each=True there is no problem, but if it's False then the current implementation of _update_pdbqt_coordinates expects the same atoms in the same order, so that it can update the coordinates without generating the pdbqt from scratch.
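To make that assumption explicit, here is a minimal sketch of a check that could run before any coordinate update. It is not the code in this PR: `pdbqt_atom_names` and `same_atoms_and_order` are hypothetical helpers, and the only thing taken for granted is the fixed-column PDB layout, where the atom name sits in columns 13-16 of ATOM/HETATM records.

```python
def pdbqt_atom_names(pdbqt_text):
    """Atom names of the ATOM/HETATM records, in file order."""
    return [line[12:16].strip()          # columns 13-16 hold the atom name
            for line in pdbqt_text.splitlines()
            if line.startswith(("ATOM", "HETATM"))]


def same_atoms_and_order(molecule_atom_names, pdbqt_text):
    """True only if both sides list the same names in the same order."""
    return list(molecule_atom_names) == pdbqt_atom_names(pdbqt_text)
```

If the check fails, falling back to a full re-preparation (as with prepare_each=True) would be safer than writing coordinates onto the wrong atoms.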
Are those repairs and clean-ups required for the scoring function to work? I don't know if disabling them permanently will cause problems in some cases, like missing hydrogens and so on.
I think we should take a closer look at _update_coordinates... The current implementation is fragile and assumes too many things. The ADT package should contain functions that write the PDBQT correctly from some kind of object, so maybe we should cache that object, update the coordinates inside it, and then pass it to the hypothetical ADT writer? Let me know what you think.
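The cache-and-update idea can be sketched as below. Since it is not clear that ADT exposes an object suited to this, the sketch caches the PDBQT text itself as a stand-in. `CachedPDBQT` and `render` are hypothetical names, and the only format assumption is the standard PDB layout with x, y, z in columns 31-54 as three 8.3f fields.

```python
class CachedPDBQT(object):
    """Keep the PDBQT text generated once and rewrite only x, y, z."""

    def __init__(self, pdbqt_text):
        self.lines = pdbqt_text.splitlines()
        # Remember which rows carry atoms so other records stay untouched
        self.atom_rows = [i for i, line in enumerate(self.lines)
                          if line.startswith(("ATOM", "HETATM"))]

    def render(self, coords):
        """Return the PDBQT text with coordinates replaced, in atom order."""
        coords = list(coords)
        if len(coords) != len(self.atom_rows):
            raise ValueError("Expected %d coordinates, got %d"
                             % (len(self.atom_rows), len(coords)))
        lines = list(self.lines)
        for row, (x, y, z) in zip(self.atom_rows, coords):
            line = lines[row]
            lines[row] = line[:30] + "%8.3f%8.3f%8.3f" % (x, y, z) + line[54:]
        return "\n".join(lines)
```

This still relies on the caller supplying coordinates in the same atom order as the cached file. It only turns a count mismatch into an explicit error instead of a silently wrong score.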
If we set these parameters so that the original molecules are not changed, it could of course lead to bad scoring when the molecules are not correctly prepared. The clean-ups can also change the scoring (not dramatically in the cases I've tested). I think that's why ADT lets you parametrize these clean-ups: to allow some customization of the scoring depending on the nature of the system. The repairs could also help in some cases of bad input files.
Then, my first thought, like yours, was to adapt _update_coordinates, but I don't see an easy way. I don't see an object in ADT that you can cache and whose coordinates you can then modify. But, truth be told, I spent almost all of yesterday afternoon trying to figure out the reason for the bad scoring in my tests, so my head was not very clear.
As you say, the best way would be to adapt _update_coordinates and allow all the parametrization that ADT offers, but at least this (temporary) modification ensures that what you put in the input pdb/mol2 files is what you get in the vina score.
A typical workflow without modifying _update_coordinates would be:
Can you rerun the tests in Travis please?
# edit your last commit, giving it a new time stamp and hash
# (you can just leave the message as it is)
git commit --amend
# push to github, overwriting your branch
git push -f
Tests are failing for gaudi.objectives.vina because the ligand (extracted from 5ER1) has wrong atom types that are not fixed in this PR (we are deliberately skipping those operations). A separate test case must be provided for that (properly configured protein/ligand, I'd say).
Note that in the practical sessions I give in my MSc course, vina often breaks down because of unsuitable atom typing. This can easily be corrected by editing the pdbqt file, though.
JeanDI
Prof. Dr. Jean-Didier Maréchal Associate Professor
Insilichem Departament de Química Universitat Autònoma de Barcelona Edifici C.n. 08193 Cerdanyola (Barcelona) Tel: +34.935814936 e-mail: JeanDidier.Marechal@uab.es personal webpage: http://gent.uab.cat/jdidier insilichem webpage: http://www.inslichem.com
On Tue, 18 Dec 2018 at 14:40, Jaime Rodríguez-Guerra <notifications@github.com> wrote:
> Tests are failing for gaudi.objectives.vina because the ligand (extracted from 5ER1) has wrong atom types that are not fixed in this PR (we are deliberately skipping those operations). A separate test case must be provided for that (properly configured protein/ligand, I'd say).
To fix these errors, we have two strategies:
A) Provide a boolean flag repair to enable/disable automatic repairs of the structures. It would be disabled by default (the current behaviour of this PR), and users could enable it at their own risk. That is, adding atoms could mask errors when passing coordinates down to the PDBQT files. This is not desirable in my opinion.
B) Change the current tests so they use an already amended structure, ready for use with Vina. These structures should have been prepared with the AutoDockTools scripts (Prepare*.py) and the resulting PDBQT files used directly in the Molecule genes. We should make sure that using PDBQT does not cause errors in other parts of the code (it shouldn't, but you never know...), so we'd better provide some tests for that too.
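For reference, the flag in option A could look roughly like the sketch below. `preparation_kwargs` is a hypothetical helper, not code from this PR. The keyword names `repairs` and `cleanup` and the values used are assumptions based on the AD4LigandPreparation / AD4ReceptorPreparation docs linked earlier, so they should be checked there before use.

```python
def preparation_kwargs(repair=False):
    """Build keyword arguments for the ADT preparation classes.

    ASSUMPTION: `repairs` and `cleanup` are the relevant ADT keywords and
    the values below are accepted; verify against the ADT documentation.
    """
    if repair:
        # Opt-in: let ADT check hydrogens and merge non-polar hydrogens and
        # lone pairs. This can change the atom count and order.
        return {"repairs": "checkhydrogens", "cleanup": "nphs_lps"}
    # Default: leave the structure as supplied, so atoms and order are kept
    return {"repairs": "", "cleanup": ""}
```

With repair=True the atom list may no longer match the original molecule, so the flag would only be safe together with prepare_each=True.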
I'd go with option B, so the task list is:
docs/ (class docstring is enough).
ValueError: Could not find atomic number for Lp Lp and provide an informative error with a link to the corresponding part of the docs.
Any updates or ETA?
Is this PR superseded by any of the recent ones?
When using prepare_each = False, the coordinate update in the method _update_pdbqt_coordinates is performed assuming that the order of atoms is the same in the chimera and pdbqt molecules. That was not true, because AD4LigandPreparation and AD4ReceptorPreparation were changing the structure of the molecule in order to make some repairs (hydrogens) and clean-ups. Now, these functions are called with a parameter configuration that avoids structural modifications of the original molecules.