Closed QDelobel closed 2 years ago
Hey,
Is this being submitted on a cluster where you can specify memory resource requirements? Or is there a way to check available RAM and scratch space? One thing to try is removing the keywords in the input file that control memory resources. I try to have reasonable defaults so nobody has to deal with that. The more resources available on the node the computations run on, the more default resources are allocated to the QM jobs. Another thing to be careful about: the input is a PDB, which is okay sometimes. However, since there is no explicit bond order in a PDB file, openbabel will "guess" what it thinks the bond order should be. In some cases that will produce something you might not want, so it's safer to use a file with explicit bond orders. I also notice there are no hydrogens on the sugar oxygens (not sure which protonation state you are going for). I'm running on our cluster to see if the QM still crashes. It's using PCM by default since there are phosphate groups (highly charged), and sometimes hydrogens can transfer with just gas-phase QM geometry optimization.
Yes, I run it on a cluster, which is why I specified the number of cores and the maximum memory: when I didn't, it launched with all of the available resources (12 CPUs and 52 GB), which could be harmful for other jobs. As for scratch space, I don't think it should be a problem.
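For the question above about checking available RAM and scratch space, here is a minimal sketch. It assumes a Linux node (it reads `/proc/meminfo`, and `MemAvailable` is missing on very old kernels), and the helper names are illustrative, not part of poltype.

```python
import shutil

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for sizes)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info

def available_ram_gib(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in GiB (Linux only)."""
    with open(meminfo_path) as handle:
        info = parse_meminfo(handle.read())
    return info["MemAvailable"] / 1024**2  # kB -> GiB

def free_scratch_gb(path):
    """Return free space in GiB on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3
```

Running `available_ram_gib()` and `free_scratch_gb("/path/to/scratch")` inside the batch job itself (the path is a placeholder) reports what the compute node sees, which can differ from the login node.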
Okay I will let you know if and when the QM converges.
I added hydrogens to the sugar and the optimization converges fine. I think a charge of -7 makes things particularly difficult: psi4's geomeTRIC and optking optimizers, and even Gaussian's, all failed...
Hello, yes, it did the same for me. psi4 terminated normally at first with the optimization complete, but then an error appeared in my output file and in the poltype.log generated on the cluster where I run the calculation: some sort of ValueError about 'bond created after optimization' or 'too many bad steps'. I even tried with an SDF file of my ligand and got the same errors. And I think the charge of -7 came from the missing hydrogens (the ones you added), because there wasn't any charge in the input file. output-poltype.zip
Hey,
I did not get this error. It is usually caused by hydrogens moving during optimization, or, I think, by someone changing the structure in the same folder without deleting the right files. PCM is used automatically to prevent hydrogens from moving. PCM may not always be perfect, but we also restrain dihedrals in an extended conformation, which I hope also keeps the floppy parts of the molecule from getting too close, where hydrogens have an easier time migrating. Was this done in a separate folder?
I just gave you the code in my submit file, which I use to launch the calculation on my cluster. I didn't touch the structure in the folder while it was running, so I don't think it's a separate-folder problem. submit-file.sh.zip
https://utexas.box.com/s/7udhp6qrai3ihslg1vsx86vgg8usffic Here is FBP. I don't understand what the issue was when you were running it; I can look at the input files. I just added hydrogens to the sugar.
Okay, the input SDF file still doesn't have hydrogens attached to the sugar (which crashes for me too). It's very highly charged, and that will cause problems.
Oh, I see. I'm a little new to molecular dynamics and parametrization; my PhD is co-supervised between cancerology-virology and molecular dynamics, and I have studied more biology and viruses. So just a naive question: how exactly did you specify the coordinates to add the hydrogens on the sugar?
For the other one, my guess from reading the error in the QM log files is that the new geomeTRIC optimization engine doesn't play well with PCM in psi4. The newer commit has a "try the new QM opt algorithm, but if it fails, try the old psi4 default" fallback. I'll see if that works.
Ah okay, there are lots of ways to add them, but I like to open the SDF file in PyMOL, click on each oxygen atom you want to add a hydrogen to, then click add hydrogens on the right-hand side.
Okay, I was right: it finished in a minute because the molecule is so small.
You can run with the new commit, and that should fix the issue with PCM failing on the new psi4 optimizer we added.
OK, thank you. How do I proceed to run with this new commit for my second, small ligand?
I also tried to run my FBP ligand, like the one you ran that finished, but I got an error in poltype.log about a segmentation fault. Do you have any idea what the problem could be?
It seems to be like the other issue that someone had: https://github.com/TinkerTools/poltype2/issues/11
This sometimes happens when there is not enough memory. You can also try just rerunning to see if it's reproducible (since I know it completed for me).
Just for information, how much memory did you use? OK, I just reran it twice and it ended with the same segmentation fault as in the zip file I gave you yesterday.
Can we do a simpler test on something like a water molecule? This can rule out issues like the conda installation. It might also be an issue with writing to scratch space (not enough space, or wrong permissions where it's trying to write, etc.). If that works, then try doubling whatever resource inputs you used, etc.
I tried the same water SDF as the one in the example files (I copied water.sdf to another location), but two minutes after launching poltype I got a termination signal in poledit.x, as shown in the poltype.log: Water.zip
Meanwhile, I relaunched my smaller OXL ligand; it has been running for hours and hasn't crashed. 3srd_E_OXL.sdf.zip
So I don't know what happened with water.
Seems like it may be a Tinker installation issue.
I copied the poledit command from the poltype.log file and replaced the prm file with the one on my filesystem (poledit 1 water_3D.gdmaout /home/bdw2292/poltype2/ParameterFiles/amoebabio18_header.prm < water_3D-peditin.txt), and it works fine. Can you run some other Tinker programs? I'm guessing analyze at least works, because it is used to check the Tinker version at the beginning of poltype.
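Re-running a logged command by hand, as described above, can also be scripted. This is a minimal sketch, assuming the logged command uses a single whitespace-separated `<` stdin redirect as the poledit line does; `split_redirect` and `rerun` are hypothetical helpers, not part of poltype or Tinker.

```python
import os
import shlex
import subprocess

def split_redirect(command):
    """Split 'prog args < infile' into (argv, infile); infile is None if absent."""
    tokens = shlex.split(command)
    if "<" in tokens:
        pos = tokens.index("<")
        return tokens[:pos], tokens[pos + 1]
    return tokens, None

def rerun(command, cwd=None):
    """Run the command, feeding the redirected file to stdin, and return the result."""
    argv, infile = split_redirect(command)
    if infile is None:
        return subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
    with open(os.path.join(cwd or "", infile)) as stdin:
        return subprocess.run(argv, cwd=cwd, stdin=stdin,
                              capture_output=True, text=True)
```

Checking `returncode` and `stderr` on the returned object shows whether the program itself fails outside poltype, which separates an installation problem from a poltype problem.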
For your other molecule, just be careful about hydrogens. Where did the SDF file come from?
I ran analyze and poledit.x and both seem to work.
The SDF file for OXL came from the corresponding structure on the PDB website, https://www.rcsb.org/structure/3srd (I took the OXL SDF from Download Instance Coordinates in the Small Molecules section).
Can you try copying the poledit command from the poltype log file and running it manually in your terminal?
Okay, I guess if they don't have information about which protonation state it should be in, you can try https://docs.chemaxon.com/display/docs/pka-plugin.md . Also, poltype uses a method from a Durrant Lab paper for ionization state prediction at pH ~7. I think the results are comparable to ChemAxon's (but I haven't tested at large scale). The paper (Dimorphite-DL) has more details about accuracy.
Ah, I see where the problem is. When I launch the test on my cluster, it takes any free node, but the one chosen for the water test gives an error with every command. I don't know exactly why, though.
I tested different tinker.x commands on node '13' (the one where the water test ran) and all failed, but on another node, like '80', they work normally.
I have added this warning: "/home/bdw2292/poltype2/PoltypeModules/poltype.py:3546: UserWarning: No hydrogens detected in input file!" It will go out in the next commit.
Sounds like some of your nodes may have different Tinker installations.
Ah, and for the pKa plugin in ChemAxon, do you have any information on how to use or download it, so I can check the state of my ligands more precisely? I looked a little and didn't find anything about installation.
I think it may require a ChemAxon license; there is some documentation on how to use the pKa plugin. Alternatively, poltype outputs IonizationState files when run on any molecule. What you have may be correct for the carboxylic acids (there are two of them). If the pKa is 5 and the pH is 7, then I guess they should be deprotonated.
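The pKa 5 / pH 7 reasoning can be made quantitative with the Henderson-Hasselbalch relation. A small sketch; the pKa of 5 is the illustrative value from the comment above, not a measured value for this ligand.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an acid HA present as A- at the given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

# pKa 5 at pH 7: the acid is about 99% deprotonated
print(f"{fraction_deprotonated(5.0, 7.0):.3f}")  # -> 0.990
```

Two pH units above the pKa leaves only about 1% protonated, which is why the deprotonated carboxylates are the reasonable input state here.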
Hello, I ran the water test on a good node and it generated final.xyz and final.key, so I think it worked, if you have time to take a look.
I will try my ligand with more memory, since that seems to be the origin of the problem.
Hello, I continued testing poltype with more memory, but I ran into an issue in the poltype.log file where it seems to run some sort of displacement indefinitely. If you have some time, do you have any idea where the problem could be, since it worked normally for water (last message)?
I am also trying my bigger ligand, FBP (with the hydrogens, as you said). It seems to follow the same path but more slowly, since it's bigger (it hasn't finished, so maybe it will be fine), and in your test on FBP you did obtain the xyz/key.
Thank you for your answer
Hello,
So, the displacements are output from Psi4's PCM optimizer (for geometry optimization) that gets piped to stdout (which I pipe to poltype.log). Just be careful about the input protonation state. I quickly added hydrogens only to your sugar for testing (the carbons still didn't have hydrogens), so you can see in the log file that the hydrogens on some of your carbons were added automatically! Previously I only added them automatically for carbons missing one hydrogen, but with the new commit it will also add hydrogens to carbons missing two bonds (valence < 4). The protonation state issue may be independent of any psi4 problems, though. The PCM optimizer in psi4 (used by default when there is concentrated charge, like the phosphates in your molecule, to prevent hydrogens from jumping to other atoms) is not great and apparently isn't compatible with our new geomeTRIC psi4 optimizer. As for the sulfate you sent me, mine finished fine; maybe not enough memory in your inputs? The program ended up using 20 GB of RAM and 8 threads by default.
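To check an input for missing hydrogens before submitting, something like the following can help. It is only a sketch of the "carbon with valence < 4" idea, not poltype's actual implementation: it reads the first V2000 mol block of an SDF, ignores formal charges, and treats the bond-type column as a plain bond order (so aromatic type 4 would be miscounted).

```python
def parse_molblock(molblock):
    """Return (element symbols, bonds) from a V2000 mol block."""
    lines = molblock.splitlines()
    counts = lines[3]  # counts line: atoms in columns 1-3, bonds in columns 4-6
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    symbols = [line.split()[3] for line in lines[4:4 + n_atoms]]
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        # fixed-width fields: first atom, second atom, bond type
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return symbols, bonds

def hydrogen_report(molblock):
    """Return (number of H atoms, 1-based indices of carbons with valence < 4)."""
    symbols, bonds = parse_molblock(molblock)
    valence = [0] * len(symbols)
    for first, second, order in bonds:
        valence[first - 1] += order
        valence[second - 1] += order
    low = [i + 1 for i, s in enumerate(symbols) if s == "C" and valence[i] < 4]
    return symbols.count("H"), low
```

A result with zero hydrogens, or any flagged carbons, means the file should be fixed by hand (for example in PyMOL) rather than left to automatic hydrogen addition.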
Hello, thank you for your answers. I ran FBP with all the missing hydrogens added, but there was an error in the poltype.log (about a missing option, it seems?). Here are the generated files and a screen capture of the error at the end.
If you have some insight into this error, thank you for your time. And if possible, could you send me your result for the sulfate so I can see it?
How much scratch space do you have available on your submission node?
I have 3.6 TB available on the node where this calculation was run.
Here is the finished SO4: https://utexas.box.com/s/wlkxm2cokharx7hnwas7ja0rru2xbdj4
I guess one thing to try is resubmitting the psi4 command manually in your terminal to try to reproduce the issue. I am running on our cluster. There are other things to play with, like lowering the basis set or increasing the input resources (only for testing/debugging; we want default settings for actual parameter derivation).
Here is everything up to electrostatic potential fitting step for the molecule with two phosphates and sugar (it finished the QM there on my end). https://utexas.box.com/s/yqalmsmnxupm04mwibwm32lgyjhzcxpq
Thank you for the logs from your side. I found that mine stopped suddenly after generating the grid, with the error shown in my New-esp.log file (from the FBP.zip in my last message). The only difference I could see was the memory, so I will try with more, within my available resources.
Regarding lowering the basis set, how exactly could I do that?
Hello,
I think it was a scratch space problem. It wasn't indicated clearly, but when I submitted my job on my cluster, I discovered in the generated output file that a file size limit was set at 20 GB. I retried with 100 GB and it finished my FBP ligand, with final.key and final.xyz. I just sent you the drive link with the generated files (you should have access); does everything look OK on your side?
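Since the limit only showed up in the job output, it can help to measure how much a job has written to its scratch directory and compare that with the limit. A minimal sketch with hypothetical helper names; the 20 GB figure is the limit mentioned above, and the scratch path is whatever your scheduler assigns.

```python
import os

def directory_size_bytes(path):
    """Sum the sizes of all regular files under `path` (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                continue  # file vanished while the job was still running
    return total

def over_limit(path, limit_gb):
    """True if the files under `path` exceed `limit_gb` (in GiB)."""
    return directory_size_bytes(path) > limit_gb * 1024**3
```

Calling `over_limit(scratch_dir, 20)` periodically during a run shows whether the QM scratch files are approaching the quota before the job dies.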
https://drive.google.com/drive/folders/1izO-RJlWv0y931ACmbkoiS4jY4o8QnN8?usp=sharing
Glad it worked out. Your torsion profile plots look very reasonable (see the OPENME folder). Just let me know whether the instructions on the readme page make sense. You just want the red curve (AMOEBA) to match the blue curve (QM) (https://github.com/TinkerTools/poltype2/blob/master/README/README_OUTPUT.MD#openme-plots), reachable by hyperlink from the main readme file (README.MD).
Hi, thanks for your help,
For now most of my tests are finishing, so it seems to have been resolved (and my curves match well), but one test with this small ligand didn't finish, and the cause seems to be a missing QM SP energy value for a torsion during QM optimization. Could you test on your side whether you get the same error, and do you have any idea what is causing it?
No problem. This is one you already sent me; I'm not sure what went wrong with psi4 on your end. You can try manually reproducing it by copying the command from the log file and pasting it in the terminal as a start. Here is the latest.
Are you talking about the command line that reads: Calling: psi4 3srd_E_OXL-opt_1.psi4 3srd_E_OXL-opt_1.log path = /scratch/qdelobel/845783.master.lct.jussieu.fr/Temp and that gives me the long displacement calculations and the error?
Also, I noticed that in your test the poltype log didn't show any QM optimization steps at the beginning, unlike mine. Could you tell me which version of Psi4/Gaussian you are running on your cluster, in case it's related?
FINDIF
R. A. King and Jonathon Misiewicz
----------------------------------------------------------
Using finite-differences of energies to determine gradients.
Generating geometries for use with 3-point formula.
Displacement size will be 5.00e-03.
Number of atoms is 6.
Number of symmetric SALCs is 12.
Translations projected? 1.
Rotations projected? 1.
Number of geometries (including reference) is 25.
25 displacements needed ...
//>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>//
// FiniteDifference Computations //
//<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<//
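The "displacements" in this log excerpt are single-point energies used to build a gradient numerically. The counts in the excerpt are consistent with a 3-point formula that needs one plus and one minus displacement per symmetric SALC, plus the reference geometry. A small arithmetic sketch (the function name is illustrative):

```python
def n_geometries_3point(n_symmetric_salcs):
    """Geometries for a 3-point finite-difference gradient:
    one +step and one -step per symmetric SALC, plus the reference."""
    return 2 * n_symmetric_salcs + 1

print(n_geometries_3point(12))  # 12 SALCs, as in the log above -> 25
```

Because every optimization step needs a new gradient, this block of energy evaluations repeats at each step, which would explain a log that appears to run displacements for a very long time.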
Just in case: the psi4 version running on my cluster is 1.3.2. Could that cause a problem, since you run version 1.6.1?
Possibly. Let me know if that one works; if that turns out to be the case, I will "freeze" the psi4 package version in our yml file.
Hello, I retried the test and it is still blocked at the QM optimization step. What I found is that some of my ligands pass this QM optimization (like FBP, PYR, or GOL), but those with no hydrogens and a high charge, like SO4 and OXL (I even checked in PyMOL, but no hydrogens were needed for those small ions), don't pass it: they either retry indefinitely or crash after that step. Right now my PO4 ion is in the same state, where it keeps trying to optimize but doesn't seem to succeed.
I sent all the files for each ligand I mentioned, so maybe it's not the psi4 version, just these small molecules that are the problem? I could also retry the small ones that failed with a new commit, since the one I have is from two weeks ago. https://drive.google.com/drive/folders/1izO-RJlWv0y931ACmbkoiS4jY4o8QnN8?usp=sharing
I also have a question now that I have finalized the parametrization of the FBP ligand. I need to convert my PDB structure to xyz format with Tinker, but how can I use the generated key together with my force field parameter file so that the ligand bound to my protein is also converted from PDB to xyz? I tried copying the generated key into my amoebabio18.prm, but even then pdbxyz did not convert the ligand along with the protein. Also, I have this same ligand four times in my structure (since it's a tetramer): should I run poltype for all four, or should one be enough, since they all have the same binding site?
At least for the cases where psi4 works for me but not on the other end, it's definitely a psi4 issue (input memory resources, version, etc.). The rule of thumb, though, is that if the molecule is highly charged and we use PCM (to prevent any hydrogens from moving), then the psi4 optimization algorithm will take longer and may have more convergence issues. With the last commit, if enough psi4 torsion optimizations fail, it tries xtb (which doesn't have the PCM feature, so I don't know yet whether we will run into cases where hydrogens move).
For generating the Tinker xyz, the main thing is that we have a tool called pdbxyz that works by recognizing PDB topology and mapping it to type numbers in the prm file. pdbxyz doesn't recognize ligand topology, so it doesn't know how to map a ligand atom index in the PDB to a type in the ligand xyz file. If the atom order of the ligand is the same in the xyz and the PDB, it is much simpler to go through the xyz made by pdbxyz manually and add the correct ligand types yourself. If not, you need some topology/SMILES matching. Poltype can do this for you (see the example on the main page where a complexed ligand-protein PDB is given as input). You just need to give ligandxyzfilenamelist and keyfilenamelist as inputs as well. You can add the keyword "boxonly" to quit the program after making boxes. I can add another one like "xyzonly" to quit after making the Tinker xyz file with protein and ligands complexed.
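For the simple case above, where the ligand atoms appear in the same order in both files, the manual type assignment could be scripted roughly as follows. This is a sketch under several assumptions: Tinker xyz rows are "index, name, x, y, z, type, bonded atoms", the complex file has a single header line (no periodic-box line), the ligand atoms are contiguous, and Tinker accepts whitespace-separated columns. The helper names are hypothetical, and the result is worth checking with a Tinker program before use.

```python
def ligand_types(ligand_xyz_lines):
    """Read atom types (6th column) from a Tinker xyz, skipping the header line."""
    return [line.split()[5] for line in ligand_xyz_lines[1:] if line.split()]

def assign_ligand_types(complex_lines, first_ligand_index, types):
    """Overwrite the type column of the ligand atoms in a complex Tinker xyz.

    `first_ligand_index` is the 1-based index of the first ligand atom in the
    complex; the ligand atoms must follow in the same order as in `types`.
    """
    out = list(complex_lines)
    for offset, atom_type in enumerate(types):
        row = first_ligand_index + offset  # header is row 0, so atom i is row i
        fields = out[row].split()
        if int(fields[0]) != row:
            raise ValueError(f"unexpected atom index on line {row + 1}")
        fields[5] = atom_type
        out[row] = " ".join(fields)
    return out
```

With four copies of the same ligand in a tetramer, the same type list would be applied four times with different `first_ligand_index` values, provided each copy keeps the same atom order.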
Thank you for all this information. Maybe I will get the latest commit once the parametrizations that are currently working are complete, since the problems occur on systems that don't have hydrogens. So for poltype to build the protein-ligand xyz for me, I would need to create a new poltype.ini with these parameters:
uncomplexedproteinpdbname='name'.pdb
complexedproteinpdbname='name'.pdb
binding
keyfilenamelist=FBP.key
ligandxyzfilenamelist=FBP.xyz
and just run the same poltype command line, if I understood correctly?
Also, yes, in my case it would be useful to quit after the Tinker xyz file with the protein-ligand complex is made, if it's not too much work.
Hello,
I'm a doctoral student trying to parametrize a ligand using your software, but I'm currently running into a wall with a problem that seems to be memory allocation: MemoryError: std::bad_alloc
I'm attaching the files generated by the test, in case you have some time to help me resolve this issue. Poltype-FBP.zip
Thank you for your attention.