SEDenmarkLab / molli_firstgen

In silico library generation tool box

Roche/Molli-firstgen merge #27

Closed ianrinehart closed 1 year ago

ianrinehart commented 1 year ago

molli firstgen + roche version

ianrinehart commented 1 year ago

@esalx I would like to clone the main repository after this pull request is finished. I was hoping to do that sooner rather than later if possible. Please let me know if there is anything that I can do to expedite this.

esalx commented 1 year ago

Sorry, this is long overdue... I'll let you know asap.

ianrinehart commented 1 year ago

OK, no problem. Thanks, man. I just wanted to make sure that it was still on your radar. I'm not trying to give you more things to do; I know that you are really busy.

ianrinehart commented 1 year ago

@esalx I made another change while working and pushed it to my development branch. It is a minor change to the cdxml parsing script, and you may want to check it to be sure that I am not introducing an unacceptable change.

There was a condition in there that checked for an atom text of "#" and replaced it with chlorine, "Cl". This seemed oddly specific, and I commented that.

I also changed the way that atoms are parsed from the cdxml: when an atom is instantiated as Atom(text, text, text), I manually replace "NH2" or "NH" with "N", and "OH" with "O". This is meant to catch cases where an explicit hydrogen is drawn on a heteroatom. If that is OK, I think it makes sense to assume that people need to add hydrogens with openbabel and do an initial minimization, since cdxml files usually contain drawings with implicit hydrogens that require it. This solution is pretty simple and works on the molecules I'm using, but it would fail with "SH" or other cases.
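For illustration only, here is a minimal sketch of a pattern-based version of that label cleanup that would also cover "SH" and similar labels. It is not the code in parsing/cdxml.py; the function name `strip_explicit_h` and the regular expression are assumptions made up for this example.

```python
import re

# Hypothetical helper, NOT the actual molli code.
# Matches an element symbol followed by "H" and an optional count,
# e.g. "NH2", "NH", "OH", "SH".
_EXPLICIT_H = re.compile(r"^([A-Z][a-z]?)H\d*$")


def strip_explicit_h(label: str) -> str:
    """Return the heavy-atom symbol for labels such as 'NH2' or 'OH'.

    Labels that do not look like <element>H<n> are returned unchanged
    (apart from surrounding whitespace being stripped).
    """
    label = label.strip()
    match = _EXPLICIT_H.match(label)
    return match.group(1) if match else label
```

Two-letter symbols such as "Cl" or "Rh" pass through untouched because the pattern requires an uppercase "H" after the element symbol. Hydrogens would still need to be added back afterwards, as noted above.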

Just wanted to make you aware of that. [it is in parsing/cdxml.py]

ianrinehart commented 1 year ago

Adding a bit to the stereochemical hints section. In testing, a lot of hypervalent main group centers (like those in a sulfone or phosphine oxide) are not preoptimized effectively by xtb using either gfnff or gfn2, but when a dash or wedge is applied to one of the atom's ligands, it works.

It's not perfect, but a case-by-case inclusion of the z-hints for these would make the system more robust. Sometimes, even with MD to find conformers, they aren't relaxing into their proper geometries.
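To make the idea of a z-hint concrete, here is a rough sketch of lifting a flat 2D drawing out of the plane before preoptimization. Everything in it is hypothetical: the `Atom2D` class, the bond tuples, the style names "wedge" and "dash", and the offset size are assumptions for this example and do not reflect how molli stores or applies its hints.

```python
from dataclasses import dataclass


@dataclass
class Atom2D:
    """A drawn atom; z starts at 0.0 because the drawing is flat."""
    symbol: str
    x: float
    y: float
    z: float = 0.0


# Assumed convention: a wedge pushes the ligand toward the viewer (+z),
# a dash pushes it away (-z). Units are those of the drawing.
Z_OFFSET = {"wedge": 1.0, "dash": -1.0}


def apply_z_hints(atoms, bonds, scale=1.0):
    """Displace ligand atoms along z according to their bond style.

    bonds is an iterable of (center_index, ligand_index, style) tuples.
    Only the ligand atom is moved; plain bonds leave z untouched.
    """
    for _center, ligand, style in bonds:
        atoms[ligand].z += scale * Z_OFFSET.get(style, 0.0)
    return atoms
```

With a sulfone drawn flat, marking one S=O bond as a wedge gives the optimizer a starting geometry that is no longer planar around sulfur.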

I'm not sure which is better: cancel this pull request, make more changes, and start a new one that includes them, or finish this pull request with this version, hold my later changes until afterwards, and then open a second pull request.

ianrinehart commented 1 year ago

Hey, is it possible to finish this pull request or to make the branch public? I reference it as a dependency in my Lucid_Somnambulist package, and it would be ideal if it could be public and pulled into main. Just let me know if there is anything that I can do.

If we cannot make the branch public with the rest of the repo private, can I add the entire .zip as part of Lucid_Somnambulist for now, then update it later? Let me know what is preferable.

esalx commented 1 year ago

I made the repo public as promised, but I will also merge your contributions. Let me just quickly glance over them!