mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org
Other
171 stars 80 forks source link

Add export to RDKIT to allow for chemdraw style visualization. #1137

Closed chrisiacovella closed 10 months ago

chrisiacovella commented 1 year ago

PR Summary:

This adds in a function to export to an RDKIT RWMol (See issue #1138). This will allow a user to display a chemdraw style version of an atomistic compound. To enable structures to be drawn with correct bond order, we can now store the bond order in the bond graph as a data attribute. If you initialize a compound with RDKIt (i.e., from smiles) it will set the bond order into the bond list. the add_bond routine now accepts bond order as an argument, so one could manually define a molecule with correct bond order as well. If bond order is not defined, the data will be set to 'default' (which to_rdkit will assume means bond order 1). I didn't want to set this to 1 so as to make it clear to a user that this value was not actually set during construction.

I've added in tests to check the conversion and ensure that bond order is properly propagated when, e.g., we clone a compound. The default behavior is not changed (e.g., calling bonds() will just return the pair unless return_bond_order is set to true), so this should not break any existing functionality.

PR Checklist


codecov[bot] commented 1 year ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (7142f41) 87.13% compared to head (0dd8fa6) 87.19%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1137 +/- ## ========================================== + Coverage 87.13% 87.19% +0.06% ========================================== Files 62 62 Lines 6457 6490 +33 ========================================== + Hits 5626 5659 +33 Misses 831 831 ``` | [Files](https://app.codecov.io/gh/mosdef-hub/mbuild/pull/1137?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mosdef-hub) | Coverage Δ | | |---|---|---| | [mbuild/compound.py](https://app.codecov.io/gh/mosdef-hub/mbuild/pull/1137?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mosdef-hub#diff-bWJ1aWxkL2NvbXBvdW5kLnB5) | `97.13% <100.00%> (+0.01%)` | :arrow_up: | | [mbuild/conversion.py](https://app.codecov.io/gh/mosdef-hub/mbuild/pull/1137?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mosdef-hub#diff-bWJ1aWxkL2NvbnZlcnNpb24ucHk=) | `89.80% <100.00%> (+0.39%)` | :arrow_up: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

chrisiacovella commented 1 year ago

I was just thinking that since force overlap can create a bond, I need to update that function to allow bond order to be set when calling force overlap to create a bond. Again, mBuild won't necessarily do anything with bond order at this point, but I think adding the ability to set this consistently.

Since this isn't very high priority, I think adding in a simple "infer_bond_order" could be useful; I tried to use the function in rdkit that supposedly does this, but ran into an error, so I still need to look more closely into that.

I haven't yet tried to convert a non-atomistic particle to rdkit. I feel as this is doable, since we will explicitly set the bonds (i.e., these aren't a function of atomic species or something) and in the function I'm overwriting the names anyway to include an index. Basically just need to add some logic to know what to do if element isn't set and the name can't be parsed into an element.

daico007 commented 1 year ago

I think we can just have the rdkit for atomistic system for now (initial check for particle element as you have now should be sufficient). The conversion to CG systems could follow later. I think the force_overlap will call the add_bond method itself, so for now it will be set as "default", but having the option to set the bond order when calling force_overlap would be nice (and not too hard to implement I believe).

chrisjonesBSU commented 11 months ago

@chrisiacovella I'm taking over this PR :)

I've addressed some of the comments from @daico007 and @CalCraven. I haven't looked closely yet at the unit tests, but I can work on adding a few more (and making sure the changes made aren't breaking the ones you added)

chrisiacovella commented 11 months ago

Thanks @chrisjonesBSU I'll take a look at this. Not going to lie...I thought we merged this a while ago!

chrisjonesBSU commented 11 months ago

Ok, I think this should be ready to go. I added a unit test for the missing lines in to_rdkit that checked for particle elements and threw an error if none are found. It seems like converting to rdkit should only be done for atomistic systems, so I added a note about that in to_rdkit in Compound

chrisiacovella commented 10 months ago

Thanks for finishing this off. Everything looks good. I think we are ready to merge.