Closed IAlibay closed 1 week ago
I think this is a great idea - precisely how each atom got its charges assigned is a common pain point for users, and one that's likely to grow if we release force fields that support other charge methods than the current Sage line (AM1-BCC everywhere, except simple library charges for water and monoatomic ions). I fully agree this is a subpar user experience
Agree that there should be a way to see how charges were assigned. Since this seems like it'll be a load-bearing feature, do we want to go deeper than logging to INFO? The partial charge source could be written to the output Molecule/Topology/Interchange objects as a more general solution.
The partial charge source could be written to the output Molecule/Topology/Interchange objects as a more general solution.
If that's possible, from a developer standpoint, that would be great.
Logging would be good for your base users that probably have a script lying around to create gromacs/amber input files (edit: I'm not sure how well I'm explaining this, can expand if necessary).
Jeff and I chatted about this briefly and decided to initially try the logging approach because it should be significantly more straightforward to implement but also provide downstream users with enough information to reason through how charges were assigned.
A separate solution might involve storing full provenance on the ElectrostaticsCollection
itself. This would be significantly richer in data but more difficult to implement and potentially bloat the memory/disk size of the resulting models. If straightforward logging shapes up nicely, I don't think I'm going to further explore this idea. But it might be worth prototyping to see if the actual thing is different than I expect.
Expect this in 0.4.1, unknown date
This is a sizable feature addition that intersects with some of the most complicated parts of our products, so, even though I wrote more tests than about any feature I've ever written, I expect some rough edges when this is rolled out. I'll happily take feedback on how this could be improved, everything from obvious bugs to ergonomics.
https://github.com/openforcefield/openff-interchange/issues/1088 is the only major thing not well-supported right now. I think finding ways to consolidate logging of large molecules is another easy avenue for improvement, I just deferred it originally.
Context
Follow-up from https://github.com/openforcefield/standards/issues/68 and some DMs between @mattwthompson and I.
What currently happens
When creating an interchange object, charges will be assigned to Molecules based on the charge assignemnt hiearchy. However, it not possible to easily know which molecules were assigned charges through which method - something which users would like to know as it could affect the quality of their results.
The ask
The base request here would be add logger messages at the INFO level during
Interchange.from_smirnoff
creation which tells us what how partial charges were assigned for each unique molecule.Alternative
For now I'm just checking what has library charges manually (outside for the interchange creation step), but that makes for a difficult developer experience.