mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org
Other
171 stars 80 forks source link

Reorder bond, angle, and dihedral in write_lammpsdata #1071

Closed jaclark5 closed 1 year ago

jaclark5 commented 1 year ago

PR Summary:

Resolves #1070

I haven't fixed the tests to accept these changes because I expect there to be comments.

PR Checklist


daico007 commented 1 year ago

I think I am good with this change! I didn't realize the order was not enforced in the lammps writer.

codecov[bot] commented 1 year ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.54%. Comparing base (9d618c0) to head (c7a0099). Report is 132 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1071 +/- ## ========================================== + Coverage 90.49% 90.54% +0.04% ========================================== Files 61 61 Lines 6157 6186 +29 ========================================== + Hits 5572 5601 +29 Misses 585 585 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

jaclark5 commented 1 year ago

The reported failures are due to a merger of the main branch... why doesn't this repository have a develop branch?

daico007 commented 1 year ago

I can look into the issue with the test (seems like it's stemming from something outside the scope of this PR). Regarding missing of the develop branch, we just don't feel the need to have one (at least up until now). For every new feature added, we usually try to make sure it's (kinda) ready for production (passed all the CIs test and don't interfere with other existing features, and hence, can be directly merged to main) but I am open to start having a develop branch if you think would make contributing to it easier (i.e., not having to worry about issue outside the scope of the PR), that way we can have things move faster and not having to worry about having a not-so-clean main. Do you have any opinion @CalCraven @chrisiacovella @bc118 ?

CalCraven commented 1 year ago

A develop branch is definitely a nice feature to have. The way I look at it, essentially the dev version of this package is similar to a dev branch, and in this case then the most recent minor release available on Conda is the main branch that has been fully tested and stabilized. However, there are definitely some added benefits to having a more three tiered system with the Conda release, a main branch, and then a dev branch, as noted by @jaclark5. Something to consider adding (and something we have added to GMSO where the majority of the MoSDeF additions have focused in the past year).

CalCraven commented 1 year ago

I think a way we could improve the sorting would be to replace a line like:

atom_types = list(set(atom.type for atom in structure.atoms))
atom_types.sort(key=natural_sort)
lx = len(atom_types)
angle_sets = [x[-1] for _, x in unique_angle_types.items()]
ordered_angle_sets = [
    (atom_types[x], atom_types[y])
    for x in range(lx)
    for y in range(x, lx)
    if (atom_types[x], atom_types[y]) in angle_sets
]

which scales at O(N^3)

with a keyed sort which scales on O(NlogN) as far as I'm aware.

atom_types = list(set(atom.type for atom in structure.atoms))
atom_types.sort(key=natural_sort)
angle_sets = [
    (x[2], x[-1][0], x[-1][1]) for _, x in unique_angle_types.items()
]
angle_sets.sort(key=lambda x: atom_types.index(x[0])) #improved sorting function
ordered_angle_sets = angle_sets
jaclark5 commented 1 year ago

Sure the sorting function is more efficient if there's a single quantity to sort by, but unless I'm reading it wrong, your proposed solution isn't sorting the second or third atom_types in the angle and so is insufficient. I can use the sorting function with something like this:

magnitude = np.ceil(np.log10(len(atom_types)))
angle_sets.sort(
    key=lambda x: atom_types.index(x[0]) * 10**(2*mangnitude) 
        + atom_types.index(x[1]) * 10**mangnitude 
        + atom_types.index(x[2]))
)
CalCraven commented 1 year ago

Sure the sorting function is more efficient if there's a single quantity to sort by, but unless I'm reading it wrong, your proposed solution isn't sorting the second or third atom_types in the angle and so is insufficient. I can use the sorting function with something like this:

That is correct, I think your solution here completes it!

We have discussed this further among the devs, and I tested out some 100 unique atomtypes 100,000 atom systems, and the writing isn't the holdup as compared to just atomtyping the system, so I think once the tests are passing here, we will be good to merge this. I'll also ping the original issue #1070 once this feature is implemented on the GMSO side of things as well.

jaclark5 commented 1 year ago

What is going on with the docs?

daico007 commented 1 year ago

Oh, I think the docs sometimes just doesn't build properly for PR. I will try to review this PR today. @CalCraven seems like most of your comments have been addressed too.

jaclark5 commented 1 year ago

I'm not seeing the output for LGTM, is that because the docs failed? It seems that there's an unknown attribute "spin"?

CalCraven commented 1 year ago

I believe the LGTM checks as of 22-12-16 have been discontinued and are implemented now in CodeQL. So I'm not too worried about that. I'm seeing an issue in read the docs for mbuild.coordinate_transform import translate_to as well as mbuild.coordinate_transform import spin and mbuild.coordinate_transform import rotate. These functions aren't touched in this PR, so it shouldn't be an issue here. It looks like maybe changes implemented in this PR affected them, and weren't caught due to other issues with the docs, so I think we can merge this as is, and address the docs issues in another PR @daico007.