openforcefield / openff-interchange

A project (and object) for storing, manipulating, and converting molecular mechanics data.
https://docs.openforcefield.org/projects/interchange
MIT License
71 stars 22 forks source link

Unique atom types in LAMMPS. #1015

Closed jacob-votava closed 3 months ago

jacob-votava commented 3 months ago

I am trying to convert an interchange object into a lammps data file with unique atom types. I know this is possible with to_gromacs() but it doesn't seem like it's possible with to_lammps(). My work-around has been using to_gromacs() and then using intermol to convert it to a lammps file, however, I've noticed some bugs with intermol related to balancing charges and improper dihedrals, so this is not ideal.

Is there any way to assign unique atom types to lammps data files, or even to assign lammps atom types to specified atoms?

Thank you so much for reading!

mattwthompson commented 3 months ago

Could you elaborate a bit on what you're after? I don't know what you mean by "unique atom types" here, atom types should be plenty unique as written to the non-bonded sections.

Note that Interchanges from SMIRNOFF sources don't have atom types so mapping information into atom types as needed by LAMMPS, etc. is handled purely in this package.

jacob-votava commented 3 months ago

Hi! Sorry for the vagueness. To word it better, I mean atom types as defined by LAMMPS. Currently I am trying to simulate mesogens that contain 60-70 atoms, Interchange defines 10 atom "types" in LAMMPS. However, when I use to_gromacs() it defines 60-70 atom 'types'.

My analysis scripts define a director (vector along the long axis of a mesogen) based on the LAMMPS atom types. So, what I've been doing is generating a gromacs input file and then converting it to LAMMPS using intermol to get a lammps input file where each atom has it's own 'atom type' in LAMMPS. The to_gromacs() method seems to have a _merge_atom_types flag, whereas to_lammps appears to do it automatically. I'm sorry to bother you with this!

mattwthompson commented 3 months ago

In this case, are you trying to get more (~70) or fewer (10) atom types? Usually we run into problems with too many atom types, so I'm surprised to see somebody ask for the opposite! I have to question, a little bit, if atom types are the best way for you to track individual atoms in a molecule. Generally this sounds like what atom indices are used for, and OpenFF's infrastructure has partial support for respecting atom indices. If this doesn't make sense, I'm happy to take a look at some code you're working with and see if we're on the same page.

But this all makes sense to me now; I remember there being some toggleable behavior in the GROMACS files but it didn't occur to me that only one of the options is implemented with LAMMPS.

jacob-votava commented 3 months ago

I see! I meant getting more atom types, which, admittedly, may not be the best idea. I've set up my analysis scripts to select vectors using MDAnaylsis, for example to select my directors:

def calculate_frame_directors(u,director):

    Return array of normalized vectors between atoms of type director in frame, accounting for periodic boundary conditions.
    Assumes that the director is unique in each mesogen.
    Parameters:
    -----------
    u: MDAnalysis Universe object
    director: list of ints, representing the atom type of the director in the molecule.

    Returns:
    --------
    vectors: np.array
        Array of vectors between atoms of type director in frame, accounting for periodic boundary conditions.

    # Select atoms of type director into arrays atm1s atm2s.
    atm1s = u.select_atoms(f'type {director[0]}')
    atm2s = u.select_atoms(f'type {director[1]}')
    # Calculate vectors between atoms of type director in frame, accounting for periodic boundary conditions.
    vectors = atm1s.positions - atm2s.positions
    Lbox = u.dimensions[:3]
    vectors = vectors - np.rint(vectors/Lbox)*Lbox

    # Normalize vectors
    vectors = vectors/np.linalg.norm(vectors,axis=1)[:,np.newaxis]

    return vectors

It's definitely not the most efficient/ideal way to do it, but I designed the script assuming that in the future I would be mixing mesogens with slight changes to their chains, polymerizing them, etc. so it would be easiest to select based on atom type. (Since the other force fields I used assigned a unique atom type to each atom I defined in the unit vector of the mesogen.

I'm sorry to hassle you, but how would you recommend implementing this with the openFF style?

Again, thank you so much for your help with all of this, and for your responses.

mrshirts commented 3 months ago

I would PROBABLY go away from atom types, since that would require always preserving the same type of parameterization engine and name. Changes in the chemistry 1-2 atoms away can change the atom type. I would go to atom ID's selected manually in a lookup table.

I'm assuming you are doing this to determine phases of liquid crystals, and it's not 100% required that the vectors be exactly the same between mesogens - each time you pick a new molecule, the exact structure of the aligned phases will change - so it's more about having a reasonable vector so that one can identify the difference in alignment between the phases.

To do something semiautomated, it probably would be better to do something like principle component analysis on a set of configurations for each molecule and choose two atoms IDs that are the closest to being along the major axis, or even instantaneously picking the vector by doing PCA on each molecular configuration, and identifying the major axis for each one (but that could be expensive).

jacob-votava commented 3 months ago

Thank you! I will look into taking an alternative approach that does not depend as heavily on atom typing.

I appreciate both of your help.