MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.32k stars 651 forks source link

writing LAMMPS output: "TypeError: iteration over a 0-d array" #3945

Closed jamesdalg closed 1 year ago

jamesdalg commented 1 year ago

Expected behavior

Expected Behavior: I want to write a test LAMMPS output file, but the writer generates a typeError: iteration over a 0-d array ``

Actual behavior

/code/python/test_model/venv/bin/python /code/python/test_model/reprex_mda.py [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)] [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)] [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)] Traceback (most recent call last): File "/code/python/test_model/reprex_mda.py", line 44, in LW.write(testu.atoms) File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/groups.py", line 4599, in check_args return func(*args, **kwargs) File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/coordinates/LAMMPS.py", line 441, in write self._write_dimensions(atoms.dimensions) File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/coordinates/LAMMPS.py", line 355, in _write_dimensions triv = self.convert_pos_to_native(mdamath.triclinic_vectors( File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/lib/mdamath.py", line 352, in triclinic_vectors lx, ly, lz, alpha, beta, gamma = dim TypeError: iteration over a 0-d array

Code to reproduce the behavior

import MDAnalysis as mda
import pandas as pd
#create uni
testdata=pd.DataFrame([0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0])
import numpy as np
num_residues=len(list(set(testdata[0])))
testu = mda.Universe.empty(n_atoms=testdata.shape[0], residue_segindex=np.arange(0,num_residues,1),n_residues=num_residues, atom_resindex=testdata[0], trajectory=True)
num_atoms=testdata.shape[0]
testu.add_TopologyAttr('resid', np.arange(0,num_residues,1) )
testu.add_TopologyAttr('type', testdata[0])
testu.add_TopologyAttr('mass',  [1]*num_atoms)
testu.add_TopologyAttr('charge',  [1]*num_atoms)
#bonds
bonds = []
for o in range(0, testdata.shape[0]-1, 3):
    bonds.extend([(o, o+1), (o, o+2)])

testu.add_TopologyAttr('bonds', bonds)
testu.bonds
#angles
angles = []
for o in range(0, testdata.shape[0]-1, 3):
   angles += [(o, o+1, o+2)]

print(angles)
testu.add_angles(angles)

impropers = []
for o in range(0, testdata.shape[0]-1, 4):
   impropers += [(o, o+1, o+2,o+3)]

print(impropers)
testu.add_impropers(impropers)

dihedrals = []
for o in range(0, testdata.shape[0]-1, 4):
   dihedrals += [(o, o+1, o+2,o+3)]

print(dihedrals)
testu.add_dihedrals(dihedrals)

with mda.coordinates.LAMMPS.DATAWriter('testtest_7bead.data') as LW:
    LW.write(testu.atoms)

....

Current version of MDAnalysis

IAlibay commented 1 year ago

So the error you are specifically encountering appears to be related to a missing dimensions entry which is expected (i.e. you need to set u.atoms.dimensions to something).

That being said, fixing that doesn't necessarily fix the code. You then encounter a TypeError when selecting types which I'm not too sure about from a quick glance @richardjgowers might have a better answer here having last touched that part of the code.

There also appears to be another issue with the LAMMPS writer where I believe this line https://github.com/MDAnalysis/mdanalysis/blob/dd92db83ac6533c39166c724c5425cbe0315a46d/package/MDAnalysis/coordinates/LAMMPS.py#L323 will make it so that types with a 0 value won't have masses? Are type 0 atoms a normal thing? (i.e. should they be supported?).

jamesdalg commented 1 year ago

I've now added positions and a dimensions attribute to the simulation:

import MDAnalysis as mda
import pandas as pd
#create uni
testdata=pd.DataFrame([0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0])
import numpy as np
num_residues=len(list(set(testdata[0])))
testu = mda.Universe.empty(n_atoms=testdata.shape[0], residue_segindex=np.arange(0,num_residues,1),n_residues=num_residues, atom_resindex=testdata[0], trajectory=True)
num_atoms=testdata.shape[0]
testu.add_TopologyAttr('resid', np.arange(0,num_residues,1) )
testu.add_TopologyAttr('type', testdata[0])
testu.add_TopologyAttr('mass',  [1]*num_atoms)
testu.add_TopologyAttr('charge',  [1]*num_atoms)
#positions
num_atoms=testdata.shape[0]
bead = np.array([[ 0.5,        0.5,       0.5      ]])  #choose the middle of the cube.
grid_size = 10
spacing = 8

coordinates = []

for i in range(num_atoms):
    x = spacing * (i % grid_size)
    y = spacing * ((i // grid_size) % grid_size)
    z = spacing * (i // (grid_size * grid_size))

    xyz = np.array([x, y, z])

    coordinates.extend(bead + xyz.T)

print(coordinates[:10])

coord_array = np.array(coordinates)
print(coord_array )
testu.atoms.positions = coord_array

#bonds
bonds = []
for o in range(0, testdata.shape[0]-1, 3):
    bonds.extend([(o, o+1), (o, o+2)])

testu.add_TopologyAttr('bonds', bonds)
testu.bonds
#angles
angles = []
for o in range(0, testdata.shape[0]-1, 3):
   angles += [(o, o+1, o+2)]

print(angles)
testu.add_angles(angles)

impropers = []
for o in range(0, testdata.shape[0]-1, 4):
   impropers += [(o, o+1, o+2,o+3)]

print(impropers)
testu.add_impropers(impropers)

dihedrals = []
for o in range(0, testdata.shape[0]-1, 4):
   dihedrals += [(o, o+1, o+2,o+3)]

print(dihedrals)
testu.add_dihedrals(dihedrals)

offset=5
testu.dimensions=np.asarray([pd.DataFrame(testu.atoms.positions).min()-offset,pd.DataFrame(testu.atoms.positions).max()+offset]).flatten().tolist()

with mda.coordinates.LAMMPS.DATAWriter('testtest_7bead.data') as LW:
    LW.write(testu.atoms)

here's the output:

[array([0.5, 0.5, 0.5]), array([8.5, 0.5, 0.5]), array([16.5,  0.5,  0.5]), array([24.5,  0.5,  0.5]), array([32.5,  0.5,  0.5]), array([40.5,  0.5,  0.5]), array([48.5,  0.5,  0.5]), array([56.5,  0.5,  0.5]), array([64.5,  0.5,  0.5]), array([72.5,  0.5,  0.5])]
[[ 0.5  0.5  0.5]
 [ 8.5  0.5  0.5]
 [16.5  0.5  0.5]
 [24.5  0.5  0.5]
 [32.5  0.5  0.5]
 [40.5  0.5  0.5]
 [48.5  0.5  0.5]
 [56.5  0.5  0.5]
 [64.5  0.5  0.5]
 [72.5  0.5  0.5]
 [ 0.5  8.5  0.5]
 [ 8.5  8.5  0.5]
 [16.5  8.5  0.5]
 [24.5  8.5  0.5]
 [32.5  8.5  0.5]
 [40.5  8.5  0.5]]
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)]
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, 15)]
Traceback (most recent call last):
  File "/code/python/test_model/reprex_mda.py", line 72, in <module>
    LW.write(testu.atoms)
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/groups.py", line 4599, in check_args
    return func(*args, **kwargs)
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/coordinates/LAMMPS.py", line 443, in write
    self._write_masses(atoms)
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/coordinates/LAMMPS.py", line 325, in _write_masses
    masses = set(atoms.universe.atoms.select_atoms(
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/groups.py", line 3228, in select_atoms
    selections[0].apply(self))
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/selection.py", line 241, in apply
    return self._apply(*args, **kwargs).asunique(sorted=self.parser.sorted)
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/selection.py", line 221, in _apply
    return func(self, group)
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/selection.py", line 614, in _apply
    if any(fnmatch.fnmatchcase(nm, val) for val in self.values):
  File "/code/python/test_model/venv/lib/python3.10/site-packages/MDAnalysis/core/selection.py", line 614, in <genexpr>
    if any(fnmatch.fnmatchcase(nm, val) for val in self.values):
  File "/usr/lib/python3.10/fnmatch.py", line 77, in fnmatchcase
    return match(name) is not None
TypeError: expected string or bytes-like object
jamesdalg commented 1 year ago

Is it possible I have to add a trajectory with some arbitrary velocities and if so, how do I do that manually? I haven't seen any documentation on how to go about it other than pull it in from external sources.

jamesdalg commented 1 year ago

It looks like I've done everything I can do at this point. If someone finds a way to fix this writer bug, that would be a huge help! There's a lot of research done with lammps and having the ability to export results to lammps without a crash would be amazing. Perhaps you could adapt my reprex (or something like it) into this page: https://userguide.mdanalysis.org/stable/formats/reference/data.html#data-format. Currently, there's no "from scratch" example on creating a universe that can be written to LAMMPS, from what's in the online documentation. Perhaps there's a working code snippet somewhere that might be able to do this. If so, feel free to point me in that direction.

orbeckst commented 1 year ago

@jamesdalg , we would welcome additions to the User Guide, especially for LAMMPS. The repo for the User Guide is https://github.com/MDAnalysis/UserGuide/ . You can start raising an issue there. And it would be fantastic if you were to contribute examples — start a PR.

orbeckst commented 1 year ago

Part of the problem is that you're assigning integers as types

testu.add_TopologyAttr('type', testdata[0])

but in MDAnalysis, types are strings so you should be doing

testu.add_TopologyAttr('type', [str(x) for x in testdata[0]])

(or something equivalent – it needs to come out as str, and pandas pd.Series.astype(str) did not work for me but np.ndarray.astype(str) did). Arguably, there should be a check somewhere when setting type that the correct type is used.

With this change, you are now able to do selections

>>> testu.select_atoms("type 1")
<AtomGroup with 2 atoms>

as needed by the writer.

Unfortunately, I now get a failure elsewhere

File ~/anaconda3/envs/mda3/lib/python3.8/site-packages/MDAnalysis/coordinates/LAMMPS.py:348, in DATAWriter._write_bonds(self, bonds)
    345 except TypeError:
    346     errmsg = (f"LAMMPS DATAWriter: Trying to write bond, but bond "
    347               f"type {bond.type} is not numerical.")
--> 348     raise TypeError(errmsg) from None

TypeError: LAMMPS DATAWriter: Trying to write bond, but bond type ('0', '0') is not numerical.
jamesdalg commented 1 year ago

I figured it out. It turns out that precise error message is key. Unlike how it is written in the tutorials, the way to add bonds is to use the add_bonds() function, with a type vector the length of the number of bonds (the same goes for angles, impropers, and dihedrals).

import MDAnalysis as mda
import pandas as pd
#create uni
testdata=pd.DataFrame([0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0])
import numpy as np
num_residues=len(list(set(testdata[0])))
testu = mda.Universe.empty(n_atoms=testdata.shape[0], residue_segindex=np.arange(0,num_residues,1),n_residues=num_residues, atom_resindex=testdata[0], trajectory=True)
num_atoms=testdata.shape[0]
testu.add_TopologyAttr('resid', np.arange(0,num_residues,1) )
#testu.add_TopologyAttr('type', list(testdata[0]+1))
testu.add_TopologyAttr('type', [str(x) for x in testdata[0]])
testu.add_TopologyAttr('mass',  [1]*num_atoms)
testu.add_TopologyAttr('charge',  [1]*num_atoms)
#positions
num_atoms=testdata.shape[0]
bead = np.array([[ 0.5,        0.5,       0.5      ]])  #choose the middle of the cube.
grid_size = 10
spacing = 8
#testu.atoms.dimensions=[]
coordinates = []

for i in range(num_atoms):
    x = spacing * (i % grid_size)
    y = spacing * ((i // grid_size) % grid_size)
    z = spacing * (i // (grid_size * grid_size))

    xyz = np.array([x, y, z])

    coordinates.extend(bead + xyz.T)

print(coordinates[:10])

coord_array = np.array(coordinates)
print(coord_array )
testu.atoms.positions = coord_array

#bonds
bonds = []
for o in range(0, testdata.shape[0]-1, 3):
    bonds.extend([(o, o+1), (o, o+2)])

testu.add_bonds(bonds,"1"*len(bonds))
testu.bonds
#angles
angles = []
for o in range(0, testdata.shape[0]-1, 3):
   angles += [(o, o+1, o+2)]

print(angles)
testu.add_angles(angles,"1"*len(angles))

impropers = []
for o in range(0, testdata.shape[0]-1, 4):
   impropers += [(o, o+1, o+2,o+3)]

print(impropers)
testu.add_impropers(impropers,"1"*len(impropers))

dihedrals = []
for o in range(0, testdata.shape[0]-1, 4):
   dihedrals += [(o, o+1, o+2,o+3)]

print(dihedrals)
testu.add_dihedrals(dihedrals,"1"*len(dihedrals))

offset=5
testu.dimensions=np.asarray([pd.DataFrame(testu.atoms.positions).min()-offset,pd.DataFrame(testu.atoms.positions).max()+offset]).flatten().tolist()
for ts in testu.trajectory:
    with mda.coordinates.LAMMPS.DATAWriter('testtest_7bead.data') as LW:
        LW.write(testu.atoms)
    break

This yields the following lammps input file, without error:

LAMMPS data file via MDAnalysis

          16  atoms
          10  bonds
           5  angles
           4  dihedrals
           4  impropers

           1  atom types
           1  bond types
           1  angle types
           1  dihedral types
           1  improper types

0.000000 0.000000 xlo xhi
0.000000 0.000000 ylo yhi
0.000000 0.000000 zlo zhi

Masses

1 1.000000

Atoms

1 0 0 1.000000 0.500000 0.500000 0.500000
2 0 0 1.000000 8.500000 0.500000 0.500000
3 0 0 1.000000 16.500000 0.500000 0.500000
4 0 0 1.000000 24.500000 0.500000 0.500000
5 0 0 1.000000 32.500000 0.500000 0.500000
6 0 0 1.000000 40.500000 0.500000 0.500000
7 0 0 1.000000 48.500000 0.500000 0.500000
8 0 1 1.000000 56.500000 0.500000 0.500000
9 0 1 1.000000 64.500000 0.500000 0.500000
10 0 0 1.000000 72.500000 0.500000 0.500000
11 0 0 1.000000 0.500000 8.500000 0.500000
12 0 0 1.000000 8.500000 8.500000 0.500000
13 0 0 1.000000 16.500000 8.500000 0.500000
14 0 0 1.000000 24.500000 8.500000 0.500000
15 0 0 1.000000 32.500000 8.500000 0.500000
16 0 0 1.000000 40.500000 8.500000 0.500000

Bonds

1 1 1 2
2 1 1 3
3 1 4 5
4 1 4 6
5 1 7 8
6 1 7 9
7 1 10 11
8 1 10 12
9 1 13 14
10 1 13 15

Angles

1 1 1 2 3
2 1 4 5 6
3 1 7 8 9
4 1 10 11 12
5 1 13 14 15

Dihedrals

1 1 1 2 3 4
2 1 5 6 7 8
3 1 9 10 11 12
4 1 13 14 15 16

Impropers

1 1 1 2 3 4
2 1 5 6 7 8
3 1 9 10 11 12
4 1 13 14 15 16
orbeckst commented 1 year ago

It sounds as if you could use your example as a basis for a very useful User Guide article!

jamesdalg commented 1 year ago

Yes. I'd like to. The more this becomes a useful tool and has a larger user base that improves it, the better it will be to help me in my research.

jamesdalg commented 1 year ago

One key issue-- even though the dimensions are set properly, the bounds are set to zero when written to disk.

testu.dimensions
array([-4.5, -4.5, -4.5, 77.5, 13.5,  5.5], dtype=float32)
0.000000 0.000000 xlo xhi
0.000000 0.000000 ylo yhi
0.000000 0.000000 zlo zhi
jamesdalg commented 1 year ago

This then leads to an error like this on runtime:

hwloc/linux: Ignoring PCI device with non-16bit domain.
Pass --enable-32bits-pci-domain to configure to support such devices
(warning: it would break the library ABI, don't enable unless really needed).
hwloc/linux: Ignoring PCI device with non-16bit domain.
Pass --enable-32bits-pci-domain to configure to support such devices
(warning: it would break the library ABI, don't enable unless really needed).
LAMMPS (3 Nov 2022)
  using 48 OpenMP thread(s) per MPI task
Reading data file ...
  orthogonal box = (0 0 0) to (0 0 0)
ERROR on proc 0: Box bounds are invalid or missing (src/domain.cpp:210)
Last command: read_data testtest_7bead.data
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

I can manually set these for now, but I imagine it won't work if you are trying to generate a large number of scripts programmatically.

jamesdalg commented 1 year ago

I figured it out. You also have to add the angles precisely as 90,90,90 or it doesn't work.

testu.dimensions=list([(pd.DataFrame(testu.atoms.positions).max()+offset)*2][0])+[90,90,90]
jamesdalg commented 1 year ago

Also, you really should modify the box afterwards to change the 0 to extend the box in the negative direction or your atoms will hit the periodic boundary.