Open bbye98 opened 9 months ago
Thanks for raising this issue @bbye98, unfortunately we've not had time to look into this yet.
Is there any way you could share a small version of a non-working netcdf file? I personally don't readily have access to (or the knowledge to use) LAMMPS and wouldn't be able to reproduce the file easily.
@IAlibay Apologies for the late response! I've attached a sample topology and trajectory file below (from a slab LJ-Coulomb system):
Some information on package versions:
❯ lmp
LAMMPS (21 Nov 2023)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
using 1 OpenMP thread(s) per MPI task
❯ python -c "import MDAnalysis as mda; print(mda.__version__)"
2.7.0
❯ python -c "import scipy; print(scipy.__version__)"
1.12.0
Reading the LAMMPS trajectory file using MDAnalysis.Universe
:
>>> u = mda.Universe("example.nc")
Traceback (most recent call last):
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/topology/MinimalParser.py", line 64, in parse
n_atoms = kwargs['n_atoms']
KeyError: 'n_atoms'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 110, in _topology_from_file_like
topology = p.parse(**kwargs)
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/topology/MinimalParser.py", line 67, in parse
n_atoms = reader.parse_n_atoms(self.filename, **kwargs)
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/coordinates/TRJ.py", line 621, in parse_n_atoms
with scipy.io.netcdf_file(filename, mmap=None) as f:
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 279, in __init__
self._read()
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 610, in _read
self._read_dim_array()
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 620, in _read_dim_array
raise ValueError("Unexpected header.")
ValueError: Unexpected header.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 356, in __init__
topology = _topology_from_file_like(self.filename,
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 125, in _topology_from_file_like
raise ValueError(
ValueError: Failed to construct topology from file example.nc with parser <class 'MDAnalysis.topology.MinimalParser.MinimalParser'>.
Error: Unexpected header.
Reading the LAMMPS trajectory file using scipy.io.netcdf_file
:
>>> u = mda.Universe("example.nc")
Traceback (most recent call last):
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/topology/MinimalParser.py", line 64, in parse
n_atoms = kwargs['n_atoms']
KeyError: 'n_atoms'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 110, in _topology_from_file_like
topology = p.parse(**kwargs)
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/topology/MinimalParser.py", line 67, in parse
n_atoms = reader.parse_n_atoms(self.filename, **kwargs)
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/coordinates/TRJ.py", line 621, in parse_n_atoms
with scipy.io.netcdf_file(filename, mmap=None) as f:
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 279, in __init__
self._read()
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 610, in _read
self._read_dim_array()
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 620, in _read_dim_array
raise ValueError("Unexpected header.")
ValueError: Unexpected header.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 356, in __init__
topology = _topology_from_file_like(self.filename,
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py", line 125, in _topology_from_file_like
raise ValueError(
ValueError: Failed to construct topology from file example.nc with parser <class 'MDAnalysis.topology.MinimalParser.MinimalParser'>.
Error: Unexpected header.
>>> from scipy.io import netcdf_file
>>> nc = netcdf_file("example.nc", "r")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 279, in __init__
self._read()
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 610, in _read
self._read_dim_array()
File "/home/bye/miniconda3/envs/research/lib/python3.9/site-packages/scipy/io/_netcdf.py", line 620, in _read_dim_array
raise ValueError("Unexpected header.")
ValueError: Unexpected header.
Reading the LAMMPS trajectory file using netCDF4.Dataset
:
>>> import netCDF4 as nc
>>> nc_ = nc.Dataset("example.nc")
>>> nc_
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_64BIT_DATA data model, file format NETCDF3):
Conventions: AMBER
ConventionVersion: 1.0
program: LAMMPS
programVersion: 21 Nov 2023
dimensions(sizes): frame(11), atom(1576), cell_spatial(3), cell_angular(3), label(10), spatial(3)
variables(dimensions): |S1 spatial(spatial), |S1 cell_spatial(spatial), |S1 cell_angular(spatial, label), float32 time(frame), float32 cell_origin(frame, cell_spatial), float32 cell_lengths(frame, cell_spatial), float32 cell_angles(frame, cell_angular), int32 id(frame, atom), int32 type(frame, atom), float32 q(frame, atom), float32 coordinates(frame, atom, spatial), int32 ix(frame, atom), int32 iy(frame, atom), int32 iz(frame, atom), float32 velocities(frame, atom, spatial), float32 forces(frame, atom, spatial)
groups:
For what it's worth, I recently dropped support for scipy.io.netcdf_file
altogether in favor of netCDF4.Dataset
in my workflow because the former had bizarre behavior when used as an OpenMM trajectory writer, such as randomly writing the first frame of the simulation three times--once at the beginning and twice at the end of the trajectory file--despite correctly flushing/syncing and closing the NetCDF file at the end of the simulation. The SciPy implementation also doesn't report the length of an "unlimited" dimension (like time), unlike netCDF4.
This file is a bit strange: it starts with CDF5
instead of the usual and expected CDF2
. Reading around, CDF2
seems to indicate 64-bit offsets (which are required by the format: https://ambermd.org/netcdf/nctraj.xhtml), and CDF5
seems to indicate 64-bit data and offsets (which would also moves the header around, explaining the error from scipy).
The code for this seems to have been around for a while in the netcdf-c library, not sure what caused this change in the output of LAMMPS. Maybe something about the specific version of netcdf LAMMPS is using?
@bbye98 did you find out if anything changed in the LAMMPS netcdf output?
Hi! Apologies for the late response. I'm not sure which version of NetCDF LAMMPS is using, but I can try installing an older version of LAMMPS (say, from 2021) to test.
Expected behavior
I have run a LAMMPS simulation of a simple electrolyte–solvent systems confined between two planar dielectric electrodes. During the simulation, the topology is written to LAMMPS dump file
cg_lj_coul_reduced_lammps.topology
and the trajectory is written to a NetCDF filecg_lj_coul_reduced_lammps.nc
.I am attempting to load both files to a
MDAnalysis.Universe
object usingPython 3 code
import os import MDAnalysis as mda os.chdir("/mnt/e/research/test/poisson_potential/data") fname = "cg_lj_coul_reduced_lammps" universe = mda.Universe(f"{fname}.topology", f"{fname}.nc", topology_format="DATA", atom_style="id type q x y z vx vy vz")Actual behavior
Instead, I get the following
ValueError
stemming from a header check inscipy.io.netcdf_file
:Python 3 stack trace
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[35], line 7 5 os.chdir("/mnt/e/research/test/poisson_potential/data") 6 fname = "cg_lj_coul_reduced_lammps" ----> 7 universe = mda.Universe(f"{fname}.topology", f"{fname}.nc", topology_format="DATA", 8 atom_style="id type q x y z vx vy vz") File ~/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py:375, in Universe.__init__(self, topology, all_coordinates, format, topology_format, transformations, guess_bonds, vdwradii, fudge_factor, lower_bound, in_memory, in_memory_step, *coordinates, **kwargs) 370 coordinates = _resolve_coordinates(self.filename, *coordinates, 371 format=format, 372 all_coordinates=all_coordinates) 374 if coordinates: --> 375 self.load_new(coordinates, format=format, in_memory=in_memory, 376 in_memory_step=in_memory_step, **kwargs) 378 if transformations: 379 if callable(transformations): File ~/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/core/universe.py:580, in Universe.load_new(self, filename, format, in_memory, in_memory_step, **kwargs) 577 # supply number of atoms for readers that cannot do it for themselves 578 kwargs['n_atoms'] = self.atoms.n_atoms --> 580 self.trajectory = reader(filename, format=format, **kwargs) 581 if self.trajectory.n_atoms != len(self.atoms): 582 raise ValueError("The topology and {form} trajectory files don't" 583 " have the same number of atoms!\n" 584 "Topology number of atoms {top_n_atoms}\n" (...) 588 fname=filename, 589 trj_n_atoms=self.trajectory.n_atoms)) File ~/miniconda3/envs/research/lib/python3.9/site-packages/MDAnalysis/lib/util.py:2553, in store_init_arguments.I get the same error when I try loading the trajectory file directly using
scipy.io.netcdf_file
, but not usingnetCDF4.Dataset
:Python 3 code
from netCDF4 import Dataset from scipy.io import netcdf_file print(Dataset(f"{fname}.nc", "r")) print(netcdf_file(f"{fname}.nc", "r"))Python 3 output + stack trace
Code to reproduce the behavior
This is the LAMMPS script used to generate the topology and trajectory files:
LAMMPS script
# ==================================================================== # # PARAMETERS AND SETTINGS # # ==================================================================== # shell cd /mnt/e/research/test/poisson_potential/data log cg_lj_coul_reduced_lammps.log variable N equal 1000 variable T_reduced equal 1 variable mass_reduced equal 1 variable epsilon_reduced equal 1 variable sigma_reduced equal 1 variable x equal 0.5 variable N_ion equal v_x*v_N/2 variable q_reduced equal 13.625994230719545 variable varepsilon_r equal 78 variable L_x_reduced equal 8 variable L_y_reduced equal 7.794228634059947 variable L_z_reduced equal 20.046884346862004 variable sigma_ion_wall_reduced equal v_sigma_reduced/2 variable q_wall_reduced equal -0.19030322569642577 variable unit_cell_x_reduced equal v_sigma_ion_wall_reduced/2 variable unit_cell_y_reduced equal v_sigma_ion_wall_reduced*sqrt(3)/2 variable unit_cell_z_reduced equal v_sigma_ion_wall_reduced*sqrt(6)/3 variable every equal 1000 variable timesteps equal 1100000 variable rng1 equal 12798 variable rng2 equal 9990 variable rng3 equal 22263 variable rng4 equal 101763 # ==================================================================== # # SYSTEM AND PARTICLES # # ==================================================================== # boundary p p f atom_style charge region box block 0 ${L_x_reduced} 0 ${L_y_reduced} 0 ${L_z_reduced} units box create_box 5 box mass * ${mass_reduced} region electrolyte block 0 ${L_x_reduced} 0 ${L_y_reduced} ${sigma_ion_wall_reduced} $(v_L_z_reduced-v_sigma_ion_wall_reduced) units box create_atoms 1 random ${N_ion} ${rng1} electrolyte create_atoms 2 random ${N_ion} ${rng2} electrolyte create_atoms 3 random $(v_N-2*v_N_ion) ${rng3} electrolyte set type 1 charge ${q_reduced} set type 2 charge -${q_reduced} set type 3 charge 0 group electrolyte type 1 2 3 lattice hcp $((v_unit_cell_z_reduced/0.916486)^-3) region left_wall block 0 ${L_x_reduced} 0 ${L_y_reduced} 0 ${unit_cell_z_reduced} units box create_atoms 4 region left_wall lattice hcp $((v_unit_cell_z_reduced/0.916486)^-3) origin 0 0 0.552307414567 region right_wall block 0 ${L_x_reduced} 0 ${L_y_reduced} $(v_L_z_reduced-v_unit_cell_z_reduced) ${L_z_reduced} units box create_atoms 5 region right_wall set type 4 charge 0 set type 5 charge 0 group electrodes type 4 5 # ==================================================================== # # FORCE FIELD # # ==================================================================== # dielectric ${varepsilon_r} kspace_style pppm 1e-4 kspace_modify slab 3 pair_style lj/cut/coul/long $(2^(1/6)*v_sigma_reduced) pair_coeff *3 *3 ${epsilon_reduced} ${sigma_reduced} pair_coeff 4* 4* 0 0 pair_coeff *3 4* $(100*v_epsilon_reduced) ${sigma_ion_wall_reduced} $(2^(1/6)*v_sigma_ion_wall_reduced) pair_modify shift yes velocity electrolyte create ${T_reduced} ${rng4} velocity electrodes set 0 0 0 fix nvt electrolyte nvt temp ${T_reduced} ${T_reduced} $(100*dt) fix electrodes electrodes setforce 0 0 0 # ==================================================================== # # ENERGY MINIMIZATION # # ==================================================================== # minimize 1e-6 1e-6 1000 1000 reset_timestep 0 set type 4 charge ${q_wall_reduced} set type 5 charge $(-v_q_wall_reduced) # ==================================================================== # # SIMULATION # # ==================================================================== # write_data cg_lj_coul_reduced_lammps.topology dump dump all netcdf ${every} cg_lj_coul_reduced_lammps.nc id type q x y z ix iy iz vx vy vz fx fy fz thermo ${every} thermo_style custom step spcpu cpuremain temp press ke evdwl ecoul elong run ${timesteps}I do not believe this error is related to the NetCDF format, as the
NetCDFFile
/NetCDFReporter
I wrote for OpenMM also uses the same format (NETCDF3_64BIT_DATA
), but is recognized correctly by bothnetCDF4.Dataset
andscipy.io.netcdf_file
, and thus MDAnalysis.Current version of MDAnalysis