mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org
Other
171 stars 79 forks source link

Floating point error converting .gro to .xyz #1182

Closed dyukovsm closed 4 months ago

dyukovsm commented 4 months ago

Bug summary

When loading .gro file using mb_traj_file = mb.load(trajectory_file), and saving it as mb_traj_file.save(xyz_file, overwrite=True), the data is oftentimes saved with an error of 1/10th of the last specified digit, due to the higher precision of an xyz file. For example, 4.508 in y coordinate in gromacs can be saved by mbuild as 45.079999, while simply copying, converting nm->A and pasting it using core python and pandas library saves y coordinate correctly as 45.080000. Although this error won't effect using mbuild for building simulations, but using mbuild to help post process data can introduce propagation errors.

Code to reproduce the behavior

import mbuild as mb import pandas as pd

def gro_to_xyz(gro_filename, xyz_filename):
    # Read the .gro file, skipping the header and footer
    with open(gro_filename, 'r') as file:
        lines = file.readlines()[2:-1]  # Skip first 2 lines and the last line

    # Process the lines to extract needed information
    data = []
    for line in lines:
        # Extract atom type and coordinates; assume atom type is at indices 10:15 and coordinates at 20:28, 28:36, 36:44
        print(line)
        atom = line[10:15].strip()
        x = float(line[20:28].strip()) * 10  # Convert from nm to Ångströms
        y = float(line[28:36].strip()) * 10
        z = float(line[36:44].strip()) * 10
        data.append([atom, x, y, z])

    # Convert to DataFrame for easier handling
    df = pd.DataFrame(data, columns=['Atom', 'X', 'Y', 'Z'])

    # Write to XYZ format
    with open(xyz_filename, 'w') as xyz_file:
        # Write the number of atoms
        xyz_file.write(f"{len(df)}\n")
        # Write the comment line
        xyz_file.write("written manually using simple python\n")
        # Write the atom lines
        for index, row in df.iterrows():
            xyz_file.write(f"{row['Atom']:2}{row['X']:12.6f}{row['Y']:12.6f}{row['Z']:12.6f}\n")

def main():

    trajectory_file = 'round_err_test.gro'; xyz_file = 'mbuild_converted_output.xyz'

    mb_traj_file = mb.load(trajectory_file)
    mb_traj_file.save(xyz_file, overwrite=True)

    manual_xyz_file = 'manual_converted_output.xyz'

    gro_to_xyz(trajectory_file, manual_xyz_file)

if __name__ == '__main__':
    main()

Please include a code snippet that can be used to reproduce this bug.

trajectory_file = 'round_err_test.gro'; xyz_file = 'mbuild_converted_output.xyz'

mb_traj_file = mb.load(trajectory_file) mb_traj_file.save(xyz_file, overwrite=True)

example_and_code.zip

Software versions

mbuild 0.16.3 pyhd8ed1ab_0 conda-forge python 3.10.12 hd12c33a_0_cpython conda-forge

dyukovsm commented 4 months ago

mb_traj_file = mb.load(trajectory_file, backend='gmso') completely fixes the issue, now both manually written and mbuild produced files don't have floating point errors.