libAtoms / QUIP

libAtoms/QUIP molecular dynamics framework: https://libatoms.github.io
rounding in mpi4py bcast of quippy atoms object cell #91

Open noambernstein opened 7 years ago

noambernstein commented 7 years ago

If you create a quippy Atoms object from an ASE Atoms object and then broadcast it with mpi4py's comm.bcast (of the Python object), the cell dimensions are rounded off, as though they were being treated as single-precision floats. Below is the output of the attached script (run with mpirun -np 2 bcast_bug.py). The initial quippy object is identical to the ASE one, and broadcasting the ASE object leaves it unchanged, but broadcasting the quippy object changes 37.797631496846186 to 37.797631500000001. A low-level Bcast on the raw double array of a copy of the cell also preserves the precision.

0 initial ASE volume, cell[0,0] 53999.999999999992724 37.797631496846186
0 initial quippy volume, cell[0,0] 53999.999999999992724 37.797631496846186
0 bcast ASE Atoms volume, cell[0,0] 53999.999999999992724 37.797631496846186
0 bcast quippy Atoms volume, cell[0,0] 54000.000013517215848 37.797631500000001
0 original cell_copy[0,0] 37.797631496846186
0 bcast cell_copy[0,0] 37.797631496846186
1 bcast ASE Atoms volume, cell[0,0] 53999.999999999992724 37.797631496846186
1 bcast quippy Atoms volume, cell[0,0] 54000.000013517215848 37.797631500000001
1 original cell_copy[0,0] 0.000000000000000
1 bcast cell_copy[0,0] 37.797631496846186

bcast_bug.tar.gz (https://github.com/libAtoms/QUIP/files/1475246/bcast_bug.tar.gz)

gabor1 commented 7 years ago

WTF

-- Gábor

noambernstein commented 7 years ago

On Nov 15, 2017, at 10:27 AM, gabor1 notifications@github.com wrote:

WTF

Note that I’ve demonstrated to myself that this happens to any quippy Atoms, not just one that was from a converted ASE Atoms. Probably will require digging into mpi4py.

Noam

jameskermode commented 7 years ago

Could it be that mpi4py uses repr(at) to get a string representation of the Atoms to broadcast?

noambernstein commented 7 years ago

Just noticed that pip now seems to install mpi4py 3.0; it used to be 2.0. Let me see if the bug persists.

jameskermode commented 7 years ago

No idea how mpi4py figures out how to bcast a complex structure like an ASE Atoms instance. For the quippy Atoms, I haven't looked at your code yet, but if you're not already doing so, it would be better to call our Fortran MPI implementation, atoms_bcast() in Atoms.f95.

noambernstein commented 7 years ago

This is for the nested sampling code, so I'd rather not specialize. Frankly, I've worked around the underlying issue (the problems caused by roundoff due to this bug could also have been caused by roundoff due to other operations, so I just fixed that), so this isn't that urgent.

jameskermode commented 7 years ago

Ok. I think it's another nail in the coffin of the existence of a separate quippy Atoms class.

noambernstein commented 7 years ago

The mpi4py documentation suggests it uses pickle. Is that plausible behavior?
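Since the lowercase comm.bcast is documented to serialize Python objects with pickle, the effect can be probed without MPI by a plain pickle round trip. The sketch below is only an approximation of what the broadcast does to its payload; the helper name `pickle_roundtrip` is hypothetical, and a nested list stands in for the cell. It suggests that pickle by itself carries Python floats through unchanged, so any rounding would have to come from how a particular class serializes itself.

```python
import pickle


def pickle_roundtrip(obj):
    """Serialize and deserialize obj (hypothetical helper, roughly what an
    object-level broadcast does to its payload)."""
    return pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


# Value taken from the report; the 3x3 nested list is an assumed stand-in
# for the cell matrix.
cell00 = 37.797631496846186
cell = [[cell00, 0.0, 0.0], [0.0, cell00, 0.0], [0.0, 0.0, cell00]]

restored = pickle_roundtrip(cell)

# Plain floats survive the round trip bit for bit.
print(restored[0][0] == cell00)
print("%.15f" % restored[0][0])
```

Running the same round trip on a quippy Atoms object instead of the list would be the analogous local check.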

jameskermode commented 7 years ago

That makes sense. I haven't defined a pickle handler for quippy Atoms, so it will probably be using some default, which might go via strings and lose precision.
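To illustrate how a serialization that goes via strings could lose precision, here is a minimal sketch. The function names and the 8-decimal format are assumptions for illustration, not the actual quippy code path. It compares a fixed-width text dump with a repr()-based one, which round-trips a Python float exactly.

```python
def cell_to_text_lossy(cell):
    """Format each component with 8 decimals (assumed format, for illustration)."""
    return " ".join("%16.8f" % x for row in cell for x in row)


def cell_to_text_exact(cell):
    """Format each component with repr(), which round-trips a float exactly."""
    return " ".join(repr(x) for row in cell for x in row)


def cell_from_text(text):
    """Parse nine whitespace-separated numbers back into a 3x3 nested list."""
    values = [float(tok) for tok in text.split()]
    return [values[0:3], values[3:6], values[6:9]]


a = 37.797631496846186  # value from the report
cell = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]

lossy = cell_from_text(cell_to_text_lossy(cell))
exact = cell_from_text(cell_to_text_exact(cell))

print(lossy[0][0] == a)          # False: digits beyond the 8th decimal are gone
print(exact[0][0] == a)          # True
print("%.3e" % abs(lossy[0][0] - a))
```

A dedicated pickle handler that stores the raw floating-point arrays, rather than a text dump, would avoid the text stage altogether.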

tdaff commented 7 years ago

In [22]: pickle.dumps(at_q)
Out[22]: 'cquippy.atoms\nAtoms\np0\n(tRp1\nS\'1\\ncutoff=-1.00000000 nneightol=1.20000000 pbc="T T T" Lattice="37.79763150       0.00000000       0.00000000       0.00000000      37.79763150       0.00000000       0.00000000       0.00000000      37.79763150" Properties=species:S:1:pos:R:3:Z:I:1\\nSi              0.00000000      0.00000000      0.00000000      14\\n\'\np2\nb.'

Might be losing some other things besides precision?
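As a quick consistency check, the Lattice field in the pickled string can be parsed back and compared with the numbers in the original report. The sketch below copies only the Lattice excerpt from the dump above; the helper `det3` and the variable names are illustrative. The resulting cell[0,0] and volume should agree with the "bcast quippy Atoms" lines of the log to within floating-point noise.

```python
import re

# Excerpt of the pickled payload quoted above (only the Lattice field is kept).
payload = (
    'Lattice="37.79763150       0.00000000       0.00000000'
    '       0.00000000      37.79763150       0.00000000'
    '       0.00000000       0.00000000      37.79763150"'
)

match = re.search(r'Lattice="([^"]*)"', payload)
values = [float(tok) for tok in match.group(1).split()]
rows = [values[0:3], values[3:6], values[6:9]]


def det3(m):
    """Determinant of a 3x3 matrix, i.e. the cell volume up to sign."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


original = 37.797631496846186  # cell[0,0] before the broadcast, from the report

print("cell[0,0] from text: %.15f" % rows[0][0])
print("volume from text:    %.15f" % det3(rows))
print("shift in cell[0,0]:  %.3e" % (rows[0][0] - original))
```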