Closed espresso-ci closed 5 years ago
Looks like an actual problem. Broken bonds while running the cellsystem sample.
I've seen that one before, so it's a recurring problem. I ran it another 2000 times today on my desktop machine without a single failure. However, in both cuda:9.0 and cuda:tutorial I was able to reproduce the failure after ~1000 runs. To reproduce the error on a desktop machine, add system.integrator.run(1000000) at the end of the sample; it should take a few seconds (or more) of runtime before a FENE bond breaks.
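For context on what a breaking FENE bond means numerically, here is a minimal sketch of the FENE bond energy. It assumes the form documented for ESPResSo's FeneBond, V(r) = -1/2 k d_r_max^2 ln(1 - ((r - r_0) / d_r_max)^2). The function name fene_energy is hypothetical, and the defaults k=10 and d_r_max=1.5 are the values from the MWE posted later in this thread; the cellsystem sample may use different ones.

```python
import math


def fene_energy(r, k=10.0, d_r_max=1.5, r_0=0.0):
    """FENE bond energy; returns inf once the bond is stretched to d_r_max."""
    x = (r - r_0) / d_r_max
    if abs(x) >= 1.0:
        # The logarithm diverges here: the bond counts as broken.
        return math.inf
    return -0.5 * k * d_r_max**2 * math.log(1.0 - x**2)


# Energy grows without bound as the bond length approaches d_r_max.
for r in (0.9, 1.2, 1.4, 1.49, 1.5):
    print(r, fene_energy(r))
```

The energy rises steeply just below d_r_max, so a single large displacement in one integration step can be enough to push a bond past the limit.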
time | kinetic energy | FENE energy | WCA energy | min bond length | min dist | max velocity |
---|---|---|---|---|---|---|
420.71 | 161.64 | 723.37 | 85.04 | 0.907 | 0.340 | 3.84 |
420.72 | 163.95 | 724.09 | 80.97 | 0.924 | 0.364 | 3.81 |
420.73 | 162.01 | 726.13 | 77.20 | 0.937 | 0.394 | 3.51 |
420.74 | 2.72e+9 | 729.28 | 755327.28 | 0.945 | 0.398 | 36984.66 |
420.75 | crash | ? | ? | ? | ? | ? |
It looks like the WCA potential increases suddenly, even though the minimal distance between all particle pairs does not decrease before the crash.
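To relate the min dist column to the WCA column, a per-pair energy can be computed by hand. The sketch below assumes epsilon = sigma = 1 with a cutoff of 2**(1/6) and an automatic shift, i.e. the parameters of the MWE posted later in this thread; the cellsystem sample may use other values, and the function name wca_energy is hypothetical.

```python
import numpy as np


def wca_energy(r, epsilon=1.0, sigma=1.0):
    """WCA pair energy: Lennard-Jones shifted up by epsilon, cut at 2**(1/6)*sigma."""
    r = np.asarray(r, dtype=float)
    cutoff = 2.0 ** (1.0 / 6.0) * sigma
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6**2 - sr6) + epsilon
    # Beyond the cutoff the potential is zero by construction.
    return np.where(r < cutoff, energy, 0.0)


# Minimum pair distances taken from the table above.
for r in (0.340, 0.364, 0.394, 0.398):
    print(r, float(wca_energy(r)))
```

Comparing these per-pair values with the WCA column is one way to check whether a reported minimum distance is consistent with the reported energy under the assumed parameters.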
Did you record the seed of the run that crashed?
No, but I observed the same trend in multiple independent runs.
Actually, the time at which the FENE bond breaks is not reproducible, even when setting the numpy+system+thermostat+polymer seeds. Could it be due to skin tuning or my use of the visualizer?
The visualizer should not influence the system. Could you please try without the tuning and, if it crashes, post the seeds or a deterministic script?
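One way to test whether a script is deterministic is to record the particle positions of two runs with identical seeds and look for the first frame where they differ. The following is only a sketch on synthetic data: first_divergence is a hypothetical helper, and the arrays stand in for positions that would be saved from two actual runs.

```python
import numpy as np


def first_divergence(traj_a, traj_b, atol=0.0):
    """Return the index of the first frame where two trajectories differ, or None."""
    traj_a = np.asarray(traj_a, dtype=float)
    traj_b = np.asarray(traj_b, dtype=float)
    n_frames = min(len(traj_a), len(traj_b))
    for frame in range(n_frames):
        # With rtol=0 and atol=0 this is an exact comparison.
        if not np.allclose(traj_a[frame], traj_b[frame], rtol=0.0, atol=atol):
            return frame
    return None


# Synthetic data standing in for positions recorded from two runs.
rng = np.random.default_rng(0)
run_a = rng.random((5, 3, 3))  # 5 frames, 3 particles, xyz
run_b = run_a.copy()
run_b[3, 1, 0] += 1e-3  # inject a difference in frame 3
print(first_divergence(run_a, run_b))  # prints 3
```

With atol=0 the comparison is bitwise strict, which is what a deterministic script should satisfy; a small positive atol can be used to ignore rounding-level differences.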
Update: skin tuning is not deterministic, and without skin tuning the FENE bonds don't break. MWE:
from __future__ import print_function
import numpy as np
import espressomd
espressomd.assert_features(["LENNARD_JONES"])
from espressomd import polymer
from espressomd import interactions
from scipy.spatial.distance import pdist

box_l = 100
system = espressomd.System(box_l=3 * [box_l])
system.set_random_state_PRNG()
system.seed = system.cell_system.get_state()['n_nodes'] * [1234]
np.random.seed(41)
cs = system.cell_system
cs.skin = .48 * box_l
system.thermostat.set_langevin(kT=1.0, gamma=1.0, seed=42)
system.time_step = 0.01

# WCA and FENE
system.non_bonded_inter[0, 0].lennard_jones.set_params(epsilon=1, sigma=1,
                                                       cutoff=2**(1. / 6), shift="auto")
fene = interactions.FeneBond(k=10, d_r_max=1.5)
system.bonded_inter.add(fene)

# polymer
positions = polymer.positions(n_polymers=1, beads_per_chain=100, seed=1234,
                              bond_length=0.97, min_distance=0.969)
for i, pos in enumerate(positions[0]):
    system.part.add(id=i, pos=pos)
    if i > 0:
        system.part[i].add_bond((fene, i - 1))

cs.set_n_square(True)
#system.integrator.run(10000000);exit(0) # uncomment this line to skip tuning
skin = cs.tune_skin(min_skin=0.5, max_skin=50., tol=0.5, int_steps=100)
print('skin =', skin)

system.time = 0
history = 20 * [None]
for i in range(10000000):
    min_bond_length = np.min(np.linalg.norm(system.part[1:].pos - system.part[:-1].pos, axis=1))
    min_dist = np.min(pdist(system.part[:].pos))
    max_vel = np.max(np.linalg.norm(system.part[:].v, axis=1))
    del history[0]
    history.append((system.time,
                    system.analysis.energy()['kinetic'],
                    system.analysis.energy()['bonded'],
                    system.analysis.energy()['non_bonded'],
                    min_bond_length, min_dist, max_vel))
    if max_vel > 20:
        print(i)
        break
    system.integrator.run(1)

for line in history:
    print(*line)
print("Bonds are about to break")
system.integrator.run(1)
It'll take a few minutes to crash because the numpy operations slow down each step, but you can make it happen much faster by setting the box size to 40 and max_skin to 20.
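As a side note on the MWE, its 20-entry history is maintained with del history[0] followed by append. A sketch of the same rolling window using collections.deque from the standard library is shown below; the recorded tuple is a placeholder, not the observables from the MWE.

```python
from collections import deque

history = deque(maxlen=20)  # keeps only the 20 most recent entries

for step in range(100):
    # Placeholder entry; the MWE records time, energies, distances and velocities.
    history.append((step, 2 * step))

print(len(history))
print(history[0], history[-1])
```

Once the deque is full, each append drops the oldest entry automatically, so no explicit deletion is needed and the window never has to be pre-filled with None.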
The skin should not affect the result, but the tuning integrates for a nondeterministic amount of time.
I ran 10 simulations of 30 min with a random numpy seed and a random skin (no tuning) between 0.25 and 0.48 box_l without any bond breaking. As soon as skin tuning is used the bonds will break, unless tuning is followed by a reset of the particle positions:
cs.set_n_square(True)
skin = cs.tune_skin(min_skin=0.5, max_skin=0.48 * box_l, tol=0.5, int_steps=100)
for i, pos in enumerate(positions[0]):
    system.part[i].pos = pos
system.integrator.run(20000000) # 30 min
exit(0)
I said it should not have an influence :-)
That's why I tested it :-) I was really surprised that resetting the particle positions would solve it. The tuning.cpp code does not seem to have side effects other than broadcasting FIELD_SKIN; I'll look into the domain decomposition code to see where it goes.
The test uses n_square iirc
https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/pipelines/7187