nbia-astro / skeletor

Parallel PIC code written in Python and based on the skeleton codes provided by PICKSC
GNU General Public License v3.0

py.test problem #91

Closed tberlok closed 7 years ago

tberlok commented 7 years ago

We have been trying to solve this problem all day, and I have managed to reproduce it on DUNE. From what I can read online, py.test first collects (imports) all test modules and then runs them in a single process, so it is very likely that the different tests are interacting. It could also be a memory issue. There is a plugin called pytest-xdist, which runs tests in separate worker processes and might isolate the tests from each other.
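One way to check whether the tests interact, without installing a plugin, would be to run each test file in its own process. This is only a sketch: `run_isolated` is a hypothetical helper that is not part of skeletor, and the `tests` directory and the `mpiexec -n 4 py.test` command in the usage comment are assumptions.

```python
import subprocess
from pathlib import Path


def run_isolated(test_dir, base_cmd):
    """Run every test_*.py file in test_dir in a fresh process.

    base_cmd is the command prefix (a list of strings); the path of the
    test file is appended to it. Returns {file name: exit code}.
    """
    results = {}
    for path in sorted(Path(test_dir).glob("test_*.py")):
        # Each file gets its own interpreter, so no state is shared
        # between test files.
        proc = subprocess.run(list(base_cmd) + [str(path)])
        results[path.name] = proc.returncode
    return results


# Assumed usage (command and directory are not confirmed by the thread):
# codes = run_isolated("tests", ["mpiexec", "-n", "4", "py.test"])
# failed = [name for name, code in codes.items() if code != 0]
```

If a file fails when run alone like this, interaction between test files is unlikely to be the cause.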

tobson commented 7 years ago

How did Docker for Mac work out?

tberlok commented 7 years ago

I managed to install the Travis environment in a Docker container, but I got errors when I tried to run the .travis.yml script. I have managed to reproduce the error on Ubuntu though.

tobson commented 7 years ago

> I have managed to reproduce the error on Ubuntu though.

Does this mean you can run

mpiexec -n 4 py.test tests/

in the shell and it fails the same way as on Travis?

tberlok commented 7 years ago

Yes, it fails by stalling at

collected 16 items
collected 16 items
collected 16 items

test_EcrossBdrift_along_x.py 
test_EcrossBdrift_along_x.py 
test_EcrossBdrift_along_x.py collected 16 items

test_EcrossBdrift_along_x.py ..
test_EcrossBdrift_along_y.py ..
test_EcrossBdrift_along_y.py 
test_EcrossBdrift_along_y.py 
test_EcrossBdrift_along_y.py ....
test_burgers.py 
test_burgers.py 
test_burgers.py 
test_burgers.py ....
test_copy_guards_with_shear.py 
test_copy_guards_with_shear.py 
test_copy_guards_with_shear.py 
test_copy_guards_with_shear.py ....
test_deposit.py 
test_deposit.py 
test_deposit.py 
test_deposit.py ........
test_extended_grid.py 
test_extended_grid.py 
test_extended_grid.py 
test_extended_grid.py ....
test_gyromotion.py 
test_gyromotion.py 
test_gyromotion.py 
test_gyromotion.py ....
test_ionacoustic.py 
test_ionacoustic.py 
test_ionacoustic.py 
test_ionacoustic.py F
test_plasmafrequency.py F
test_plasmafrequency.py 

tberlok commented 7 years ago

And sometimes it works

test_EcrossBdrift_along_x.py collected 16 items

test_EcrossBdrift_along_x.py collected 16 items

test_EcrossBdrift_along_x.py collected 16 items

test_EcrossBdrift_along_x.py ....
test_EcrossBdrift_along_y.py 
test_EcrossBdrift_along_y.py 
test_EcrossBdrift_along_y.py 
test_EcrossBdrift_along_y.py ....
test_burgers.py 
test_burgers.py 
test_burgers.py 
test_burgers.py ....
test_copy_guards_with_shear.py 
test_copy_guards_with_shear.py 
test_copy_guards_with_shear.py 
test_copy_guards_with_shear.py ....
test_deposit.py 
test_deposit.py 
test_deposit.py 
test_deposit.py ........
test_extended_grid.py 
test_extended_grid.py 
test_extended_grid.py 
test_extended_grid.py ....
test_gyromotion.py 
test_gyromotion.py 
test_gyromotion.py 
test_gyromotion.py ....
test_ionacoustic.py 
test_ionacoustic.py 
test_ionacoustic.py 
test_ionacoustic.py ....

test_plasmafrequency.py test_plasmafrequency.py 
test_plasmafrequency.py 
test_plasmafrequency.py ....
test_sheared_burgers.py 
test_sheared_burgers.py 
test_sheared_burgers.py 
test_sheared_burgers.py ....
test_sheared_disturbance.py 
test_sheared_disturbance.py 
test_sheared_disturbance.py 
test_sheared_disturbance.py ....
test_shearing_epicycle_standard_coordinates.py 
test_shearing_epicycle_standard_coordinates.py 
test_shearing_epicycle_standard_coordinates.py 
test_shearing_epicycle_standard_coordinates.py ....
test_skeletor.py 
test_skeletor.py 
test_skeletor.py 
test_skeletor.py ...
test_translate.py .
test_translate.py 
test_translate.py 
test_translate.py ...
test_twostream.py .
test_twostream.py 
test_twostream.py 
test_twostream.py ...

========================== 16 passed in 37.31 seconds ==========================

========================== 16 passed in 37.33 seconds ==========================

========================== 16 passed in 37.33 seconds ==========================
.

========================== 16 passed in 37.42 seconds ==========================

tobson commented 7 years ago

But most of the time it fails? Then let's go back in time and find out which commit broke the test suite.

tberlok commented 7 years ago

Right, I am now leaning towards this being my fault: I think I messed up calculate_ihole. I managed to get the following output on DUNE:

plot = False

    def test_ionacoustic(plot=False):

        # Quiet start
        quiet = True
        # Number of grid points in x- and y-direction
        nx, ny = 32, 32
        # Average number of particles per cell
        npc = 256
        # Particle charge and mass
        charge = 0.5
        mass = 1.0
        # Electron temperature
        Te = 1.0
        # Dimensionless amplitude of perturbation
        A = 0.001
        # Wavenumbers
        ikx = 1
        iky = 1

        # CFL number
        cfl = 0.5

        # Number of periods to run for
        nperiods = 1

        # Sound speed
        cs = numpy.sqrt(Te/mass)

        # Time step
        dt = cfl/cs

        # Total number of particles in simulation
        np = npc*nx*ny

        # Wave vector and its modulus
        kx = 2*numpy.pi*ikx/nx
        ky = 2*numpy.pi*iky/ny
        k = numpy.sqrt(kx*kx + ky*ky)

        # Frequency
        omega = k*cs

        # Simulation time
        tend = 2*numpy.pi*nperiods/omega

        # Number of time steps
        nt = int(tend/dt)

        def rho_an(x, y, t):
            """Analytic density as function of x, y and t"""
            return charge*(1 + A*numpy.cos(kx*x+ky*y)*numpy.sin(omega*t))

        def ux_an(x, y, t):
            """Analytic x-velocity as function of x, y and t"""
            return -omega/k*A*numpy.sin(kx*x+ky*y)*numpy.cos(omega*t)*kx/k

        def uy_an(x, y, t):
            """Analytic y-velocity as function of x, y and t"""
            return -omega/k*A*numpy.sin(kx*x+ky*y)*numpy.cos(omega*t)*ky/k

        # Create numerical grid. This contains information about the extent of
        # the subdomain assigned to each processor.
        manifold = Manifold(nx, ny, comm, nlbx=1, nubx=2, nlby=1, nuby=1)

        # x- and y-grid
        xg, yg = numpy.meshgrid(manifold.x, manifold.y)

        # Maximum number of electrons in each partition
        npmax = int(1.5*np/comm.size)

        # Create particle array
        ions = Particles(manifold, npmax, charge=charge, mass=mass)

        # Create a uniform density field
        init = InitialCondition(npc, quiet=True)
        init(manifold, ions)

        # Perturbation to particle velocities
        ions['vx'] = ux_an(ions['x'], ions['y'], t=dt/2)
        ions['vy'] = uy_an(ions['x'], ions['y'], t=dt/2)

        # Make sure the numbers of particles in each subdomain add up to the
        # total number of particles
        assert comm.allreduce(ions.np, op=MPI.SUM) == np

        # Set the electric field to zero
        E = Field(manifold, dtype=Float2)
        E.fill((0.0, 0.0, 0.0))
        E.copy_guards()
        B = Field(manifold, dtype=Float2)
        B.fill((0.0, 0.0, 0.0))
        B.copy_guards()

        # Initialize sources
        sources = Sources(manifold, npc)

        # Initialize Ohm's law solver
        ohm = Ohm(manifold, temperature=Te, charge=charge)

        # Calculate initial density and force

        # Deposit sources
        sources.deposit(ions)
        assert numpy.isclose(sources.rho.sum(), ions.np*charge/npc)
        sources.rho.add_guards()
        sources.rho.copy_guards()
        assert numpy.isclose(comm.allreduce(
            sources.rho.trim().sum(), op=MPI.SUM), np*charge/npc)

        # Calculate electric field (Solve Ohm's law)
        ohm(sources, B, E)
        # Set boundary condition
        E.copy_guards()

        # Concatenate local arrays to obtain global arrays
        # The result is available on all processors.
        def concatenate(arr):
            return numpy.concatenate(comm.allgather(arr))

        # Make initial figure
        if plot:
            import matplotlib.pyplot as plt
            from matplotlib.cbook import mplDeprecation
            import warnings

            global_rho = concatenate(sources.rho.trim())
            global_rho_an = concatenate(rho_an(xg, yg, 0))

            if comm.rank == 0:
                plt.rc('image', origin='lower', interpolation='nearest')
                plt.figure(1)
                plt.clf()
                fig, (ax1, ax2, ax3) = plt.subplots(num=1, ncols=3)
                vmin, vmax = charge*(1 - A), charge*(1 + A)
                im1 = ax1.imshow(global_rho, vmin=vmin, vmax=vmax)
                im2 = ax2.imshow(global_rho_an, vmin=vmin, vmax=vmax)
                im3 = ax3.plot(xg[0, :], global_rho[0, :], 'b',
                               xg[0, :], global_rho_an[0, :], 'k--')
                ax1.set_title(r'$\rho$')
                ax3.set_ylim(vmin, vmax)
                ax3.set_xlim(0, manifold.Lx)

        t = 0
        diff2 = 0
        ##########################################################################
        # Main loop over time                                                    #
        ##########################################################################
        for it in range(nt):
            # Push particles on each processor. This call also sends and
            # receives particles to and from other processors/subdomains.
>           ions.push(E, B, dt)

test_ionacoustic.py:161: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../skeletor/particles.py:162: in push
    self.periodic_y()
../skeletor/particles.py:126: in periodic_y
    self.move()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Particles([ (0.03124567180610594, 0.03124567180610594, -8.656387788115387e-06, -8.656387788115387e-06, 4.5),
       (0...','vx','vy','vz'], 'formats':['<f8','<f8','<f8','<f8','<f8'], 'offsets':[0,8,16,24,32], 'itemsize':40, 'aligned':True})

    def move(self):
        """Uses ppic2's cppmove2 routine for moving particles
               between processors."""

        from .cython.ppic2_wrapper import cppmove2

        # Check for ihole overflow error
        if self.ihole[0] < 0:
            ierr = -self.ihole[0]
            msg = "ihole overflow error: ntmax={}, ierr={}"
            raise RuntimeError(msg.format(self.ihole.size - 1, ierr))

        self.np = cppmove2(
                self, self.np, self.sbufl, self.sbufr, self.rbufl,
                self.rbufr, self.ihole, self.info, self.manifold)

        # Make sure particles actually reside in the local subdomain
>       assert all(self["y"][:self.np] >= self.manifold.edges[0])
E       assert all(Particles([ 0.03124567,  0.03124134,  0.03123702, ...,  7.96839754,\n        7.9683974 ,  7.96839732]) >= 0.0)

../skeletor/particles.py:107: AssertionError

tberlok commented 7 years ago

Okay, so master is failing. I guess we just got lucky on Travis?
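The `assert all(...)` in `move()` only reports that some particle lies outside the local subdomain, and the truncated array in the message does not show which one. A small diagnostic along the following lines could narrow it down. This is a sketch: `find_strays` and `check_in_subdomain` are hypothetical helpers, and the upper bound is an assumption, since the traceback only shows the check against `edges[0]`.

```python
def find_strays(positions, lower, upper):
    """Return (index, value) pairs whose value is outside [lower, upper).

    NaN values are reported too, because every comparison with NaN is False.
    """
    return [(i, y) for i, y in enumerate(positions)
            if not (lower <= y < upper)]


def check_in_subdomain(positions, lower, upper, max_shown=5):
    """Raise AssertionError naming the first few offending particles."""
    strays = find_strays(positions, lower, upper)
    if strays:
        shown = ", ".join("{}: {!r}".format(i, y)
                          for i, y in strays[:max_shown])
        msg = "{} particle(s) outside [{}, {}): {}"
        raise AssertionError(msg.format(len(strays), lower, upper, shown))
```

Inside `move()` this might be called as `check_in_subdomain(self["y"][:self.np], self.manifold.edges[0], self.manifold.edges[1])`, assuming `edges[1]` holds the upper edge of the subdomain.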

tberlok commented 7 years ago

I am going backwards in time, running the tests several times:

Works: 245ff4c 0334e59621c577dbcaa1a61f2f7df7b94a96b178
Fails: bfbfbd4
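Because the failure is intermittent and sometimes shows up as a stall, a single passing run does not prove that a commit is good. The repeated runs could be scripted roughly as follows. This is a sketch under assumptions: `is_stable` is a hypothetical helper, and the number of repeats and the timeout are arbitrary choices.

```python
import subprocess


def is_stable(cmd, repeats=5, timeout=300):
    """Return True only if cmd exits with status 0 on every run.

    A run that takes longer than `timeout` seconds is treated as a
    failure, which covers the case where the test suite stalls.
    """
    for _ in range(repeats):
        try:
            proc = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before raising.
            return False
        if proc.returncode != 0:
            return False
    return True


# Assumed usage as the script for `git bisect run`, where exit status 0
# marks a commit as good and 1 marks it as bad:
# import sys
# ok = is_stable(["mpiexec", "-n", "4", "py.test", "tests/"])
# sys.exit(0 if ok else 1)
```

More repeats lower the chance of marking a bad commit as good, at the cost of a slower search.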

tobson commented 7 years ago

?

tobson commented 7 years ago

Oh I see

tberlok commented 7 years ago

Do you have time to Skype?

tobson commented 7 years ago

Yes

tberlok commented 7 years ago

Commit 39e5df8d0a0bafd7a58f88c455b4fcea07485206 seems to fix the problem: I ran the tests 5-6 times without failures. I think we can merge the change into master.

I will go get some lunch now, but we can Skype later.

tberlok commented 7 years ago

This was fixed by a5531852dd3302d0ae909adda853ed08873f0cdc.