UK-MAC / CloverLeaf3D_OpenCL

OpenCL port of CloverLeaf3D

Final Kinetic Energy Results #1

Open gihanmudalige opened 9 years ago

gihanmudalige commented 9 years ago

Hi, I was running some benchmarks of this version of CloverLeaf and I get a different final kinetic energy depending on the number of MPI processes used. The kinetic energy from the 1 MPI run does not appear to be correct either.

This is the clover.in input I am using:

*clover
state 1 density=0.2 energy=1.0
state 2 density=1.0 energy=2.5 geometry=cuboid xmin=0.0 xmax=5.0 ymin=0.0 ymax=2.0 zmin=0.0 zmax=4.0
x_cells=96 y_cells=96 z_cells=96
xmin=0.0 ymin=0.0 zmin=0.0 xmax=10.0 ymax=10.0 zmax=10.0
initial_timestep=0.1 max_timestep=0.1 end_step=2955 end_time=1000.1
!profiler_on=0
use_opencl_kernels opencl_vendor=nvidia opencl_type=GPU opencl_device=0
!profiler_on=1
*endclover
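For anyone reproducing this, the mesh implied by the deck can be worked out with a few lines of Python. This is only a sketch: `parse_deck` is a hypothetical helper, not part of CloverLeaf, and it assumes that `!` starts a comment and that global settings are whitespace-separated `key=value` tokens or bare flags.

```python
# The input deck from this report, one setting group per line.
DECK = """\
*clover
state 1 density=0.2 energy=1.0
state 2 density=1.0 energy=2.5 geometry=cuboid xmin=0.0 xmax=5.0 ymin=0.0 ymax=2.0 zmin=0.0 zmax=4.0
x_cells=96 y_cells=96 z_cells=96
xmin=0.0 ymin=0.0 zmin=0.0 xmax=10.0 ymax=10.0 zmax=10.0
initial_timestep=0.1 max_timestep=0.1 end_step=2955 end_time=1000.1
!profiler_on=0
use_opencl_kernels opencl_vendor=nvidia opencl_type=GPU opencl_device=0
!profiler_on=1
*endclover
"""


def parse_deck(text):
    """Hypothetical helper: collect global settings from a clover.in-style deck.

    Assumes '!' starts a comment; skips the '*clover'/'*endclover' markers
    and the 'state' lines, whose xmin/xmax describe a region, not the mesh.
    """
    settings = {}
    for line in text.splitlines():
        line = line.split("!", 1)[0].strip()
        if not line or line.startswith("*") or line.startswith("state"):
            continue
        for token in line.split():
            if "=" in token:
                key, value = token.split("=", 1)
                settings[key] = value
            else:
                settings[token] = True  # bare flag such as use_opencl_kernels
    return settings


settings = parse_deck(DECK)
cells = [int(settings[k]) for k in ("x_cells", "y_cells", "z_cells")]
total_cells = cells[0] * cells[1] * cells[2]
dx = (float(settings["xmax"]) - float(settings["xmin"])) / cells[0]
print(f"total cells: {total_cells}, cell width in x: {dx:.6f}")
```

The same problem size is used for every MPI count, so the final kinetic energy should not depend on how the mesh is split.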

I get the following for 1, 2, 4 and 8 MPI tasks on our K80 server (it has 4 K80 cards, each with 2 GPUs):

1 MPI - 0.38444089E+00
2 MPI - 0.11546208E+01
4 MPI - 0.37709287E+00
8 MPI - 0.86353637E+00
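To put a number on the spread, the relative difference of each run against the 1 MPI value can be computed as below. This is a rough sketch using only the four figures reported here. The 1 MPI run is taken as an arbitrary reference, not as a known-good answer, since that value looks wrong as well.

```python
# Final kinetic energy reported for each MPI task count.
results = {
    1: 0.38444089E+00,
    2: 0.11546208E+01,
    4: 0.37709287E+00,
    8: 0.86353637E+00,
}

# The 1 MPI run is only a reference point; it is not assumed to be correct.
baseline = results[1]
rel_diff = {n: abs(v - baseline) / abs(baseline) for n, v in results.items()}

for n in sorted(results):
    print(f"{n} MPI: KE = {results[n]:.8E}, "
          f"relative difference vs 1 MPI = {rel_diff[n]:.3e}")
```

Even the closest pair (1 and 4 MPI) differs in the second significant digit, far more than floating-point rounding would produce.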

Thanks

waynegaudin commented 9 years ago

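That's definitely broken. The result should be bitwise identical whatever the number of MPI processes. Sometimes IEEE-compliant floating-point settings are needed to get bitwise agreement, but a difference this large is very wrong.

(In reply to the report from Gihan R. Mudalige, 22 Jun 2015 11:20.)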
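As a rough illustration of how small a legitimate rounding difference is, the sketch below sums the same values as 1, 2, 4 and 8 contiguous partial sums. It is a toy model of a decomposed reduction, not CloverLeaf's actual summation; `chunked_sum` and the random test data are made up for the example.

```python
import random

random.seed(0)
# Made-up data standing in for per-cell contributions to a global sum.
values = [random.uniform(0.0, 1.0) for _ in range(100_000)]


def naive_sum(data):
    """Plain left-to-right accumulation in double precision."""
    total = 0.0
    for x in data:
        total += x
    return total


def chunked_sum(data, parts):
    """Sum `data` as `parts` contiguous partial sums, then add the partials.

    Mimics splitting a reduction across ranks: same values, different order.
    """
    size = (len(data) + parts - 1) // parts
    partials = [naive_sum(data[i:i + size]) for i in range(0, len(data), size)]
    return naive_sum(partials)


sums = {p: chunked_sum(values, p) for p in (1, 2, 4, 8)}
spread = (max(sums.values()) - min(sums.values())) / sums[1]

for p in sorted(sums):
    print(f"{p} parts: {sums[p]!r}")
print(f"relative spread: {spread:.3e}")
```

Reordering a double-precision sum can at most change the trailing digits, so it cannot account for results that differ by a factor of two or three.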
