alejandrobll / py-sphviewer

Py-SPHViewer is a framework for rendering cosmological simulations in Python using the Smoothed Particle Hydrodynamics scheme.
https://alejandrobll.github.com/py-sphviewer
GNU General Public License v3.0

No use of `-ffast-math` #11

Closed JBorrow closed 5 years ago

JBorrow commented 5 years ago

Using -ffast-math in the extra compile arguments speeds up the code by about 5x; is there any reason why this isn't used?
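For readers unfamiliar with "extra compile arguments": the sketch below shows one way such a flag is commonly passed to a C extension through the `extra_compile_args` parameter of setuptools' `Extension`. The extension name, the source path, the `-O3` flag, and the helper `build_compile_args` are placeholders for illustration, not py-sphviewer's actual build configuration.

```python
# Hypothetical setup.py fragment. All names and paths are placeholders.

def build_compile_args(fast_math=True):
    """Return compiler flags, optionally including -ffast-math."""
    args = ["-O3"]  # assumed baseline optimisation level
    if fast_math:
        # Lets GCC/Clang relax strict IEEE-754 semantics in exchange for speed.
        args.append("-ffast-math")
    return args


extension_kwargs = dict(
    name="example_pkg._kernel",      # placeholder module name
    sources=["src/kernel.c"],        # placeholder source file
    extra_compile_args=build_compile_args(fast_math=True),
)

# In a real setup.py these keyword arguments would be passed on, e.g.:
#   from setuptools import Extension, setup
#   setup(ext_modules=[Extension(**extension_kwargs)])
print(extension_kwargs["extra_compile_args"])
```

Keeping the flag behind a switch like `fast_math` makes it easy to build both variants and compare their output.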

alejandrobll commented 5 years ago

Well, because I did not think about that before. Could you please confirm that this makes py-sphviewer a factor of 5 faster? I will be glad to add this as a default compiler flag.

JBorrow commented 5 years ago

It provides a 5x speedup on the following benchmark for me:

from numpy import ones_like, float32, zeros
from numpy.random import rand, seed
from time import time

number_of_particles = 100_000
res = 1024

seed(1234)

print("Generating particles")
x = rand(number_of_particles).astype(float32)
y = rand(number_of_particles).astype(float32)
h = rand(number_of_particles).astype(float32) * 0.2
m = ones_like(h)
print("Finished generating particles")

from sphviewer.tools import QuickView

print("Running pySPHViewer")
coordinates = zeros((number_of_particles, 3))
coordinates[:, 0] = x
coordinates[:, 1] = y
h = 1.778_002 * h  # The kernel_gamma we use.

t = time()
qv = QuickView(
    coordinates,
    hsml=h,
    mass=m,
    xsize=res,
    ysize=res,
    r="infinity",
    plot=False,
    logscale=False,
).get_image()
print(f"pySPHViewer took {time() - t} on this problem.")

JBorrow commented 5 years ago

We may also want to take a look at the kernel function implementation. As far as I can see, the distance is square-rooted only to be squared again, and replacing pow(x, 3) with x * x * x should also confer some speed-up.
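The two suggestions can be illustrated with a small sketch. The kernel shape `(1 - q^2)^3` used here is a stand-in chosen for illustration and is not claimed to be the kernel py-sphviewer implements; the function names are hypothetical. The point is only that working with the squared distance directly avoids the square root, and that an explicit product can replace the cubic `pow` call. The speed-up under discussion applies to the compiled C code, so this Python version shows the algebra, not the timing.

```python
import math


def kernel_naive(dx, dy, h):
    """Illustrative kernel: takes a square root, then squares it again."""
    r = math.sqrt(dx * dx + dy * dy)
    q2 = (r / h) ** 2
    if q2 >= 1.0:
        return 0.0
    return math.pow(1.0 - q2, 3)


def kernel_direct(dx, dy, h):
    """Same kernel using the squared distance and an explicit product."""
    q2 = (dx * dx + dy * dy) / (h * h)
    if q2 >= 1.0:
        return 0.0
    t = 1.0 - q2
    return t * t * t


# Both forms agree to floating-point rounding for a point inside the kernel.
a = kernel_naive(0.3, 0.4, 1.0)
b = kernel_direct(0.3, 0.4, 1.0)
print(math.isclose(a, b, rel_tol=1e-9))
```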

MatthieuSchaller commented 5 years ago

@JBorrow be careful though. -ffast-math behaves differently on different compilers. It also allows some unsafe optimisations and violations of the IEEE-754 floating-point standard. It may not always be a safe thing to do.
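One concrete reason such optimisations count as unsafe: floating-point addition is not associative, and reordering arithmetic is among the transformations that fast-math modes permit. The sketch below only demonstrates the underlying arithmetic effect in Python; it does not invoke a compiler, and what a given compiler actually does with the flag may differ.

```python
import math

# The same three numbers summed in two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

# The results differ in the last bit, although they are mathematically equal.
print(left_to_right == regrouped)
print(math.isclose(left_to_right, regrouped, rel_tol=1e-12))
```

A compiler that is allowed to regroup sums may therefore return slightly different results from one that evaluates strictly in source order. For an image this is usually a last-digit effect, but it is worth checking rather than assuming.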

alejandrobll commented 5 years ago

@JBorrow: given what @MatthieuSchaller says, I would like to look at the ratio of two images, one created with -ffast-math and the other without it. I guess that for visualization it does not matter much how accurate the answer is, but I would like to see that the differences are really small.
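A comparison of this kind could be sketched as follows. The helper `max_relative_difference` is hypothetical, and the two small nested lists are made-up placeholder data standing in for images rendered by builds with and without the flag; they are not real py-sphviewer output. With NumPy arrays the same check could be done with vectorised operations instead of loops.

```python
def max_relative_difference(reference, candidate, floor=1e-30):
    """Largest per-pixel |candidate - reference| / |reference| over two images.

    `floor` guards against division by zero for empty pixels.
    """
    worst = 0.0
    for ref_row, cand_row in zip(reference, candidate):
        for ref, cand in zip(ref_row, cand_row):
            scale = max(abs(ref), floor)
            worst = max(worst, abs(cand - ref) / scale)
    return worst


# Placeholder 2x2 "images"; real ones would come from two separate builds.
image_default = [[1.0, 2.0], [4.0, 8.0]]
image_fast_math = [[1.0, 2.0000002], [4.0, 8.0]]

difference = max_relative_difference(image_default, image_fast_math)
print(f"max relative difference: {difference:.3e}")
```

Reporting the maximum rather than the mean makes it harder for a few badly affected pixels to hide behind an otherwise identical image.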

alejandrobll commented 5 years ago

I checked the code and the answer doesn't change, and I get a speedup of a factor of 3! Thanks, @JBorrow. The flag will be used by default.