nothings / stb

stb single-file public domain libraries for C/C++
https://twitter.com/nothings

stb_perlin.h: add analytical derivatives #1391

Open fp64 opened 2 years ago

fp64 commented 2 years ago

Now that stb_perlin.h is back (yes!), it would be nice to have analytical derivatives of Perlin noise available out of the box (I assume a bunch of people would appreciate it, not just me). They are expected to be both more accurate and faster than numerical approximations (and just more convenient). The computation is not complicated.

I (presently?) only care about the first-order derivatives. Not sure how much demand there is for higher-order ones (note: there are only so many interesting derivatives of Perlin noise; at high enough order they are all either 0 or discontinuous).

I don't much care whether stb_perlin_fbm_noise3, etc., get derivatives (maybe someone else does?).

I have a draft implementation, and can make a pull request if need be.

Regarding accuracy: according to my tests (to be taken with a grain of salt), a symmetric difference approximation (with step size 2e-3, which seems to minimize the error) introduces an RMS absolute error of about 3e-5 (8e-6 with -mfpmath=387) compared to analytical derivatives (max. error is 2.5e-4 and 1e-4, respectively).
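For reference, the symmetric difference baseline can be sketched as below. This is an assumed shape, not the test harness used for the numbers above: `noise3_fn`, `central_diff3` and `demo_field` are hypothetical names, and `demo_field` is a stand-in polynomial with known derivatives rather than Perlin noise. To use it with `stb_perlin_noise3`, one would pass a small wrapper that fixes the three wrap arguments.

```c
/* Any scalar field of three floats; a wrapper around stb_perlin_noise3
   with fixed wrap arguments would fit this signature. */
typedef float (*noise3_fn)(float x, float y, float z);

/* Symmetric (central) difference gradient: two taps per axis, six in total.
   h is the step size; the post above found about 2e-3 to work best. */
void central_diff3(noise3_fn f, float x, float y, float z, float h, float out[3])
{
    out[0] = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h);
    out[1] = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h);
    out[2] = (f(x, y, z + h) - f(x, y, z - h)) / (2 * h);
}

/* Stand-in field with a known gradient (2xy, x^2, 3). */
float demo_field(float x, float y, float z)
{
    return x * x * y + 3.0f * z;
}
```

The step size is a trade-off: a smaller h reduces the truncation error but amplifies float round-off in the subtraction, which is consistent with the error depending on the floating-point mode.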

Regarding speed: my implementation always produces the value of the Perlin noise function itself, and optionally some of the x/y/z partial derivatives. The speed looks like this (benchmarked straight on godbolt, reported CPU is Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz, compiled with -O3 on x64 GCC 12.2):

[_,_,_]: 21.057 ns/call
[x,_,_]: 29.221 ns/call
[_,y,_]: 27.374 ns/call
[x,y,_]: 36.911 ns/call
[_,_,z]: 26.703 ns/call
[x,_,z]: 36.713 ns/call
[_,y,z]: 35.416 ns/call
[x,y,z]: 44.098 ns/call

This roughly corresponds to 20+8*n nanoseconds per call, where n is the number of requested derivative components. The no-derivatives case ([_,_,_]) should be essentially the same speed as the current stb_perlin_noise3_internal.
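As a quick sanity check of the 20+8*n estimate, the sketch below compares it against the eight timings listed above. The names `bench_rows`, `model_ns` and `max_model_residual` are made up for this example; the numbers are copied from the table. The largest deviation is roughly 1.3 ns (the [_,_,z] row).

```c
#include <math.h>
#include <stddef.h>

/* Rows from the benchmark table: number of requested derivative
   components and measured ns/call. */
static const struct { int n; double ns; } bench_rows[8] = {
    {0, 21.057}, {1, 29.221}, {1, 27.374}, {2, 36.911},
    {1, 26.703}, {2, 36.713}, {2, 35.416}, {3, 44.098},
};

/* The linear estimate from the post: 20 + 8*n nanoseconds per call. */
static double model_ns(int n) { return 20.0 + 8.0 * n; }

/* Largest absolute difference between a measured row and the estimate. */
double max_model_residual(void)
{
    double worst = 0.0;
    for (size_t i = 0; i < sizeof bench_rows / sizeof bench_rows[0]; ++i) {
        double r = fabs(bench_rows[i].ns - model_ns(bench_rows[i].n));
        if (r > worst) worst = r;
    }
    return worst;
}
```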

Compared to that, with a numerical approximation, to compute Perlin noise and 3 derivatives, one would need 7 (if using symmetric differences) or 4 (if using forward differences) taps (i.e. calls to stb_perlin_noise3). Note: if using a function specifically to fill an evenly-spaced grid (e.g. a texture), one can get derivatives almost for free (assuming step size equal to grid spacing is acceptable).
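The tap counts come from 1 value + 2 taps per axis (symmetric) or 1 value + 1 tap per axis (forward). The grid case can be sketched as follows, assuming the samples are already stored in a row-major array: the neighbours needed for the differences are values that were computed anyway, so no extra noise calls are made. `grid_ddx` is a hypothetical helper written for this example and handles only the x direction; the other axes would follow the same pattern with a different stride.

```c
#include <stddef.h>

/* d/dx of samples stored on a row-major w*h grid with spacing `step`.
   Interior columns use central differences; the two border columns fall
   back to one-sided differences. A single-column grid yields 0. */
void grid_ddx(const float *v, float *ddx, size_t w, size_t h, float step)
{
    for (size_t j = 0; j < h; ++j) {
        const float *row = v + j * w;
        float *out = ddx + j * w;
        for (size_t i = 0; i < w; ++i) {
            size_t lo = (i > 0) ? i - 1 : i;      /* left neighbour, clamped */
            size_t hi = (i + 1 < w) ? i + 1 : i;  /* right neighbour, clamped */
            out[i] = (hi == lo)
                   ? 0.0f
                   : (row[hi] - row[lo]) / ((float)(hi - lo) * step);
        }
    }
}
```

The catch is the one already noted: the effective step size is tied to the grid spacing, which may be far from the step that minimizes the error.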

Aside (unrelated): stb_perlin_noise3_wrap_nonpow2 exists but is not mentioned in the documentation block. Any reason for that?