LudwigBoess / SPHKernels.jl

Julia implementation of some common SPH kernels
MIT License

Implementing another kernel and curiosity about derivative=gradient aspect #11

Closed AhmedSalih3d closed 3 years ago

AhmedSalih3d commented 3 years ago

Hi!

Thank you very much for this effort. I have spent quite a bit of time on this code to produce some nice examples of what I want to ask about, so I hope you enjoy the read :)

Basically I had to import a kernel I know, just to be sure about what I am doing. I have looked at how you code and tried to replicate it as closely as possible - it also seems to improve performance, which is very nice; benchmarks are at the bottom. Extending the code was quite easy too, you have made a very nice setup for this kind of thing.

I don't agree with your notation saying that grad(W) = kernel_deriv_dim; you can see the full explanation of why in kernel_grad_2D below. What are your thoughts about this?

If you agree, I think it should be split into kernel_value, kernel_deriv (dWdq) and kernel_gradient. Otherwise I am looking forward to learning what I have misunderstood, which could also be likely.

The code should run instantly without any other dependencies than the library here.

Kind regards

using SPHKernels

# Kernel from here, I took Quintic (1 liner, love it like that)
#https://github.com/DualSPHysics/DualSPHysics/wiki/3.-SPH-formulation#31-smoothing-kernel

const αD1 = NaN
const αD2 = Float64(7/(4π))
const αD3 = Float64(21/(16π))

# Why is n_neighbours important?
# How would we know beforehand? This can only be a rough estimate.
struct DSPH_Quintic{T} <: SPHKernel
    n_neighbours::Int64
    norm_1D::T
    norm_2D::T
    norm_3D::T
end

DSPH_Quintic(T::DataType=Float64, n_neighbours::Integer=216) = DSPH_Quintic{T}(n_neighbours, αD1, αD2, αD3)

@inline function SPHKernels.kernel_value_2D(kernel::DSPH_Quintic{T}, u::Real, h_inv::Real) where T

    # Option to disable if statement to speed up code?
    # Sometimes one can use "strict" neighbour finding in post-processing
    # which makes this condition obsolete.
    # After Thought: This way of writing is very fast anyways.
    if u < 2.0
        n = kernel.norm_2D * h_inv^2

        u_m1 = (1.0-u/2.0)
        u_m2 = (2.0*u+1.0)

        return (u_m1 * u_m1 * u_m1 * u_m1) * u_m2 * n |> T
    else
        return 0.0 |> T
    end

end

# I think that you potentially might have had a mathematical misunderstanding here,
# as I have had myself in the past. The gradient of the kernel is not equal to the
# derivative of the kernel, because of the chain rule.

# The derivative: dWdq (I would call this kernel_deriv)

# The gradient(W) = dWdq * (x - y)/|x-y| * (1/h) (I would call this kernel_grad)

# If you think I am wrong, please do challenge me on this point, I want to learn too! :-)

@inline function kernel_grad_2D(kernel::DSPH_Quintic{T}, u::Real, h_inv::Real,x::AbstractArray,y::AbstractArray,r::Real) where T

    n = kernel.norm_2D * h_inv^2

    # I am used to defining u as q
    # The derivative of the kernel, W(q), defined as dWdq is given in this case as:

    # W(q)    = (1-q/2)^4 * (2q + 1)

    # dWdq(q) = (5/8) * q * (q-2)^3

    if u < 2.0
        u_d1 = 5.0/8.0
        u_d2 = (u - 2.0)

        dWdq = u_d1 * u * (u_d2 * u_d2 * u_d2) |> T
    else
        return @. [0.0;0.0] |> T
    end

    # Calculate result for each direction

    x_diff = x[1] - y[1]
    y_diff = x[2] - y[2]

    # Notice that terms are common except "_diff"
    tmp = dWdq*r*h_inv

    Wgx = tmp*x_diff |> T
    Wgy = tmp*y_diff |> T

    # Return values

    return @. [Wgx;Wgy] |> T
end
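# Optional sanity check (not part of the kernel implementation; the helper names below
# exist only for this check): compare the analytic dWdq above against a central finite
# difference of the unnormalised kernel shape.
W_shape(q)    = q < 2.0 ? (1.0 - q/2.0)^4 * (2.0*q + 1.0) : 0.0
dWdq_shape(q) = q < 2.0 ? (5.0/8.0) * q * (q - 2.0)^3     : 0.0

let q = 1.3, δ = 1e-6
    fd = (W_shape(q + δ) - W_shape(q - δ)) / (2δ)
    @assert isapprox(fd, dWdq_shape(q); atol=1e-6)
end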

# Play

k     = DSPH_Quintic()
u1    = 2.0
h     = 0.028284271247
h_inv = 1.0/h;

@time  v = kernel_value_2D(k, u1, h_inv)

x     = [2.0;2.00]
y     = [2.0;2.15]
r     = sum((x.-y).^2)
u2    =  r/h

Wgx,Wgy =  kernel_grad_2D(k, u2, h_inv,x,y,r)

# And to check consistency we swap x and y
Wgx_f,Wgy_f =  kernel_grad_2D(k, u2, h_inv,y,x,r)

# Same result but opposite signs! This is more clearly shown when x[1] != y[1], disregard sign on "-0".

## BenchmarkTools

# @btime  v = kernel_value_2D($k, $u1, $h_inv)
#   0.001 ns (0 allocations: 0 bytes)
# 0.0

# @btime Wgx,Wgy =  kernel_grad_2D($k, $u2, $h_inv,$x,$y,$r)
#   47.827 ns (2 allocations: 192 bytes)

# 2 allocs match allocating Wgx and Wgy, so this seems like the best possible case, but I don't understand how kernel_value can have absolutely zero allocations.
# Can you see an improvement in the grad function?
LudwigBoess commented 3 years ago

Hi Ahmed,

thank you for your nice message and your good work on implementing a different kernel!

To address your questions and points:

Please note that I defined the kernels so far to have u = r/h, where r is the Euclidean distance and h is the compact kernel support, as you already mentioned in your other comment. This means u ∈ [0, 1]. Your kernel seems to use u ∈ [0, 2]. Other than that, please feel free to implement the kernels you want in your forked repo and then issue a pull request to the main repo! :)
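To make the two conventions concrete - a minimal sketch, assuming the only difference is which length the distance is divided by (the normalisation constants change accordingly); the helper names are made up:

# H = compact kernel support        (my convention:   u = r/H ∈ [0, 1])
# h = H/2, the smoothing length     (your convention: q = r/h ∈ [0, 2])
u_from_q(q) = q / 2
q_from_u(u) = 2u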

I found some performance improvements for your gradient computation:

function kernel_grad_2D(kernel::DSPH_Quintic{T}, u::Real, h_inv::Real, x::AbstractArray, y::AbstractArray, r::Real) where T

    n = kernel.norm_2D * h_inv^2

    if u < 2
        u_d1 = 5/8
        u_d2 = (u - 2)

        dWdq = u_d1 * u * (u_d2 * u_d2 * u_d2) |> T
    else
        return 0.0, 0.0 .|> T
    end

    # Calculate result for each direction
    x_diff = x[1] - y[1]
    y_diff = x[2] - y[2]

    tmp = dWdq*r*h_inv

    Wgx = tmp*x_diff |> T
    Wgy = tmp*y_diff |> T

    # returning as Tuple instead of Array avoids 1 allocation
    # doing the type conversion only once gets rid of the other allocation
    return Wgx, Wgy 
end

# BenchmarkTools:
@btime Wgx, Wgy =  kernel_grad_2D($k, $u2, $h_inv,$x,$y,$r)
#   4.870 ns (0 allocations: 0 bytes) 

I like your idea of having an actual gradient implementation in the package! Maybe it would be more convenient to have a more general implementation, like so:

function kernel_grad_2D(k::SPHKernel, h_inv::Real, pos1::Vector{<:Real}, pos2::Vector{<:Real})

    x_diff = pos1[1] - pos2[1]
    y_diff = pos1[2] - pos2[2]

    r = √(x_diff^2 + y_diff^2)
    u = r * h_inv
    dW = kernel_deriv_2D(k, u, h_inv)

    tmp = dW * u
    Wgx = tmp * x_diff
    Wgy = tmp * y_diff

    return Wgx, Wgy
end

What do you think?

AhmedSalih3d commented 3 years ago

Thanks for the response!

Great to see you agreeing to change the name of the functionality, even if it might potentially break some older scripts. It makes it much clearer to the general user. Out of curiosity though, can you tell me where you have used dW only? Most of the time I only use W or grad(W) (I come from a CFD background) and never dW (only in the gradient term).

I think that n_neighbours should be removed. It has no relevant meaning for the kernel function, and I have never seen it being used (again, mostly from a CFD background), so I hope that we can agree on that, unless you have an extremely clear and common use case. Just to simplify it a bit.

Regarding the if-statement, I agree it should be kept that way. We cannot guarantee that every kernel evaluated above kh will return zero. In my checks, disabling it would gain ~0.2 ns per calculation, which in most cases would be irrelevant.

Regarding the kernel distance, u = r/h or u = r/2h, I must say that I have never seen the former before (of course, now you have shown me 😊). Is it common to use it that way? In CFD SPH (Computational Fluid Dynamics Smoothed Particle Hydrodynamics) I have only ever seen the latter. Would it be possible to accept that different kernels can use different formulations of u? Perhaps we can call yours "u" and the one I am using "q", just as a visual distinction? The kernels I am using are mostly derived using the principle u = r/kh, where k is commonly taken as 2, in some cases 3, etc. You can see a lot of kernels defined in the notation I am most used to here:

https://pysph.readthedocs.io/en/latest/reference/kernels.html

Thanks for the improved function with regards to allocations! I ended up doing something similar - I was a bit surprised that [] compared to () would give 2 vs 0 allocations, but I guess that's how it is. Continuing on, I think it would be great to have the gradient and perhaps even the Laplacian in this package! It would make it the "go-to" Julia package for SPH interpolations.

I am just curious whether one of us made a typo; I see that you write it as:

u = r * h_inv
dW = kernel_deriv_2D(k, u, h_inv)
tmp = dW * u

We now agree on dW (apart from the r/h vs r/2h question), but I believe that the "u" in your tmp should have been (1/r)*h_inv (the u used in dW is correct). Here is the full equation as I understand it:

grad(W) = dWdq * (x - y)/|x - y| * (1/h)

The denominator is r and the numerator is basically x_diff etc. Sorry if I am being a bit pedantic, I just really want to be sure I am understanding this correctly.
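For illustration, here is how I would write your generic 2D gradient with that term spelled out - just a sketch based on your function above and my understanding, not tested against the package:

function kernel_grad_2D(k::SPHKernel, h_inv::Real, pos1::Vector{<:Real}, pos2::Vector{<:Real})

    x_diff = pos1[1] - pos2[1]
    y_diff = pos1[2] - pos2[2]

    r  = √(x_diff^2 + y_diff^2)
    u  = r * h_inv
    dW = kernel_deriv_2D(k, u, h_inv)

    # chain rule: grad(W) = dWdq * (pos1 - pos2)/r * h_inv
    # (r == 0 would need a guard in practice)
    tmp = dW * h_inv / r

    return tmp * x_diff, tmp * y_diff
end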

Another idea for future discussions: I would love to see if we can abstract away from using W1, W2, W3, but that is for a future discussion. My purpose with this would be to make the code easier to maintain, since a lot of the time I believe the only change is n = h_inv^dim.

Sorry if the message is not formatted as well as yours; I believe I have answered your points though, and I am very happy to see we agree on a lot of it.

Kind regards, Ahmed

LudwigBoess commented 3 years ago

Hi Ahmed,

I've used d𝒲 only when computing the divergence and rotation (curl) of an SPH quantity, so I computed the gradient by hand there as well. That could also be implemented in the package though, like you suggested.

For the kernel definition I followed the notation used in Gadget2 since I work with a variant of that code. So it's just a different normalisation. While I agree that this notation is uncommon and it might be reasonable to change this for the sake of other users, I'm not too sure about it at the moment. All my other SPH packages (public and private) use this notation (and so does Gadget of course), so I'd have to rewrite a lot of stuff and I don't really have the time for that at the moment...

Good catch on that typo! ;) You're correct, that was wrong, I fixed it in the meantime, since I did some more refactoring. I will push these changes to a new branch and link the commit to this issue.

I like your idea of getting rid of the different dimensionalities! Let's maybe discuss this further: I'd like to implement more hierarchical structs, something like:

struct WendlandC6_3D{T} <: SPHKernel
    norm::T
end

and then rely on multiple dispatch. Then we could reduce the functions to 𝒲, d𝒲 and so on.
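Roughly like this - just a sketch of the dispatch pattern, with a toy polynomial standing in for the real Wendland C6 coefficients:

# toy placeholder shape, NOT the real Wendland C6 - only meant to show the dispatch idea
𝒲(kernel::WendlandC6_3D{T}, u::Real, h_inv::Real) where T =
    T(u < 1 ? kernel.norm * h_inv^3 * (1 - u)^3 : 0)

# a WendlandC6_2D method would look the same with h_inv^2, and d𝒲 would dispatch analogously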

What do you think about that? Do you see a more elegant way to do it?

And if we're considering breaking changes we could also change SPHKernel to AbstractSPHKernel to follow the Julia convention better.

Another thing to consider is version numbers: when I added this package I have to admit that I didn't know nearly enough about git (and I doubt that I do now). So I had to fight a bit to get the version number working, and the only version number I could choose at that point was 1.0, which is of course not great, since this package was nowhere near production ready.

Would you prefer if these changes were added to a potential v2.0? I think that would be the correct version number to choose, but it again misrepresents the state of the package...

I'm looking forward to your input!

Cheers, Ludwig

AhmedSalih3d commented 3 years ago

Hi again!

Regarding dW: okay, I see - we have been using it for the same things then. I just wanted to double check that I did not miss anything myself.

Regarding u or q: for now I think the u = r/h vs q = r/2h question should be sidelined for a bit - it is a detail which we can always decide on later, either by:

  1. Accept that different kernels have different representations of u

  2. Normalize all kernels to one of the two cases

Option 1 would be preferred in the sense that we can implement each kernel in the way which is most "true" to its original nature. Option 2 would allow an extremely easy change of kernel, but might potentially be annoying for the user of the kernel. For now I think I will just post this part into a new issue which we can discuss later?

With respect to the typo: awesome, then we completely agree on the basis of the operation, which is always nice on the internet.

Regarding version numbers: what you describe is in general what is done in the Julia community. Of course, having started with "1.0" now, it makes no sense to go back to 0.1 etc. My suggestion would be to name the next releases 1.0.1, 1.0.2 etc. or similar, until we reach a version we actually believe is "beta-test ready", which we can call 1.1. When this is done, we jump the number up to 2 - in this way we ensure a gradual, staged process from your original concept, through this play-phase, ending in a beta-phase that leads to version 2.

Perhaps a list of some kind could be made for the overall version 2.

If you like this approach, I can make a full list, so we do not forget anything? This would be my approach.

Simplifying W and dW (i.e. avoiding W1, W2 etc.)

I am thinking a lot about this and I think I might have something to share later this week. My main issue is dealing with h^dim, since the constants have to be provided anyway. One thought I have is to add a "dim" variable to the function - it will not look as pretty, but it will allow for an easier change of dimensions, rather than having to change, for example, five calls from W1 to W2 if one was working on a script. I will need a bit more time on this.
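As a very rough sketch of that thought, using the DSPH_Quintic from my first message (the name and the extra argument are hypothetical, and I have not benchmarked this):

# hypothetical: pass the dimension explicitly and pick the matching normalisation constant
function kernel_value(kernel::DSPH_Quintic{T}, u::Real, h_inv::Real, dim::Integer) where T
    norm = (kernel.norm_1D, kernel.norm_2D, kernel.norm_3D)[dim]
    n    = norm * h_inv^dim
    if u < 2.0
        u_m1 = 1.0 - u/2.0
        return T(u_m1^4 * (2.0*u + 1.0) * n)
    else
        return T(0.0)
    end
end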

AbstractSPHKernel

Renaming sounds fine! I don't know so much about this part. I imagine that the values this struct has to hold would be something like (my Julia notation is faulty):

struct AbstractSPHKernel(dim)
    W::KernelFunction(dim) #function - dim goes into here and feeds the choice of W1,W2,W3 (just to explain)
    dW::KernelFunctionDerivative #function
end

Perhaps we need to include the "n" that I call αD (alpha D) in the AbstractSPHKernel struct, since both W and dW need it.

Where "W" in this case will return a function with the specified parameters, so basically as before but a "under-the-hood" generation of W1,W2 etc.

And by thinking about the math I realised that every operation - gradient, divergence or cross product - comes down to the term dWdq * (x - y)/|x - y| * (1/h). So we should have a function which takes any arbitrary AbstractSPHKernel and allows us to calculate these properties, for example the gradient, based on the dW given in the struct. Syntax perhaps like:

function dW(ASPHK::AbstractSPHKernel,u::Real)

Here a generalized function dW would be used to dispatch to the dW function of any AbstractSPHKernel - does my idea make sense to you?

I will stop this message here so I do not derail this constructive process - looking forward to hearing what you think of these ideas.

Kind regards

AhmedSalih3d commented 3 years ago

Potential breakthrough regarding automatic differentiation - see my question and answer 4 in the thread linked below. Basically, by defining just the Wendland function it is possible to automatically take its derivative (dW) - this is highly useful for us!

https://discourse.julialang.org/t/can-i-expect-forwarddiff-to-give-the-same-performance-in-this-case/67894/4
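A minimal example of what I mean, using the unnormalised quintic shape from my first message (just a sketch; assumes the ForwardDiff package is installed):

using ForwardDiff

W(q)    = q < 2 ? (1 - q/2)^4 * (2q + 1) : zero(q)
dWdq(q) = q < 2 ? (5/8) * q * (q - 2)^3  : zero(q)   # analytic derivative for comparison

q = 1.3
ForwardDiff.derivative(W, q) ≈ dWdq(q)   # true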

Kind regards

LudwigBoess commented 3 years ago

Hi Ahmed,

You're right, we should discuss this in separate issues. I opened #13, #14 and #15.

For your suggestion with the AbstractSPHKernel: I would like to avoid having a struct of functions. This is too close to OOP for me and I don't think that's what we'd want. I'm also not sure how efficiently Julia can optimize these kinds of structs and whether there's maybe too much performance loss.

Very interesting that ForwardDiff is that performant! I was not aware of that, as I've never really used it. Nonetheless, we have analytic solutions for the derivatives, so I'm not sure there's much benefit in using an automatic differentiation tool... But maybe it's worth having it as an option?

LudwigBoess commented 3 years ago

I will close this issue now in favor of #13, #14 and #15.