lab-cosmo / sphericart

Multi-language library for the calculation of spherical harmonics in Cartesian coordinates
https://sphericart.readthedocs.io/en/latest/
MIT License
73 stars 12 forks source link

Add Julia bindings #23

Open Luthaf opened 1 year ago

Luthaf commented 1 year ago

This will require multiple steps:

cortner commented 1 year ago

I have a prototype implementation, which - on small basis sizes - is about a factor 2-3 faster than my previous one based on spherical coords (for large bases they are about the same) and will make this public soon once I also have the rrules done.

One could argue having the pure julia implementation here in this repo instead of a separate Julia repo. I don't particularly mind either way. We will most likely want to maintain our own implementation with the interfaces we need but if that can be made compatible with what is here then all the better.

For sure I wouldn't mind getting input on performance optimizations from people who actually understand this rather than just trial-and-error.

I also don't plan on a GPU implementation immediately so that would be another thing where I'd be interested in following somebody else.

cortner commented 1 year ago

An interesting complication is that a @generated julia implementation is really extremely fast, but then I can't exploit batched evaluation via AVX or SIMD. So I'm planning to provide two codes, one that is @generated for a single input and outputs an SVector and a second one that takes the loop of inputs inside and SIMD or AVXs it (to be tested which is faster on which hardware, I'm only working with M1 right now ...)

ceriottm commented 1 year ago

If you can / are willing to "harmonize" the API, I think it'd be interesting to have it in the same repo. The idea here is to re-use as much code as possible (so that later on, if there's a need) one can get crazy with optimizations, and/or provide more hardware-specific back-ends, but I can see the argument for being somehow less tightly bound for Julia.

cortner commented 1 year ago

OkSo the procedure would be have a Julia package live in a subfolder I think that will then be registered with the Julia General registry. I haven't done this before but maybe it's obvious and if not maybe @Luthaf can help if he is interested in having this?

One other caveat is that the Julia code should probably be MIT licensed.

Regarding different hardware -- In Julia that would probably be achievable via KernelAbstractions.jl later.

ceriottm commented 1 year ago

Yeah let's hear from @Luthaf and @frostedoyster . I think it'd be really good to consolidate the infrastructure we all use, and this is an easy starting point. When I was mentioning "harmonizing API" I meant just making sure function/class arguments and defaults are consistent, so users don't get too confused.

We have an ongoing offline discussion as to whether we should default to solid harmonics or normalize-by-default.

cortner commented 1 year ago

We have an ongoing offline discussion as to whether we should default to solid harmonics or normalize-by-default

I guess for large maxL it won't matter anyhow but for small maxL it might have an impact on performance.

For modelling I would want to provide both options - it's just and easy extra transformation layer. For performance optimization if you already have a splined Rnl basis then it seems pretty obvious?

Luthaf commented 1 year ago

So the procedure would be have a Julia package live in a subfolder I think that will then be registered with the Julia General registry.

This should be fine, the web interface in https://juliahub.com/ui/Packages can take a directory for packages that live in sub-directories. I'm not sure if the @JuliaRegistrator bot can also do this, but there should be a way to register the package.

One other caveat is that the Julia code should probably be MIT licensed.

Sure, we could also dual license the whole thing Apache/MIT (I know this is popular in the Rust ecosystem, but I don't remember why, I'll check).

One could argue having the pure julia implementation here in this repo instead of a separate Julia repo.

Especially if you already have a prototype, I'll be happy with also having a pure Julia implementation. If we also do bindings to the native implementation, we can have something like this in the long term

sph = spherical_harmonics(xyz; backend=:Julia) # pure julia, support autodiff & multiple dispatch

sph = spherical_harmonics(xyz; backend=:Native)  # native code, potentially faster, support fewer features
ceriottm commented 1 year ago

I guess for large maxL it won't matter anyhow but for small maxL it might have an impact on performance.

For modelling I would want to provide both options - it's just and easy extra transformation layer. For performance optimization if you already have a splined Rnl basis then it seems pretty obvious?

I agree entirely that for performance there's no question it should be the solid harmonics, and that it's no biggie to provide both options. the discussion was about the default. I feel quite strongly that the splid harmonics are actually the more natural things and it's ok to push for the better option as default, but @frostedoyster has a (good) argument that probably most users will expect the normalized version. Anyway, for the moment it should def be default normalize=False for consistency - I was just curious to hear your opinion.

cortner commented 1 year ago

I feel quite strongly that the splid harmonics are actually the more natural things and it's ok to push for the better option as default

I'm coming around to that perspective as well.

but @frostedoyster has a (good) argument that probably most users will expect the normalized version.

In my Julia codes I always provide solid_harmonics and spherical_harmonics (named a bit differently but that's the gist). I'm not sure why you wouldn't want to just supply both? Let people choose which one to use?

EDIT: I looked at your Python docs again. Why not create two classes, one called SphericalHarmonics the other SolidHarmonics - that would make it crystal clear. The keyword normalized=true is actually very confusing. It could be an option to normalize the basis in different ways (there are different conventions in different fields...)

cortner commented 1 year ago

I didn't yet test on the same CPU so maybe premature, but I think the Julia codes might be in a similar ballpark as yours. I am guessing I'm somewhere in-between your general-purpose and hard-coded. This is on a 2-year old M1. I tried to run the benchmarks as described on your readme but got no useful output, only red ...

Generated: (this is just basic meta-programming, no computer-algebra optimized thing)

[ Info: static_solid_harmonics
L = 1
  2.208 ns (0 allocations: 0 bytes)
L = 2
  4.708 ns (0 allocations: 0 bytes)
L = 3
  7.925 ns (0 allocations: 0 bytes)
L = 4
  11.970 ns (0 allocations: 0 bytes)
L = 5
  34.073 ns (0 allocations: 0 bytes)
L = 6
  40.575 ns (0 allocations: 0 bytes)

And the dynamic but batched implementation: (vectorized, but no multi-threading yet)

[ Info: batched evaluation vs broadcast
[ Info: nX = 32 (try a nice number)
L = 3
    single:   7.758 ns (0 allocations: 0 bytes)
broadcast!:   248.021 ns (0 allocations: 0 bytes)
   batched:   396.455 ns (0 allocations: 0 bytes)
L = 6
    single:   39.060 ns (0 allocations: 0 bytes)
broadcast!:   887.146 ns (0 allocations: 0 bytes)
   batched:   823.037 ns (0 allocations: 0 bytes)
L = 9
    single:   67.092 ns (0 allocations: 0 bytes)
broadcast!:   1.692 μs (0 allocations: 0 bytes)
   batched:   1.479 μs (0 allocations: 0 bytes)
L = 12
    single:   82.467 ns (0 allocations: 0 bytes)
broadcast!:   2.708 μs (0 allocations: 0 bytes)
   batched:   2.329 μs (0 allocations: 0 bytes)
L = 15
    single:   149.455 ns (0 allocations: 0 bytes)
broadcast!:   4.786 μs (0 allocations: 0 bytes)
   batched:   3.615 μs (0 allocations: 0 bytes)

TBH - I wouldn't mind feedback if anybody in your group is willing to look at this. I could make WIP-PR before I have the gradients finished.

Luthaf commented 1 year ago

A WIP PR sounds like a good idea! I don't have a lot of time to look at it over the next 2 weeks, but @frostedoyster can also give the code a look!

frostedoyster commented 1 year ago

I don't have much experience with Julia, but I'd be happy to take a look

cortner commented 1 year ago

Thanks - I'll prepare the PR. I propose to call the subfolder SpheriCart.jl which is the Julia naming convention since SpheriCart will be the name of the package in the Julia package registry.

If you have any concerns about it let me know.

frostedoyster commented 1 year ago

Perfect! I'm looking forward to the PR. Thank you also for the feedback on normalized. That's something we took from e3nn, but perhaps it isn't the best long-term solution. What are your feelings @ceriottm @Luthaf?

cortner commented 1 year ago

Hm - maybe one more question. So far I haven't looked at your code at all to avoid license problems. Going forward, if we were to think about optimizing maybe this would become useful. This is especially true to extend the Julia codes to GPU. Does Apache-2 allow me to produce a derived product and license it under MIT? If not, then I would need your explicit permission to do so.

ceriottm commented 1 year ago

Re license I don't mind dual license. Also better to do it now than after we have a bunch of contributors we can't track.

Re the normalization, I suggest that for the moment we adapt the Julia API to what we have, and open an issue to refactor. As @frostedoyster says, that's a lot of refactoring and re documenting so maybe not top priority.

cortner commented 1 year ago

I suggest that for the moment we adapt the Julia API to what we have

Initially I'm just going to do SolidHarmonics. I'll add the wrapper for SphericalHarmonics asap. I'm not convinced the two APIs need to be identical. The two languages are too different.

cortner commented 1 year ago

Re license I don't mind dual license. Also better to do it now than after we have a bunch of contributors we can't track.

My question was more specific : I thought I'd need explicit permission from the copyright holder if I look at your code, "transpile" it and release as new code under a different license. But it seems Apache-2 is sufficiently permissive that it doesn't matter:

The Apache 2.0 License is permissive. It allows you to use, modify, and distribute the licensed software, including creating derivative works, without requiring those derivative works to be licensed under the same terms. You can release the modified parts of the code under any license you prefer

But let me know if I've misunderstood it.