JuliaPy / PythonCall.jl

Python and Julia in harmony.
https://juliapy.github.io/PythonCall.jl/stable/
MIT License

RFC: Syntax for Gradual Julia-ization of a Python library #521

Open MilesCranmer opened 2 months ago

MilesCranmer commented 2 months ago

To my mind, the greatest thing about PythonCall/JuliaCall is how easy it makes integrating Julia into a Python project, so that one can gradually port the hot inner loops to Julia functions. I think this could be made even easier, and wanted to share a couple of ideas.

1. Julia should be optional

Installing Julia on every user’s machine is a big ask for heavily-used Python libraries that have battle-tested build scripts. Therefore I think the process of Julia-izing a Python library should be very gradual, and allow for a Julia backend to be optional.

I think it would be nice to have a standardized and documented way to check whether juliacall is available without triggering an install of Julia. That check could then gate specific high-performance branches of code. For example,

if juliapkg.is_installed():
    ...  # Julia branch of code
else:
    ...  # regular code

In a package, you could create an optional “extra” set of dependencies which would install the juliacall library and also make juliapkg.is_installed() return True, enabling those branches.

In some cases this may be preferable to simply checking whether juliacall is present in the user’s environment, since it might have been installed by another package. Under this idea, the checks would only trigger if the user explicitly installed the Julia backend, with something like

pip install "mypackage[julia]"

Thus, if a package developer chooses to make this an option, users would need to opt-in to enable the faster behavior.
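As a rough illustration of the weaker check mentioned above (is juliacall present at all?), here is a minimal sketch using only the standard library. The helper name julia_backend_available is hypothetical, and this does not implement the proposed juliapkg.is_installed(); in particular it cannot tell whether the user opted in through an extra or got juliacall from another package.

```python
import importlib.util


def julia_backend_available(module_name="juliacall"):
    """Return True if module_name can be located, without importing it."""
    # find_spec only locates a top-level module; it does not execute it,
    # so this check neither starts Julia nor triggers an install.
    return importlib.util.find_spec(module_name) is not None


# Pick a code path once, at import time of the library.
if julia_backend_available():
    backend = "julia"   # Julia branch of code
else:
    backend = "python"  # regular code
```

The point of using find_spec rather than a try/except around "import juliacall" is that the import itself is what would start (and possibly install) Julia.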

2. Syntax for Julia versions of functions

In a similar direction, I wonder if there is another syntax available for writing Julia versions of functions. One idea is to have something like @numba.jit, but with a Julia version, which could look like

@juliacall.pydef
def foo(x):
    return np.sum(x ** 2)

@juliacall.jldef(foo)
def foo_jl(x):
    return """
        x -> sum(xi -> xi^2, x)
    """

The jldef would run juliacall.seval on the return value of the Python code (a string), and feed the arguments of the function to the resulting anonymous function. This would be cached.

In addition, the jldef version would only be activated if a user installs the Julia backend of the package. The pydef version would check if it is installed, and call the Julia branch of the code, which gets associated using the jldef(foo) specification.
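To make the mechanics concrete, here is a minimal sketch of how such a pair of decorators might work. Nothing here exists in juliacall: pydef, jldef, backend_available, and the attribute names are all hypothetical, and "installed" is approximated by juliacall being importable. The example uses a plain Python sum instead of NumPy so it is self-contained.

```python
import functools
import importlib.util


def backend_available():
    # Assumption: approximates "the Julia backend of the package is installed".
    return importlib.util.find_spec("juliacall") is not None


def pydef(py_func):
    """Wrap py_func so that a Julia implementation can be attached later."""
    @functools.wraps(py_func)
    def wrapper(*args, **kwargs):
        if wrapper.jl_source_func is None or not backend_available():
            return py_func(*args, **kwargs)  # pure-Python fallback
        if wrapper.jl_func is None:
            from juliacall import Main as jl  # deferred import: this starts Julia
            source = wrapper.jl_source_func(*args, **kwargs)  # Julia code as a string
            wrapper.jl_func = jl.seval(source)  # evaluated once, then cached
        return wrapper.jl_func(*args, **kwargs)

    wrapper.jl_source_func = None
    wrapper.jl_func = None
    return wrapper


def jldef(target):
    """Attach the decorated source-returning function to a pydef-wrapped target."""
    def register(source_func):
        target.jl_source_func = source_func
        return source_func
    return register


@pydef
def foo(x):
    return sum(v ** 2 for v in x)


@jldef(foo)
def foo_jl(x):
    return "x -> sum(xi -> xi^2, x)"
```

With this layout, callers always call foo; whether the Python body or the cached Julia function runs is decided at call time.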

MilesCranmer commented 1 month ago

Trying to get some more ideas in this thread on discourse: https://discourse.julialang.org/t/gradual-julia-ization-of-python-libraries/117828/12?u=milescranmer (seems better for open discussion compared to GitHub issues).

One idea I am fond of is the following:

Say we have a file in our library file1.py:

import numpy as np
from juliacall import jldispatch

@jldispatch("file2.jl", function="foo2")
def foo(x):
    return np.sum(x ** 2)

where file2.jl would be in the same directory as file1.py, and have:

function foo2(x)
    return sum(xi -> xi^2, x)
end

Here, basically jldispatch(file, function) could attach the Julia function foo2 to the Python function foo, via:

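import juliapkg

def jldispatch(file, function):
    def apply(f):
        # juliapkg.is_installed() is the check proposed in this RFC
        if juliapkg.is_installed():
            # imported lazily, since importing juliacall starts Julia
            from juliacall import Main as jl
            jl.include(file)  # load the Julia source file into Main
            return jl.seval(function)  # look up the Julia function by name
        else:
            return f
    return apply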

So this could let you easily attach Julia code to Python functions.
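Two details might matter in practice: finding file2.jl next to the Python file regardless of the working directory, and not starting Julia until the function is first called. The following is a sketch under those assumptions. The helpers julia_enabled and resolve_julia_path are hypothetical names, and julia_enabled merely stands in for the proposed juliapkg check.

```python
import functools
import importlib.util
import inspect
from pathlib import Path


def julia_enabled():
    # Assumption: stands in for the proposed juliapkg check.
    return importlib.util.find_spec("juliacall") is not None


def resolve_julia_path(py_func, filename):
    """Resolve filename against the directory of py_func's source file."""
    return Path(inspect.getfile(py_func)).parent / filename


def jldispatch(file, function):
    def apply(f):
        if not julia_enabled():
            return f  # no Julia backend: keep the pure-Python function
        path = resolve_julia_path(f, file)

        @functools.lru_cache(maxsize=None)
        def load():
            from juliacall import Main as jl  # deferred: Julia starts on first call
            jl.include(str(path))             # define the Julia function in Main
            return jl.seval(function)         # fetch it by name

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            return load()(*args, **kwargs)
        return wrapper
    return apply
```

Deferring the load keeps "import mypackage" fast even when the Julia backend is present; the cost is paid on the first call to a dispatched function.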

Perhaps this jldispatch could also take an extension argument that ties the juliapkg.is_installed() check to a particular Python extra. That way, only pip install "mypkg[julia]" would set up the Julia acceleration.

@cjdoris what do you think of these?