MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://astroautomata.com/PySR
Apache License 2.0

Allow user to specify full objective functions #276

Closed MilesCranmer closed 1 year ago

MilesCranmer commented 1 year ago

Right now, a user can only specify a custom elementwise loss function. This change allows the user to customize the entire objective function.

Here's a cool example that lets you fix the symbolic form of an expression as a rational function (i.e., $P(X)/Q(X)$ for polynomials $P$ and $Q$).

from pysr import PySRRegressor

objective = """
function my_custom_objective(tree, dataset::Dataset{T}, options) where {T<:Real}
    # Require root node to be binary, so we can split it,
    # otherwise return a large loss:
    tree.degree != 2 && return T(10000)

    left = tree.l
    right = tree.r

    # Evaluate numerator:
    l_prediction, l_flag = eval_tree_array(left, dataset.X, options)
    !l_flag && return T(10000)

    # Evaluate denominator:
    r_prediction, r_flag = eval_tree_array(right, dataset.X, options)
    !r_flag && return T(10000)

    # Impose functional form:
    prediction = l_prediction ./ r_prediction

    return sum((prediction .- dataset.y) .^ 2) / dataset.n
end
"""

model = PySRRegressor(
    binary_operators=["*", "+", "-"],
    full_objective=objective
)

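Note that the output equations will need to be parsed/interpreted manually, because the default printing scheme doesn't know about this custom symbolic manipulation. For example, the root node of ((x0 + 3.2) - 0.5) is -, so under this objective the printed equation actually stands for (x0 + 3.2) / 0.5: the root operator is ignored and the left subtree is divided by the right one.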
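As a rough illustration of that manual interpretation step, here is a minimal sketch that splits a printed equation at its root operator and rewrites it as a ratio. It is not part of PySR: the helper name `as_rational` is hypothetical, and it assumes the printed equation happens to be valid Python syntax (infix operators, parentheses, names like x0) so that the standard-library `ast` module can parse it. It needs Python 3.9+ for `ast.unparse`.

```python
import ast


def as_rational(equation: str) -> str:
    """Reinterpret a printed equation under the rational-form objective.

    The custom objective ignores the root operator and divides the left
    subtree by the right one, so the same split is applied to the string.
    `as_rational` is a hypothetical helper, not a PySR function.
    """
    # Assumption: the printed equation is valid Python expression syntax.
    root = ast.parse(equation, mode="eval").body
    if not isinstance(root, ast.BinOp):
        # Mirrors the objective, which rejects trees without a binary root.
        raise ValueError("root node is not a binary operator")
    numerator = ast.unparse(root.left)
    denominator = ast.unparse(root.right)
    return f"({numerator}) / ({denominator})"


print(as_rational("((x0 + 3.2) - 0.5)"))  # prints: (x0 + 3.2) / (0.5)
```

Working on the raw string rather than a simplified symbolic expression matters here: a simplifier would collapse (x0 + 3.2) - 0.5 into x0 + 2.7 and lose the root split that the objective relies on.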

MilesCranmer commented 1 year ago

It would be great if there were a way to pass custom constraints as well.

Also - since this is allowing more complex functionality to be passed, it will be important to have a way of indicating to SymbolicRegression.jl which additional functions need to be passed to worker nodes.

dominik-rehse commented 1 year ago

Dear @MilesCranmer,

Thank you for implementing this awesome feature, which I saw only now!

Would this - at least in principle - allow for symbolic constraints on the functional form (e.g., $\lim_{x\to\infty} f(x)=0$)? If so, could you provide any hints on how to implement something like that? I am not really familiar with Julia and could not immediately find more information on symbolic operations in the docs of SymbolicRegression.jl.

Thanks, Dominik

MilesCranmer commented 1 year ago

Great question! Moving to #324 (improves discoverability)