MilesCranmer / SymbolicRegression.jl

Distributed High-Performance Symbolic Regression in Julia
https://astroautomata.com/SymbolicRegression.jl/dev/
Apache License 2.0

Gradient/derivative implementation in loss functions #92

Open Jgmedina95 opened 2 years ago

Jgmedina95 commented 2 years ago

Hi! I'm trying to implement a loss function that takes into account the derivative or gradient of the expressions as a kind of regularization term. Right now this is not possible, since the custom loss only takes the label and the predicted value. After studying how the evaluation of an equation works and seeing the evalDiffTreeArray function, which explicitly calculates the derivative, I'm wondering if it can be done as follows.

I'm thinking of including a new Loss method that also receives the data:

function Loss(cX::AbstractMatrix{T}, x::AbstractArray{T}, y::AbstractArray{T}, options::Options{A,B,dA,dB,C}, index::Int)::T where {T<:Real,A,B,dA,dB,C<:Function}
    sum(options.loss.(cX, x, y)) / length(y)
end

and the new loss expression could then be written as, for example: loss(cX, x, y) = RMS + evalDiffTreeArray(y, cX, options, i).

I'm still struggling with how to include options in the function: it is needed to run the evaluation, but it can't be passed into the custom loss function as is.

There could be a different approach, and I'm open to ideas too.

P.S. This is different from the other issue about gradients in the loss function, as I'm not intending to compare against a known gradient, which would require changing the data structure.

MilesCranmer commented 2 years ago

Hi @Jgmedina95,

Thanks; this is a really interesting question. I think we can definitely get this to work.

First, note that function Loss is only ever called from function EvalLoss. So for a customized loss function (beyond the usual tweaks), you would really want to change EvalLoss.

EvalLoss is written as follows: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/7710b17f8efbaf52782b20309f49d25e44c3c645/src/LossFunctions.jl#L23-L35

For a single custom re-definition, you could simply tweak EvalLoss to your liking. For a more general approach, maybe a better strategy would be to add an option key to Options that allows one to pass a custom EvalLoss? This might look like:

# Evaluate the loss of a particular expression on the input dataset.
function EvalLoss(tree::Node, dataset::Dataset{T}, options::Options)::T where {T<:Real}
    if options.custom_loss_func !== nothing
        return options.custom_loss_func(tree, dataset, options)
    end

    (prediction, completion) = evalTreeArray(tree, dataset.X, options)
    if !completion
        return T(1000000000)
    end

    if dataset.weighted
        return Loss(prediction, dataset.y, dataset.weights, options)
    else
        return Loss(prediction, dataset.y, options)
    end
end

So, you would define a function which you would then pass in as:

function my_loss(tree, dataset::Dataset{T}, options) where {T}
    prediction, gradient, complete = evalGradTreeArray(tree, dataset.X, options, variable=true)
    (!complete) && return T(10000000)
    predictive_loss = sum(abs.(dataset.y .- prediction))
    return predictive_loss + otherfunction(gradient)
end

options = Options(custom_loss_func=my_loss)

What do you think?

Keep in mind for this to work you need to use multithreading=true, procs=0 when launching. It's doable to get it working with multiprocessing, but just more effort.
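For concreteness, a minimal sketch of such a launch (hedged: my_loss and the custom_loss_func option key are the hypothetical names proposed above, and the EquationSearch keyword names follow this comment; check your installed version's signature):

```julia
using SymbolicRegression

# Toy data: 5 features, 100 samples.
X = randn(Float32, 5, 100)
y = 2 .* cos.(X[4, :])

options = Options(;
    binary_operators=(+, -, *, /),
    unary_operators=(cos,),
    custom_loss_func=my_loss,  # hypothetical option key proposed above
)

# Threads rather than worker processes, so the custom closure
# does not need to be serialized to remote workers.
hall_of_fame = EquationSearch(X, y; niterations=10, options=options,
                              multithreading=true, procs=0)
```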

Best, Miles

Jgmedina95 commented 2 years ago

Hi @MilesCranmer, it's been a little while! I like your idea a lot; it looks simple to implement, although when I try to implement it I run into some errors:

Error in implementation: I changed the Options constructor and the Options struct,

function Options(;
    binary_operators::NTuple{nbin, Any}=(+, -, /, *),
    unary_operators::NTuple{nuna, Any}=(),
    constraints=nothing,
    loss=L2DistLoss(),
    custom_loss_func = nothing,

I made it the same type as loss,

    loss::C
    custom_loss_func::C
    progress::Bool

Based on the change in EvalLoss, it should run normally if I leave custom_loss_func=nothing, but when I try to run any example, without using the gradient, it throws the following error when I set the options:

julia> options = SymbolicRegression.Options(binary_operators=(+, -, /, *, pow_abs),
       unary_operators=(exp, cos),
       npopulations=20);
ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type L2DistLoss

Could it be that at some point in the code it confuses loss with custom_loss_func?

Comment on the proposed idea: Finally, my only comment would be that I would work with evalDiffTreeArray instead of the full gradient. If the gradient doesn't "complete" for, say, variable x1, but my interest is only in x2, the whole expression would be discarded. I haven't tried it yet, but thinking it through, complete=false seems more likely with evalGradTreeArray than with evalDiffTreeArray, since the latter lets me focus on only the variable(s) of interest.
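A per-variable version of the custom loss along these lines might look like the following sketch (hedged: it assumes evalDiffTreeArray takes a direction argument selecting the variable and returns (evaluation, derivative, complete), and it reuses the my_loss name from the proposal above):

```julia
# Penalize the derivative with respect to a single variable (here x2,
# i.e. direction=2), so an incomplete derivative in other variables
# does not discard the equation.
function my_loss(tree, dataset::Dataset{T}, options) where {T}
    prediction, complete = evalTreeArray(tree, dataset.X, options)
    !complete && return T(1000000000)

    _, derivative, d_complete = evalDiffTreeArray(tree, dataset.X, options, 2)
    !d_complete && return T(1000000000)

    predictive_loss = sum(abs2, dataset.y .- prediction) / dataset.n
    return predictive_loss + sum(abs, derivative) / dataset.n
end
```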

P.S. Sorry for the late response, it's been crazy times.

MilesCranmer commented 2 years ago

Ah, it looks like you are declaring custom_loss_func as the same type as loss (both have ::C in your example above). Since the default loss is L2DistLoss, it will try to assume that custom_loss_func is also L2DistLoss, hence the error!

I would make it a different type and it should work (e.g., ::D, and add D to the type parameters at the top where it says where {A,B,...}).

Alternatively, for testing purposes, you could simply declare custom_loss_func::Function, although it would be slightly slower, so for the final code you'd want to properly parametrize it. The ::C parameter lets Julia specialize the code on the specific input function (like L2DistLoss), whereas the abstract ::Function prevents the loss function from being compiled into the code.
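The difference can be seen in plain Julia, independent of SymbolicRegression internals (a minimal sketch; Specialized, Generic, and apply are illustrative names, not package API):

```julia
# A parametric field stores the function's concrete type, so Julia
# compiles a specialized method for each function passed in.
struct Specialized{F}
    f::F
end

# An abstract ::Function field erases the concrete type, so every call
# goes through dynamic dispatch and cannot be inlined.
struct Generic
    f::Function
end

apply(o, x) = o.f(x)

# Both give the same answer; only the compiled code differs.
apply(Specialized(abs2), 3)  # 9
apply(Generic(abs2), 3)      # 9
```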

MilesCranmer commented 1 year ago

Just to add: the PR #162 makes it easier to specify this.
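For later readers, a rough sketch of passing a full-expression objective through the newer API (hedged: assumes the loss_function option and the eval_tree_array name; check the current documentation for the exact interface):

```julia
using SymbolicRegression

# Custom objective over the whole expression: receives the tree,
# the dataset, and the options, and returns a scalar loss.
function my_loss(tree, dataset::Dataset{T}, options) where {T}
    prediction, complete = eval_tree_array(tree, dataset.X, options)
    !complete && return T(Inf)
    return sum(abs2, dataset.y .- prediction) / dataset.n
end

options = Options(; binary_operators=(+, -, *, /), loss_function=my_loss)
```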