heal-research / pyoperon

Python bindings and scikit-learn interface for the Operon library for symbolic regression.
MIT License

Segmentation fault when working with MultiEvaluator on Mac M1 #15

Closed (kdesoto-astro closed this 2 months ago)

kdesoto-astro commented 2 months ago

Hi,

I've been able to isolate a segmentation fault to the use of Operon.MultiEvaluator, specifically when you add more than one evaluator to the MultiEvaluator object.

Simple reproducible example (replacing the Operon.Evaluator() definition in example/operon-bindings.py):

evaluator = Operon.MultiEvaluator(problem)
for i in range(2): # works fine if changed to range(1)
    evaluator_i = Operon.Evaluator(problem, dtable, error_metric, True)  # initialize evaluator, use linear scaling = True
    evaluator_i.Budget = 1000 * 1000  # computational budget
    optimizer = Operon.LMOptimizer(dtable, problem, max_iter=3)
    evaluator_i.Optimizer = optimizer
    evaluator.Add(evaluator_i)

aggregateEvaluator = Operon.AggregateEvaluator(evaluator)
aggregateEvaluator.AggregateType = Operon.AggregateType.Max

# define how new offspring are created
generator = Operon.BasicOffspringGenerator(aggregateEvaluator, crossover, mutation, selector, selector)

I'm not sure whether this bug appears on Linux machines as well. The segfault still occurs when I skip aggregateEvaluator and pass evaluator directly to Operon.BasicOffspringGenerator in the last line, so the issue does not seem to be with AggregateEvaluator. I'm using Python 3.11 on a MacBook M1 Pro with Sonoma 14.5 Beta, and installed pyoperon using the git clone + pip instructions.

foolnotion commented 2 months ago

Hi,

This is a lifetime issue. You are creating the evaluator and optimizer instances inside the loop, so their lifetime is limited to the body of the loop. These objects will already be 'dead' at the time when the GP algorithm needs to use them.

One possible workaround is to extend their lifetime by putting them in a list:

instances_holder = []
for i in range(2):
    evaluator_i = Operon.Evaluator(problem, dtable, error_metric, True)  # initialize evaluator, use linear scaling = True
    evaluator_i.Budget = 1000 * 10000  # computational budget
    optimizer = Operon.LMOptimizer(dtable, problem, max_iter=3)
    evaluator_i.Optimizer = optimizer
    evaluator.Add(evaluator_i)
    instances_holder.append((evaluator_i, optimizer))
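
The lifetime problem described above can be illustrated without pyoperon. The sketch below is a pure-Python analogy, not the library's implementation: FakeOptimizer, FakeEvaluator, FakeMultiEvaluator, and build are hypothetical names, and weak references stand in for the assumption that the bindings keep only non-owning references to the objects passed to Add and assigned to Optimizer. It shows which objects survive when nothing is held, when only the evaluators are held, and when both evaluators and optimizers are held.

```python
import gc
import weakref


class FakeOptimizer:
    """Hypothetical stand-in for an optimizer object."""


class FakeEvaluator:
    """Hypothetical stand-in for an evaluator that does not own its optimizer."""

    def __init__(self):
        self._optimizer_ref = None

    @property
    def optimizer(self):
        # Returns None if no optimizer was set or if it has been collected.
        return self._optimizer_ref() if self._optimizer_ref is not None else None

    @optimizer.setter
    def optimizer(self, opt):
        # Assumption: only a non-owning reference is stored (modelled with weakref).
        self._optimizer_ref = weakref.ref(opt)


class FakeMultiEvaluator:
    """Hypothetical stand-in for a container that does not own its evaluators."""

    def __init__(self):
        self._refs = []

    def add(self, evaluator):
        self._refs.append(weakref.ref(evaluator))

    def alive(self):
        """Return one (evaluator_alive, optimizer_alive) pair per added evaluator."""
        gc.collect()  # make the result independent of when collection happens
        status = []
        for ref in self._refs:
            ev = ref()
            status.append((ev is not None, ev is not None and ev.optimizer is not None))
        return status


def build(n, keep=None):
    """Add n evaluators; keep is None, "evaluators", or "both"."""
    multi = FakeMultiEvaluator()
    holder = []
    ev = opt = None
    for _ in range(n):
        ev = FakeEvaluator()   # rebinding drops the previous iteration's evaluator
        opt = FakeOptimizer()  # rebinding drops the previous iteration's optimizer
        ev.optimizer = opt
        multi.add(ev)
        if keep == "evaluators":
            holder.append(ev)
        elif keep == "both":
            holder.append((ev, opt))
    # Returning (ev, opt) mimics module-level loop variables, which stay bound
    # to the objects of the last iteration after the loop ends.
    return multi, holder, (ev, opt)


multi, holder, last = build(2, keep="evaluators")
print(multi.alive())
```

In this analogy, build(2) leaves the first evaluator and optimizer dead, build(2, keep="evaluators") keeps the first evaluator but loses its optimizer, and build(2, keep="both") keeps everything alive. build(1) also keeps everything alive because the loop variables still reference the only pair, which would be consistent with range(1) not crashing. Note that the returned tuple has to be bound to names as shown; indexing the result directly would release the last pair as well.
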

folivetti commented 2 months ago

If you're still having trouble (and assuming you're testing multiview SR), have a look at:

https://github.com/erusseil/MvSR-analysis/blob/main/mvsr.py#L115
https://github.com/erusseil/MvSR-analysis/issues/5#issuecomment-1961015836

kdesoto-astro commented 2 months ago

Hi,

Thanks for the quick responses - the instances_holder solution worked! Previously I was just saving the evaluators to a list (following the mvsr.py script), but also saving the optimizers seems to have fixed the issue. Thank you so much!