MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://astroautomata.com/PySR
Apache License 2.0

[BUG]: EXCEPTION_ACCESS_VIOLATION during garbage collection in PySR #661

Open zzccchen opened 4 days ago

zzccchen commented 4 days ago

What happened?

The program crashed while using PySR, with an error message indicating a memory access violation (EXCEPTION_ACCESS_VIOLATION). This error occurred during the garbage collection process.

Version

v0.19.0

Operating System

Windows

Package Manager

pip

Interface

Script (i.e., python my_script.py)

Relevant log output

[ Info: Automatically setting `--heap-size-hint=2730M` on each Julia process. You can configure this with the `heap_size_hint_in_bytes` parameter.
[ Info: Importing SymbolicRegression on workers as well as extensions Bumper, LoopVectorization.
[ Info: Finished!
[ Info: Copying definition of loss_fnc to workers...
[ Info: Finished!
[ Info: Started!
32.1%┣████████████████████████████┫ 1.0k/3.2k [00:40<01:26, 25it/s]
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffa6106a6b0 -- gc_mark_outrefs at C:/workdir/src\gc.c:2527 [inlined]
gc_mark_and_steal at C:/workdir/src\gc.c:2746
in expression starting at none:0
gc_mark_outrefs at C:/workdir/src\gc.c:2527 [inlined]
gc_mark_and_steal at C:/workdir/src\gc.c:2746
gc_mark_loop_parallel at C:/workdir/src\gc.c:2885
jl_gc_mark_threadfun at C:/workdir/src\partr.c:142
uv__thread_start at /workspace/srcdir/libuv\src/win\thread.c:111
beginthreadex at C:\Windows\System32\msvcrt.dll (unknown line)
endthreadex at C:\Windows\System32\msvcrt.dll (unknown line)
BaseThreadInitThunk at C:\Windows\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\Windows\SYSTEM32\ntdll.dll (unknown line)
Allocations: 9815735891 (Pool: 9517376769; Big: 298359122); GC: 69400

Extra Info

turbo=True, bumper=True

MilesCranmer commented 4 days ago

Can you try with turbo=False, bumper=False? Those options are experimental and make PySR use bleeding-edge libraries. When they work, they are really fast, but they can also cause crashes (especially on Windows).
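Concretely, that would look something like this (just the two relevant keyword arguments; everything else in your PySRRegressor call stays the same):

from pysr import PySRRegressor

sr_model = PySRRegressor(
    # ... all your other options unchanged ...
    turbo=False,   # skip the LoopVectorization-based fast kernels
    bumper=False,  # skip the Bumper-based allocation optimization
)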

zzccchen commented 17 hours ago

Unfortunately, I tried the turbo=False, bumper=False parameters and the crash still occurred.

zzccchen commented 17 hours ago

Could the automatically set --heap-size-hint=2730M be causing this problem?

MilesCranmer commented 15 hours ago

Hm, can you show the rest of your code?

zzccchen commented 13 hours ago
from pysr import PySRRegressor

# data load code

X_123e = data_X_123e.to_numpy()
y_123e = data_y_123e.to_numpy()

sr_model = PySRRegressor(
    binary_operators=[
        "*",
        "+",
        "-",
        "/",
    ],
    unary_operators=["square", "cube", "exp", "log", "sqrt"],
    maxsize=80, 
    maxdepth=10,  
    niterations=100, 
    populations=32, 
    population_size=100, 
    ncycles_per_iteration=550, 
    constraints={
        "/": (-1, 9),
        "^": (-1, 5),
        "exp": 6,
        "square": 6,
        "cube": 6,
        "log": 6,
        "sqrt": 6,
        "abs": 9,
    },
    nested_constraints={
        "square": {"square": 0, "cube": 0, "exp": 1},
        "cube": {"square": 0, "cube": 0, "exp": 1},
        "exp": {"square": 0, "cube": 0, "exp": 0},
        "sqrt": {"sqrt": 0, "log": 0},
        "log": {"log": 0},
    },
    complexity_of_operators={
        "square": 2,
        "cube": 3,
        "exp": 3,
        "log": 3,
        "sqrt": 2,
    },
    complexity_of_constants=4,
    adaptive_parsimony_scaling=150.0,
    weight_add_node=0.79,
    weight_insert_node=5.1,
    weight_delete_node=1.7,
    weight_do_nothing=0.21,
    weight_mutate_constant=0.048,
    weight_mutate_operator=0.47,
    weight_swap_operands=0.1,
    weight_randomize=0.23,
    weight_simplify=0.5,
    weight_optimize=0.5,
    crossover_probability=0.066,
    perturbation_factor=0.076,
    cluster_manager=None,
    precision=32,
    turbo=True,
    bumper=True,
    progress=True,
    elementwise_loss="""
    function loss_fnc(prediction, target)
        percentage_error = abs((prediction - target) / target) * 100
        return percentage_error
    end
    """,
    multithreading=False,
    equation_file=symbol_regression_csv_path,
)

complexity_of_variables = [] # list of complexity
sr_model.fit(
    X_123e, y_123e, complexity_of_variables=complexity_of_variables
)

Here is the main code of the workflow.

zzccchen commented 13 hours ago

I also wrap the above code in a multi-layer loop to test different feature data sets and the stability of the symbolic regression results. A single loop iteration takes about 2.2 minutes. The program crashes after running for 3-4 hours, i.e. after roughly 80-110 rounds.
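Roughly, the outer loop looks like the following sketch (the feature subsets, repeat count, and sr_kwargs are placeholders rather than my real variables):

from pysr import PySRRegressor

feature_subsets = [["f1", "f2"], ["f1", "f3"]]  # placeholder feature subsets
n_repeats = 10                                  # placeholder repeat count

for cols in feature_subsets:                    # different feature data sets
    for repeat in range(n_repeats):             # stability repeats, ~2.2 min each
        X = data_X_123e[cols].to_numpy()
        y = data_y_123e.to_numpy()
        sr_model = PySRRegressor(**sr_kwargs)   # same settings as in the snippet above
        sr_model.fit(X, y, complexity_of_variables=complexity_of_variables)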

MilesCranmer commented 11 hours ago

That looks good. Great to see all those options being used! 🙂

(Random comment: your elementwise loss divides by the target, so make sure the target is > 0, otherwise one target will dominate. But I’m assuming you’re aware of that!)
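(For instance, a quick guard before fitting, where y_123e is just standing in for your target array:)

import numpy as np

# The custom loss computes abs((prediction - target) / target) * 100, so a
# target at or near zero will dominate the loss or blow it up entirely.
assert np.all(y_123e > 0), "targets must be strictly positive for this loss"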

Other comment: can you try with multithreading=True? With it set to False, and with procs>0 (the default), it will use multiple Julia processes. But if you use multithreading instead, it will start up much faster and hopefully be more stable. With multiprocessing, it launches new Julia processes every single time it searches. (This is a weakness in the current codebase; I would like to eventually store the processes within PySRRegressor so multiprocessing has a fast startup too.)

You can also set multithreading=False, procs=0 to use serial mode.
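To summarize, the three modes look roughly like this (only the relevant arguments shown):

from pysr import PySRRegressor

# Multithreading: one Julia process with many threads (fast startup):
model = PySRRegressor(multithreading=True)

# Multiprocessing (your current setup): multithreading=False with procs > 0
# launches separate Julia worker processes:
model = PySRRegressor(multithreading=False, procs=4)

# Serial mode: single process, single thread:
model = PySRRegressor(multithreading=False, procs=0)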

But it’s curious that it crashes. Since it runs for a few hours, did you notice anything else happening, like the memory usage gradually increasing over that time and not going down?

zzccchen commented 8 hours ago

If I use multithreading instead of multiprocessing, the search speed drops from 30 it/s to 7 it/s on my device, which is hard for me to accept. In addition, I have made sure that my y_true values are all greater than 0. Also, memory usage does not fluctuate when the program crashes; it occupies only about 30% of total memory.

MilesCranmer commented 19 minutes ago

Maybe try multithreading=True again, but this time, before loading PySR, set a larger thread count:

import os
os.environ["PYTHON_JULIACALL_THREADS"] = (num_cores) * 2

where num_cores is the number of CPU cores. The factor of 2 gives some redundancy, but you could try more or fewer threads depending on performance.

The default behavior of PySR is to start Julia with --threads='auto', which actually uses fewer threads than the number of available cores (so it doesn’t take over the whole CPU). But for higher performance you can increase this.
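So the full ordering would look something like this (using os.cpu_count() as one way to get the core count):

import os

# Must run before pysr (and therefore juliacall) is imported:
num_cores = os.cpu_count() or 1
os.environ["PYTHON_JULIACALL_THREADS"] = str(num_cores * 2)

from pysr import PySRRegressor

sr_model = PySRRegressor(multithreading=True)  # plus your other settings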

The full list of available juliacall environment variables is here: https://juliapy.github.io/PythonCall.jl/stable/juliacall/#julia-config