MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://astroautomata.com/PySR
Apache License 2.0

[BUG]: Can't pickle greater: attribute lookup greater on __main__ failed #588

Closed tbuckworth closed 6 months ago

tbuckworth commented 6 months ago

What happened?

model.fit fails with a pickle error when the binary operator "greater" is used.

Here is a minimal example:

import numpy as np
from pysr import PySRRegressor

if __name__ == "__main__":
    x = np.random.uniform(-1, 1, size=100).reshape((50, 2))
    y = x[:, 1] ** 2
    model = PySRRegressor(
        equation_file="symbreg/symbreg.csv",
        niterations=1,
        binary_operators=["greater"],
        elementwise_loss="loss(prediction, target) = (prediction - target)^2",
    )
    model.fit(x, y)

If I replace "greater" with "cond", no error is thrown. I've tried different datasets etc., but whenever "greater" is used in an equation, this error is thrown.

Python=3.8 pysr=0.17.2

Version

0.17.2

Operating System

Linux

Package Manager

pip

Interface

Script (i.e., python my_script.py)

Relevant log output

Compiling Julia backend...
[ Info: Started!
0.0%┣                                               ┫ 0/15 [00:00<00:00, -0s/it]
Expressions evaluated per second: [.....]. Head worker occupation: 0.0%
Press 'q' and then <enter> to stop execution early.
Hall of Fame:
---------------------------------------------------------------------------------------------------
Complexity  Loss       Score     Equation
1           8.819e-02  1.594e+01  y = 0.34869
---------------------------------------------------------------------------------------------------
20.0%┣█████████▏                                    ┫ 3/15 [00:00<00:01, 15it/s]
Expressions evaluated per second: [.....]. Head worker occupation: 58.2%. This is high, and will prevent efficient resource usage. Increase `ncyclesperiteration` to reduce load on head worker.
Press 'q' and then <enter> to stop execution early.
Hall of Fame:
---------------------------------------------------------------------------------------------------
Complexity  Loss       Score     Equation
1           8.819e-02  1.594e+01  y = 0.34869
9           6.944e-02  2.989e-02  y = greater(0.7405, greater(greater(0.7405, x₁), greater(-0.71...
                                  281, x₁)))
19          6.928e-02  2.238e-04  y = greater(greater(greater(greater(-0.40062, -1.2073), greate...
                                  r(x₀, x₀)), 0.7405), greater(greater(0.7405, greater(x₁, 0.695...
                                  09)), greater(-0.71281, x₁)))
---------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/titus/PycharmProjects/train-procgen-pytorch/venv/lib/python3.8/site-packages/pysr/sr.py", line 1112, in _checkpoint
    pkl.dump(self, f)
_pickle.PicklingError: Can't pickle greater: attribute lookup greater on __main__ failed

Extra Info

Someone here fixed a similar issue with this help:

The problem is that you're trying to pickle an object from the module where it's defined. If you move the Nation class into a separate file and import it into your script, then it should work.
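For illustration, here is a minimal standard-library sketch of this kind of failure. It does not reproduce PySR's internals. It only shows that pickle stores classes and functions by module and name, so anything that cannot be found under that name raises a PicklingError. The names can_pickle and orphan are hypothetical, and forcing __module__ to "__main__" is an assumption made to mirror the traceback above.

```python
import operator
import pickle


def can_pickle(obj) -> bool:
    """Return True if pickle.dumps(obj) succeeds, False if the lookup fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


# A class created at runtime whose __name__ is "greater", but which is never
# bound to a module-level attribute called "greater". (Illustrative only.)
orphan = type("greater", (), {})
orphan.__module__ = "__main__"  # assumption: mimic the module in the traceback

# pickle tries to resolve "__main__.greater", finds nothing, and gives up.
print(can_pickle(orphan))

# operator.gt is importable under its recorded module and name, so it works.
print(can_pickle(operator.gt))
```

The point of the contrast is that the object itself is not the problem; what matters is whether pickle can find it again by name.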

tbuckworth commented 6 months ago

I updated to pysr==0.18.1, but the problem persists.

MilesCranmer commented 6 months ago

That is weird. It seems like greater is missing its sympy mapping:

https://github.com/MilesCranmer/PySR/blob/09bfff6a4599244e2653cdebf70885c48ad9d864/pysr/export_sympy.py#L7-L57

So you could pass this to extra_sympy_mappings of the PySRRegressor, like

extra_sympy_mappings={"greater": lambda x, y: sympy.Piecewise((1.0, x > y), (0.0, True))}

but ideally we should have it built-in since greater is documented as an available operator.
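As a rough sketch, the mapping above can be checked on its own with SymPy before handing it to PySR. The helper name greater_to_sympy and the symbols x0 and x1 are hypothetical, chosen for this example. The PySRRegressor call is left commented out because it needs pysr and the Julia backend installed.

```python
import sympy


def greater_to_sympy(x, y):
    """SymPy stand-in for `greater`: 1.0 where x > y, otherwise 0.0."""
    return sympy.Piecewise((1.0, x > y), (0.0, True))


# Quick check of the mapping on symbolic inputs (symbol names are illustrative).
x0, x1 = sympy.symbols("x0 x1")
expr = greater_to_sympy(x0, x1)

value_when_greater = float(expr.subs({x0: 2, x1: 1}))
value_when_smaller = float(expr.subs({x0: 1, x1: 2}))
print(value_when_greater, value_when_smaller)

# Passing it to PySR, as suggested above (not run here):
# from pysr import PySRRegressor
# model = PySRRegressor(
#     binary_operators=["greater"],
#     extra_sympy_mappings={"greater": greater_to_sympy},
# )
```

A named function is used here instead of the lambda only so the mapping can be tested in isolation; the behavior is intended to match the lambda in the comment above.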

tbuckworth commented 6 months ago

Brilliant! That fixed it, thank you