cavalab / srbench

A living benchmark framework for symbolic regression
https://cavalab.org/srbench/
GNU General Public License v3.0

PySR parameters #84

Closed lacava closed 10 months ago

lacava commented 2 years ago

@MilesCranmer can you specify the set of 6 hyperparameters you would like to use for benchmarking PySR? I'm going to start the runs and am hoping to have these by the end of tomorrow. They should match the original constraints:

Hyperparameters are currently set to small values for testing, but evaluate_model.py now recognizes and shrinks some PySR parameters during testing. So the desired set should be specified directly in the model file. (In the updated version for the competition, you can specify test_params explicitly.)

This is the current version:

# Hyperparameter grid for tuning: each key maps to a tuple of candidate values;
# the commented-out tuples show alternative candidates.
hyper_params = [
    {
        "annealing": (True,), # (True, False)
        "denoise": (True,), # (True, False)
        "binary_operators": (["+", "-", "*", "/"],),
        "unary_operators": (
            [],
            # poly_basis,
            # poly_basis + trig_basis,
            # poly_basis + exp_basis,
        ),
        "populations": (20,), # (40, 80),
        "alpha": (1.0,),
        "model_selection": ("best",)
        # "alpha": (0.01, 0.1, 1.0, 10.0),
        # "model_selection": ("accuracy", "best"),
    }
]
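
For the test-mode shrinking mentioned above, here is a hedged sketch of what an explicit test_params entry in the model file might look like (the structure and the specific values are illustrative assumptions, not the settings srbench actually uses):

# Illustrative only: small settings so evaluate_model.py's test run stays fast.
test_params = {
    "niterations": 2,
    "populations": 5,
}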
MilesCranmer commented 2 years ago

Hi @lacava,

Thanks for trying to integrate PySR into the benchmark. However, I still need to add a mechanism in SymbolicRegression.jl for recording the # of evaluations. This is tricky because PySR uses BFGS as part of the scalar constant search, so I need to figure out how to extract the number of evaluations used by the third-party BFGS implementation. Also, more generally, I have never tried tuning for # of evaluations before, so I will need to see how it changes the optimal parameters compared to the current set, which was tuned for speed alone. For example, I expect things like algebraic simplification to be performed much more frequently, since simplification does not affect the # of evaluations.
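
For illustration, a minimal sketch of the counting idea in Python with SciPy (the real code would live in SymbolicRegression.jl, and every name below is hypothetical): wrap the loss function in a counter object so that calls made by a third-party BFGS routine are tallied together with the search's own evaluations.

# Sketch only: counting loss evaluations, including those triggered by a
# third-party BFGS optimizer. Not actual PySR / SymbolicRegression.jl code.
import numpy as np
from scipy.optimize import minimize

class CountedLoss:
    """Wrap a loss function and count every call to it."""
    def __init__(self, loss):
        self.loss = loss
        self.n_evals = 0

    def __call__(self, params):
        self.n_evals += 1
        return self.loss(params)

def example_loss(c):
    # Stand-in for the squared error of a candidate expression with constants c.
    return float(np.sum((c - 3.0) ** 2))

counted = CountedLoss(example_loss)
result = minimize(counted, x0=np.zeros(2), method="BFGS")
print(result.x, counted.n_evals)  # BFGS's calls are included in the tally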

I’m in job season until mid-April but will hopefully have time after that to make the required updates!

Thanks. Best, Miles

lacava commented 2 years ago

OK, sounds good. BTW, the competition will not have a max_evals limit, just a max time, as per #67. Hope to have a PySR submission there!

MilesCranmer commented 2 years ago

Submitted the updated regressor in #114 (which has a max_evals parameter). It is not the competition submission yet; I will get that done later this week.

(500k evaluations finish in about 2 seconds...)
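
As a hedged usage sketch (the cap and the operator list are illustrative choices; max_evals is the parameter name referenced above):

# Sketch: capping the search by number of evaluations rather than wall time.
from pysr import PySRRegressor

model = PySRRegressor(
    binary_operators=["+", "-", "*", "/"],
    max_evals=500_000,  # stop after roughly 500k expression evaluations
)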

MilesCranmer commented 2 years ago

Also - will the same model be used for all tests in one go, or will the file be restarted for each test? I ask because model.fit saves state (so that you can call model.fit repeatedly to get improved results), so model.reset() needs to be called if the same model object is reused.
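
A short sketch of the pattern being described (the datasets iterable is a placeholder, not part of the benchmark code):

# Sketch: reusing one regressor object across several datasets.
# fit() accumulates state, so reset() clears it before each new fit.
from pysr import PySRRegressor

model = PySRRegressor()
for X, y in datasets:  # placeholder iterable of (X, y) pairs
    model.reset()      # discard state saved by the previous fit
    model.fit(X, y)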

lacava commented 10 months ago

out of date