cavalab / srbench

A living benchmark framework for symbolic regression
https://cavalab.org/srbench/
GNU General Public License v3.0

add methods #4

Open lacava opened 5 years ago

lacava commented 5 years ago

Add SR methods for comparison. The following come to mind:

jmmcd commented 5 years ago

Another one that would be very nice is PGE (Worm & Chiu, GECCO 2013).

Paper: http://seminars.math.binghamton.edu/ComboSem/worm-chiu.pge_gecco2013.pdf Code: https://github.com/verdverm/pypge

lacava commented 5 years ago

Thanks, I'll reach out. It doesn't look like it's being maintained, though.

folivetti commented 3 years ago

if I may, my algorithm was just accepted for publication:

Paper: https://www.mitpressjournals.org/doi/abs/10.1162/evco_a_00285 Code: https://github.com/folivetti/ITEA

Even though the code is in Haskell, I have included a Python wrapper in my repository, similar to your wrappers. Let me know if I can be of any help!
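
In case it's useful, the general pattern is quite simple. Here's a rough sketch (not my actual wrapper; the binary name, flags, and output format are made up for illustration) of a scikit-learn-style estimator that shells out to a stack-built binary and parses the expression it prints:

```python
import subprocess
import tempfile

import numpy as np
import pandas as pd
import sympy as sp
from sklearn.base import BaseEstimator, RegressorMixin


class HaskellSRRegressor(BaseEstimator, RegressorMixin):
    """Illustrative wrapper: write the training data to CSV, call a compiled
    Haskell executable, and parse the expression it prints to stdout."""

    def __init__(self, binary="my-sr-exe", generations=100):
        self.binary = binary
        self.generations = generations

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        with tempfile.NamedTemporaryFile(suffix=".csv") as f:
            pd.DataFrame(np.column_stack([X, y])).to_csv(f.name, index=False)
            out = subprocess.run(
                [self.binary, "--train", f.name, "--gens", str(self.generations)],
                capture_output=True, text=True, check=True,
            )
        # e.g. "0.5*x0 + sin(x1)"
        self.expression_ = out.stdout.strip()
        return self

    def predict(self, X):
        X = np.asarray(X)
        syms = sp.symbols([f"x{i}" for i in range(X.shape[1])])
        f = sp.lambdify(syms, sp.sympify(self.expression_), "numpy")
        return f(*X.T)
```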

lacava commented 3 years ago

hi @folivetti, thanks for sharing. I'm going to be uploading a contributing guide soon that will detail how to include your method. Please stay tuned.

lacava commented 3 years ago

hi @folivetti, please see the contributing guide on the dev branch: https://github.com/EpistasisLab/regression-benchmark/blob/dev/CONTRIBUTING.md

Eventually this will be merged into master (still working on some hiccups with existing methods), but if you would like to start now, you can issue a PR to contribute your method to the dev branch. Let me know if you have any questions!
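
Roughly, the idea is to expose your method as a scikit-learn-compatible estimator plus a function that turns the fitted estimator into an expression string. The names below are only illustrative (with a plain linear model standing in for a real SR method); the guide is the authoritative reference:

```python
# illustrative sketch of a method wrapper module; see CONTRIBUTING.md on the
# dev branch for the actual required names and file layout
from sklearn.linear_model import LinearRegression  # stand-in for a real SR method

# the estimator instance the benchmark will tune and fit
est = LinearRegression()

def model(est):
    """Return the fitted model as a human-readable expression string."""
    terms = [f"{coef:.3f}*x{i}" for i, coef in enumerate(est.coef_)]
    return " + ".join(terms + [f"{est.intercept_:.3f}"])
```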

folivetti commented 3 years ago

Thanks! I guess my code is already halfway there. As soon as I get the time to do so, I'll make the PR.

folivetti commented 3 years ago

I finally took the time to implement the Python wrapper for ITEA. I have one final question: my code is written in Haskell and uses stack as the project manager. Should I include the installation of stack in the install script, or should I put this requirement in a README file?

To install the stack environment you only need to run `curl -sSL https://get.haskellstack.org/ | sh`, but it may require sudo permissions since it installs GMP.

lacava commented 3 years ago

Sounds great. The entire install needs to be automated, so yes, any installation requirement needs to be scripted. When the code is set and you issue a PR, the repo is set up to run tests that make sure the installation passes and a mini benchmark runs without error.
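
For example, your install script could look something like the sketch below (just an illustration; the exact file name and layout should follow the contributing guide, and I'm assuming `stack build` is how your project builds):

```sh
#!/usr/bin/env bash
# sketch: install Haskell Stack if missing (may need sudo, since it installs
# GMP), then build the method with stack
set -e
if ! command -v stack >/dev/null 2>&1; then
    curl -sSL https://get.haskellstack.org/ | sh
fi
stack build
```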

lacava commented 3 years ago

To install the stack environment you only need to run `curl -sSL https://get.haskellstack.org/ | sh`, but it may require sudo permissions since it installs GMP.

sudo is ok

also wanted to mention you can test the install locally, doing something like

./configure
./install 
cd experiment
python -m pytest -v 

also see the github workflow for more info

lacava commented 3 years ago

@folivetti hope you got my email, but just checking if you think you'll have time to get ITEA integrated this week? many thanks!

folivetti commented 3 years ago

@folivetti hope you got my email, but just checking if you think you'll have time to get ITEA integrated this week? many thanks!

yes, I did receive the e-mail, thanks :-) I have everything ready and should make the PR tomorrow. I'm just running some tests to double check that everything works. Thanks!

gkronber commented 3 years ago

I'm adding a few more methods for future reference.

While it would be nice to have a transparent and objective way to compare all of these methods, it will probably be impossible to have every SymReg method included in srbench, for various reasons (e.g. closed source, difficulty of providing a Python wrapper, methods tuned to work well only for certain problem characteristics, uncooperative authors, ...).

Researchers publishing SymReg methods should be made aware of srbench. I argue that, when reviewing or reading papers, we should be increasingly careful about new SymReg methods that are not included in srbench, even when they are published in reputable journals.

lacava commented 3 years ago

Thanks for the list @gkronber. Deep SR is implemented and I'm working on AI-Feynman.

MilesCranmer commented 2 years ago

Hi @lacava et al., thanks for making this benchmark suite, it looks great! I just found out about your efforts on this today, I think it is a great idea.

I would be interested in helping add my methods: the Julia library SymbolicRegression.jl (mentioned in @gkronber's post) and the Python frontend PySR which I actively maintain. Before I get started, just to check, would it be doable to include Julia as part of the benchmarking script?

Second, what kinds of resources are available for the benchmark? My library tends to find better results the longer it runs, and it can be parallelized across multiple nodes.

Third, my methods output a list of equations rather than a single one. Is there a way I can pass the entire list through, or should I choose one equation to pass?

Lastly, I was wondering about benchmark coverage: I have a "high-dimensional" SR method described a bit here (https://arxiv.org/abs/2006.11287) which is made for sequences, sets, and graphs. Is there a benchmark included here for high-dimensional SR?

Thanks! Miles

lacava commented 2 years ago

Hi @lacava et al., thanks for making this benchmark suite, it looks great! I just found out about your efforts on this today, I think it is a great idea.

Great! Thanks for reaching out.

I would be interested in helping add my methods: the Julia library SymbolicRegression.jl (mentioned in @gkronber's post) and the Python frontend PySR which I actively maintain. Before I get started, just to check, would it be doable to include Julia as part of the benchmarking script?

We should definitely be able to support Julia. It will be easiest if there is a conda dependency for it. But we also are moving towards a Docker environment eventually.
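
For what it's worth, Julia is packaged on conda-forge and PySR is on PyPI, so the dependency could be as simple as something like this (untested sketch; version pinning and any extra Julia package setup omitted):

```sh
conda install -c conda-forge julia
pip install pysr
```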

Second, what kinds of resources are available for the benchmark? My library tends to find better results the longer it runs, and it can be parallelized across multiple nodes.

In our current experiment (Table 2) we set the termination criteria to 500k evaluations per training run or 48 hours for the real-world datasets, and 1M evaluations or 8 hours for the synthetic ground-truth datasets.

Most of the methods here are parallelizable, but because we're running 252 datasets, 10 trials, and 21 methods, it made more sense to give each a single core. The cluster we used has ~1100 CPU cores.

Third, my methods output a list of equations rather than a single one. Is there a way I can pass the entire list through, or should I choose one equation to pass?

Only a single final model should be returned. Otherwise it wouldn't be a fair comparison since your method would have several chances to "win". (Incidentally, most of the GP-based SR methods also have a set of models, and use a hold-out set for final model selection. We could think about ways of comparing sets of equations in the future, but don't do so right now.)

Also, it would be ideal to return the equation string in a sympy-compatible format, to avoid a lot of the post-processing we needed in the last round.
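
For example (hypothetical method output, just to illustrate what "sympy-compatible" means): the returned string should parse cleanly with sympy, so method-specific syntax may need to be translated first:

```python
import sympy as sp

# hypothetical raw output from an SR method, using its own operator syntax
raw = "0.5*x0 + x1^2 - ln(x2)"

# translate method-specific syntax into something sympy can parse
expr_str = raw.replace("^", "**").replace("ln(", "log(")
expr = sp.sympify(expr_str)   # raises if the string is not sympy-compatible
print(sp.simplify(expr))      # 0.5*x0 + x1**2 - log(x2)
```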

Lastly, I was wondering about benchmark coverage: I have a "high-dimensional" SR method described a bit here (https://arxiv.org/abs/2006.11287) which is made for sequences, sets, and graphs. Is there a benchmark included here for high-dimensional SR?

Currently we've mostly looked at tabular data. Have a look at the datasets in PMLB; the widest datasets have ~100s of features. But we're always looking for good benchmark problems.

MilesCranmer commented 2 years ago

Thanks! This is very helpful.

I have a quick followup question about the suite. Are you benchmarking accuracy, or parsimony, or some combination? Or are you evaluating whether the recovered sympy expression is equivalent to the ground truth? PySR's default choice for "best" is similar to Eureqa's: it looks for "cliffs" in the accuracy-vs-parsimony curve.
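
For context, the "cliff" idea is roughly: score each step along the accuracy-vs-complexity front by how much the log-loss drops per unit of added complexity, and pick the equation after the largest drop. A simplified sketch (not PySR's exact criterion, and the numbers are made up):

```python
import numpy as np

# hypothetical Pareto front: (complexity, validation loss) per candidate equation
front = [(1, 10.0), (3, 9.5), (5, 1.2), (9, 1.1), (13, 1.05)]

# the biggest drop in log-loss per unit of added complexity marks the chosen equation
best, best_score = front[0], 0.0
for (c0, l0), (c1, l1) in zip(front, front[1:]):
    score = (np.log(l0) - np.log(l1)) / (c1 - c0)
    if score > best_score:
        best, best_score = (c1, l1), score

print(best)  # (5, 1.2): the big accuracy jump relative to the complexity increase
```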

Also - final question - can the model use a different set of hyperparameters for the noisy vs non-noisy dataset? (e.g., to simulate whether the experimenter would know a priori if their data was noisy).

Thanks again, Miles

lacava commented 2 years ago

Are you benchmarking accuracy, or parsimony, or some combination? Or are you evaluating whether the recovered sympy expression is equivalent to the ground truth? PySR's default choice for "best" is similar to Eureqa's: it looks for "cliffs" in the accuracy-vs-parsimony curve.

It's probably worth checking the details in the paper. We broke the comparison into real-world/"black-box" problems with no known model, and ground-truth problems generated from known functions. We benchmark accuracy and parsimony in the former case, and symbolic equivalence (within a linear transformation of the true model) in the latter.
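
Conceptually, the ground-truth check behaves roughly like this (a simplified sketch, not the benchmark's actual code; see the paper for the exact criterion):

```python
import sympy as sp

x0 = sp.symbols("x0")

def symbolically_equivalent(candidate, truth):
    """Sketch: accept a candidate that differs from the true model only by a
    constant offset or a constant rescaling."""
    diff = sp.simplify(candidate - truth)
    ratio = sp.simplify(candidate / truth)
    return bool(diff.is_constant()) or bool(ratio.is_constant())

print(symbolically_equivalent(sp.sin(x0) + 3, sp.sin(x0)))  # True: offset only
print(symbolically_equivalent(2 * sp.exp(x0), sp.exp(x0)))  # True: rescaling only
print(symbolically_equivalent(sp.cos(x0), sp.sin(x0)))      # False
```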

Also - final question - can the model use a different set of hyperparameters for the noisy vs non-noisy dataset? (e.g., to simulate whether the experimenter would know a priori if their data was noisy).

We don't support this at the moment, but most of the benchmarks have some amount of noise. One of our study findings was that AI-Feynman was particularly sensitive to target label noise.

MilesCranmer commented 2 years ago

Added PySR and SymbolicRegression.jl in this PR: #62. Let me know what else I need to add, thanks!

fnpdaml commented 1 year ago

Add SR methods for comparison. The following come to mind:

Also please consider "TuringBot" (https://turingbotsoftware.com/). The free version is limited to a maximum of 50 rows of input data and 3 variables; nevertheless, it has had the best success ratio in my personal, empirical usage.

from the documentation: "TuringBot is also a console application that can be executed in a fully automated and customizable way" https://turingbotsoftware.com/documentation.html#command-line

Again, I have no relation to the authors and/or copyright holders. Cheers.