MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia
https://astroautomata.com/PySR
Apache License 2.0
2.09k stars 197 forks

Benchmark / GPlearn #39

Open eladmw opened 3 years ago

eladmw commented 3 years ago

Can you do a basic comparison between this and gplearn with regard to speed and flexibility?

MilesCranmer commented 3 years ago

Great idea. Any specific tests you would be interested in?

I actually used to use gplearn, but its pure-Python/numpy performance was never enough for the problems I work on in my research. The whole reason I wrote PySR was to get the SR performance I need for my projects. (Eureqa is still the fastest GA-based option out there, but it's now commercial and online-only, with no Python API, which makes experiments hard to run, so I've stopped using it in favor of PySR.)

I think a lot of the difference between the packages comes from the fact that DEAP (the backend of gplearn) is pure Python, whereas the entire search here is compiled end-to-end and asynchronously distributed. There are also a few other optimizations here that are specific to symbolic regression (like the constant tuning), which I'm not sure are available in DEAP.

I can start by putting this in the README (or a separate repo for benchmarks?), but I'd also eventually like to write this up somewhere!

Cheers, Miles

eladmw commented 3 years ago

For a benchmark, I would say: speed, performance, features, and simplicity.
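As a rough illustration of the first two criteria (speed and accuracy), a timing-and-error harness might look like the sketch below. Everything in it is an assumption made for illustration: the synthetic target, the `benchmark` and `fit_predict` names, and the `mean_baseline` placeholder are hypothetical and are not part of PySR or gplearn. It uses only the Python standard library, so the actual regressors are left out.

```python
import math
import random
import time
from statistics import mean


def make_dataset(n=200, seed=0):
    """Synthetic data for y = 2.5 * x0**2 + cos(x1) (an arbitrary target chosen for illustration)."""
    rng = random.Random(seed)
    X = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(n)]
    y = [2.5 * x0 ** 2 + math.cos(x1) for x0, x1 in X]
    return X, y


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))


def benchmark(name, fit_predict, X_train, y_train, X_test, y_test):
    """Time one fit-and-predict call and score it on held-out data."""
    start = time.perf_counter()
    y_pred = fit_predict(X_train, y_train, X_test)
    elapsed = time.perf_counter() - start
    return {"name": name, "seconds": elapsed, "test_mse": mse(y_test, y_pred)}


def mean_baseline(X_train, y_train, X_test):
    """Placeholder 'model' (hypothetical): always predicts the training mean."""
    mu = mean(y_train)
    return [mu for _ in X_test]


X, y = make_dataset()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

# Each entry is (label, callable); real regressors would be added here.
candidates = [("mean_baseline", mean_baseline)]
results = [
    benchmark(name, fn, X_train, y_train, X_test, y_test)
    for name, fn in candidates
]
for row in results:
    print(f"{row['name']}: {row['seconds']:.4f}s, test MSE = {row['test_mse']:.3f}")
```

To compare the two packages, each one could be wrapped in its own `fit_predict` function and appended to `candidates`. Both `pysr.PySRRegressor` and `gplearn.genetic.SymbolicRegressor` follow the scikit-learn `fit`/`predict` convention, so the wrappers should be short. A fair comparison would presumably also need the same operator set and a comparable time budget for both. Features and simplicity are not captured by a harness like this.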

LucasPa commented 1 month ago

FYI, I've been using both GPLearn and PySR, and for the cases I considered PySR was both faster and more accurate.

I also found this paper, which compares PySR, GPLearn, and other methods.

MilesCranmer commented 1 month ago

This issue is from 2021, so just to mention that nowadays there’s also the PySR paper itself: https://arxiv.org/abs/2305.01582, which does a bunch of benchmarking and other comparisons.

yfflood commented 20 hours ago

I found a paper on benchmarking SR methods, along with the corresponding repo, SRBench. Maybe this will help with a comprehensive evaluation :)

MilesCranmer commented 6 hours ago

Sadly that paper doesn’t include PySR in the benchmark 😞