eladmw opened this issue 3 years ago
Great idea. Any specific tests you would be interested in?
I actually used to use gplearn, but the pure-Python/numpy performance was never enough for any of the problems I work on in my research. The whole reason I wrote PySR was so I could get the SR performance I need for my projects. (Eureqa is still the fastest GA-based option out there, but it's commercialized and online-only now, and it has no Python API, which makes experiments hard to run, so I've actually stopped using it in favor of PySR.)
I think a lot of the difference between the packages comes from the fact that DEAP (the backend of gplearn) is pure Python, whereas the entire search here is compiled end-to-end and asynchronously distributed. There are also a few other optimizations introduced here that are specific to symbolic regression (like the constant tuning), which I'm not sure are available in DEAP.
I can start by putting this in the README (or a separate repo for benchmarks?), but I'd also eventually like to write this up somewhere!
Cheers, Miles
For a benchmark, I would suggest comparing: speed, performance, features, and simplicity.
FYI, I've been using both GPLearn and PySR, and for the cases I considered, PySR was both faster and more accurate.
I also found this paper, which compares PySR, GPLearn, and other methods.
This issue is from 2021, so just to mention that nowadays there's also the PySR paper itself: https://arxiv.org/abs/2305.01582, which does a bunch of benchmarking and other comparisons.
Sadly that paper doesn’t include PySR in the benchmark 😞
Can you do a basic comparison between this and gplearn with regards to speed and flexibility?
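As a rough starting point for the speed part of such a comparison, one could wrap each library in a small timing harness. Both gplearn's `SymbolicRegressor` and PySR's `PySRRegressor` follow the scikit-learn `fit`/`predict` convention, so either could be passed to the sketch below; the `DummyRegressor` here is just a stand-in so the file runs on its own without either package installed:

```python
# Sketch of a speed/accuracy harness for comparing symbolic regressors.
# DummyRegressor is a hypothetical placeholder; in a real benchmark one
# would pass gplearn.genetic.SymbolicRegressor or pysr.PySRRegressor,
# both of which expose the same scikit-learn-style fit/predict API.
import time


class DummyRegressor:
    """Placeholder model: predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


def benchmark(model, X, y):
    """Return (fit_seconds, train_mse) for one fit of `model`."""
    start = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - start
    preds = model.predict(X)
    mse = sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)
    return elapsed, mse


if __name__ == "__main__":
    # Toy target: y = 2x + 1 on a small grid.
    X = [[i] for i in range(100)]
    y = [2 * xi[0] + 1 for xi in X]
    seconds, mse = benchmark(DummyRegressor(), X, y)
    print(f"fit took {seconds:.4f}s, train MSE = {mse:.2f}")
```

For a fair comparison one would of course use held-out test error, repeat over several random seeds, and match population sizes and iteration budgets between the two libraries as closely as their settings allow.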