axkr / symja_android_library

:coffee: Symja - computer algebra language & symbolic math library. A collection of popular algorithms implemented in pure Java.
https://matheclipse.org/
GNU General Public License v3.0

Symbolic Regression (based on pysr library) #850

Open InfinityPlusPlus opened 7 months ago

InfinityPlusPlus commented 7 months ago

Is your feature request related to a problem? Please describe. It's not related to any problem. It's just a suggestion to add symbolic regression, i.e. fitting x-y values symbolically.

Describe the solution you'd like A Java alternative to the PySR library: https://github.com/MilesCranmer/PySR

If necessary discuss the feature in the Discord Symja chat.

ogeagla commented 4 months ago

This isn't quite a library (yet), but I think this symbolic regression tool I built for fun has some overlap with this issue. It relies heavily on Symja. Also, it's not exactly a "Java alternative": it's written in Clojure. Though if I package it as a library, any JVM language should be able to use it as a dependency. https://github.com/ogeagla/closyr

axkr commented 4 months ago

> This isn't quite a library (yet), but I think this symbolic regression tool I built for fun has some overlap with this issue. It heavily relies on Symja. Also, it's not exactly a "Java alternative", it's written Clojure. Though if I package it as a library, any JVM language should potentially be able to use it as a dependency. https://github.com/ogeagla/closyr

Can you tell us more about your library? I assume it implements something like FindFormula in Mathematica, and not something like the FindFit function, where the parameters of a given function are determined.

It would be nice if we could extract a FindFormula function from your library to avoid the overhead of the full Clojure libraries.

ogeagla commented 4 months ago

Thanks for the great feedback!

Context: I find symbolic regression personally interesting. While building this and reading the docs in this repo, I saw this ticket and thought I'd mention it. I don't have any particular objectives for this work besides learning.

You are correct that the linked project provides functionality matching FindFormula, though it would not be a huge lift to also provide FindFit (essentially hold the GA candidate population to the functional form provided by user, and use mutations that only modify the params, not the functional form).
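To make the FindFit idea above concrete, here is a minimal sketch of a GA whose candidates all share one user-supplied functional form, so that mutation only perturbs the parameter vector. Everything in it is an assumption for illustration: the `ParamOnlyGa` class, the `Model` interface, the population size, and the decaying mutation width are hypothetical and are not taken from closyr or Symja.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

/** Hypothetical sketch: the functional form is fixed, only the parameters evolve. */
class ParamOnlyGa {

  /** Fixed model form f(x; p) supplied by the user (hypothetical interface). */
  interface Model {
    double eval(double x, double[] p);
  }

  /** Mean squared error of the model with parameters p on the data. */
  static double mse(Model m, double[] p, double[] xs, double[] ys) {
    double sum = 0.0;
    for (int i = 0; i < xs.length; i++) {
      double d = m.eval(xs[i], p) - ys[i];
      sum += d * d;
    }
    return sum / xs.length;
  }

  /** Evolves parameter vectors only; the form of m is never modified. */
  static double[] fit(Model m, int nParams, double[] xs, double[] ys, long seed) {
    Random rnd = new Random(seed);
    int popSize = 40;
    double[][] pop = new double[popSize][nParams];
    for (double[] p : pop) {
      for (int j = 0; j < nParams; j++) p[j] = rnd.nextGaussian();
    }
    Comparator<double[]> byError = Comparator.comparingDouble(p -> mse(m, p, xs, ys));
    double sigma = 1.0; // mutation width, shrunk a little every generation
    for (int gen = 0; gen < 300; gen++) {
      Arrays.sort(pop, byError); // best candidates first
      // Keep the better half; overwrite the worse half with mutated copies.
      for (int i = popSize / 2; i < popSize; i++) {
        double[] child = pop[i - popSize / 2].clone();
        for (int j = 0; j < nParams; j++) child[j] += sigma * rnd.nextGaussian();
        pop[i] = child;
      }
      sigma *= 0.97;
    }
    Arrays.sort(pop, byError);
    return pop[0];
  }

  public static void main(String[] args) {
    double[] xs = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    double[] ys = new double[xs.length];
    for (int i = 0; i < xs.length; i++) ys[i] = 2 * xs[i] + 1;
    Model line = (x, p) -> p[0] * x + p[1]; // user-chosen form a*x + b
    double[] best = fit(line, 2, xs, ys, 42L);
    System.out.println(Arrays.toString(best) + " mse=" + mse(line, best, xs, ys));
  }
}
```

Because the better half of the population is always carried over unchanged, the best error never gets worse from one generation to the next. A real implementation would probably want crossover and a proper stopping criterion.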

I think it'd be fun to try to extract a FindFormula function. Would I be right in assuming that you'd consider a PR only if the dependency was pure Java? What if the library internals were in Clojure but exposed a Java API? There would still be the overhead of packaging the Clojure runtime in the dependency, but I wasn't sure how much that matters here. Totally understand if you wouldn't want a couple of MB plus an entire runtime just for one math function, though.

axkr commented 4 months ago

> Would I be right in assuming that you'd consider a PR only if the dependency was pure Java? What if the library internals were in Clojure but exposed a Java API? There would still be an overhead of packaging the Clojure runtime in the dependency, but wasn't sure how much that matters here. Totally understand if you wouldn't want the couple MB + bringing in an entire runtime just for one math function, though.

Yes, a Java implementation would be preferred. Should I add a template for implementing FindFormula? In a PR, you could first set up a call to the Clojure implementation and then port that to Java.

ogeagla commented 4 months ago

A template would be awesome, and your approach of Clojure -> Java path sounds good.

Was looking through the impl of FindFit to get a sense of how best to do this. Analogous to how FindFit uses LM, are you open to FindFormula using a Genetic Algorithm? I'd assume yes, as that is a somewhat standard approach, but wanted to run that by you. My implementation uses GAs.
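For readers unfamiliar with the approach, the following is a rough sketch of what a GA-based formula search can look like: candidates are small expression trees over `x`, fitness is the mean squared error on the data, and mutation replaces a random subtree. It is not the closyr implementation and not the Symja template; the `TinySymbolicRegression` class, its operator set (`+`, `-`, `*`), the integer constants, and all tuning numbers are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of GA-style symbolic regression over tiny expression trees. */
class TinySymbolicRegression {

  /** Expression node: 'x' = variable, 'c' = constant, otherwise a binary op (+, -, *). */
  static final class Expr {
    final char op;
    final double value; // used only when op == 'c'
    final Expr left, right;

    Expr(char op, double value, Expr left, Expr right) {
      this.op = op;
      this.value = value;
      this.left = left;
      this.right = right;
    }

    double eval(double x) {
      switch (op) {
        case 'x': return x;
        case 'c': return value;
        case '+': return left.eval(x) + right.eval(x);
        case '-': return left.eval(x) - right.eval(x);
        default:  return left.eval(x) * right.eval(x);
      }
    }

    @Override
    public String toString() {
      if (op == 'x') return "x";
      if (op == 'c') return Double.toString(value);
      return "(" + left + " " + op + " " + right + ")";
    }
  }

  /** Builds a random tree of at most the given depth. */
  static Expr random(Random rnd, int depth) {
    if (depth == 0 || rnd.nextInt(3) == 0) {
      return rnd.nextBoolean()
          ? new Expr('x', 0, null, null)
          : new Expr('c', rnd.nextInt(5), null, null);
    }
    char op = "+-*".charAt(rnd.nextInt(3));
    return new Expr(op, 0, random(rnd, depth - 1), random(rnd, depth - 1));
  }

  /** Returns a copy of e in which one randomly chosen subtree is replaced. */
  static Expr mutate(Expr e, Random rnd, int depth) {
    if (e.left == null || depth == 0 || rnd.nextInt(3) == 0) return random(rnd, depth);
    return rnd.nextBoolean()
        ? new Expr(e.op, 0, mutate(e.left, rnd, depth - 1), e.right)
        : new Expr(e.op, 0, e.left, mutate(e.right, rnd, depth - 1));
  }

  /** Fitness: mean squared error of the expression on the data (lower is better). */
  static double mse(Expr e, double[] xs, double[] ys) {
    double sum = 0.0;
    for (int i = 0; i < xs.length; i++) {
      double d = e.eval(xs[i]) - ys[i];
      sum += d * d;
    }
    return sum / xs.length;
  }

  /** Keeps the better half each generation and refills the rest with mutants. */
  static Expr search(double[] xs, double[] ys, long seed) {
    Random rnd = new Random(seed);
    int popSize = 200, maxDepth = 3;
    List<Expr> pop = new ArrayList<>();
    for (int i = 0; i < popSize; i++) pop.add(random(rnd, maxDepth));
    Comparator<Expr> byError = Comparator.comparingDouble(e -> mse(e, xs, ys));
    for (int gen = 0; gen < 50; gen++) {
      pop.sort(byError);
      for (int i = popSize / 2; i < popSize; i++) {
        pop.set(i, mutate(pop.get(i - popSize / 2), rnd, maxDepth));
      }
    }
    pop.sort(byError);
    return pop.get(0);
  }

  public static void main(String[] args) {
    double[] xs = {-3, -2, -1, 0, 1, 2, 3};
    double[] ys = new double[xs.length];
    for (int i = 0; i < xs.length; i++) ys[i] = xs[i] * xs[i] + 1;
    Expr best = search(xs, ys, 42L);
    System.out.println(best + "  mse=" + mse(best, xs, ys));
  }
}
```

Unlike the LM-based FindFit, nothing here guarantees convergence to the generating formula. The search only returns the best candidate it happened to visit, which is why a real implementation would add crossover, simplification, and a complexity penalty.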

Another consideration for me is the execution context. This problem is embarrassingly parallel, and my implementation leans into that to get reasonable performance. Does Symja have a thread pool or executor service that can be provided to math functions' internals? If there's an example in the codebase already, that's all I'd need to get started on that aspect.

axkr commented 4 months ago

In c9a2485 I prepared a template for FindFormula.

A Genetic Algorithm is fine to start with. If needed, we can implement "Options" to control the algorithm.

Maybe we should also implement something like Parallelize? https://reference.wolfram.com/language/ref/Parallelize.html

But as a first step you can create an ExecutorService with java.util.concurrent.Executors.
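As an illustration of that first step, here is a minimal sketch of scoring a candidate population in parallel with an `ExecutorService` obtained from `java.util.concurrent.Executors`. The `ParallelFitness` class, the `scoreAll` method, and the use of `DoubleUnaryOperator` as the candidate type are hypothetical stand-ins; how candidates are actually represented inside Symja or closyr is not assumed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: evaluate each candidate's fitness as an independent task. */
class ParallelFitness {

  /** Mean squared error of candidate f on the data. */
  static double mse(DoubleUnaryOperator f, double[] xs, double[] ys) {
    double sum = 0.0;
    for (int i = 0; i < xs.length; i++) {
      double d = f.applyAsDouble(xs[i]) - ys[i];
      sum += d * d;
    }
    return sum / xs.length;
  }

  /** Scores every candidate on the pool; scores are returned in candidate order. */
  static double[] scoreAll(List<DoubleUnaryOperator> candidates, double[] xs, double[] ys) {
    ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    try {
      List<Callable<Double>> tasks = new ArrayList<>();
      for (DoubleUnaryOperator c : candidates) {
        tasks.add(() -> mse(c, xs, ys)); // tasks share only read-only data
      }
      // invokeAll blocks until all tasks finish and preserves task order.
      List<Future<Double>> futures = pool.invokeAll(tasks);
      double[] scores = new double[futures.size()];
      for (int i = 0; i < scores.length; i++) scores[i] = futures.get(i).get();
      return scores;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while scoring", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("candidate evaluation failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    double[] xs = {0, 1, 2};
    double[] ys = {1, 3, 5};
    List<DoubleUnaryOperator> candidates = new ArrayList<>();
    candidates.add(x -> x * x);
    candidates.add(x -> 2 * x + 1);
    for (double s : scoreAll(candidates, xs, ys)) System.out.println(s);
  }
}
```

The sketch creates and shuts down a pool on every call to stay self-contained. A long-running search would more likely create the pool once and reuse it across generations.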

ogeagla commented 4 months ago

Thanks, found the template, appreciate it. Will peek at Parallelize.

Where in the test path (matheclipse-core/src/test/java/) might I add a test for this?

axkr commented 4 months ago

See commit: