I was thinking about trying this method for a project, but the interface for running it (directories of text files and CSVs) seems pretty tricky to use and non-standard for Python users -- most people have their input data in NumPy arrays or pandas DataFrames, and don't want to parse through a complicated directory structure to inspect results.
It seems like it would be pretty straightforward, however, to create a `scikit-learn`-style wrapper class with a `fit(X, y)` method that:
- takes in `np.ndarray`s for the inputs and target (and optionally operations + variable names as `list`s or `tuple`s of strings)
- creates a directory with all of these variables saved in the current format `aifeynman` supports
- invokes `aifeynman.run_aifeynman` with the appropriate arguments + paths
- exposes one-line helper methods and properties (on an instance of the wrapper class) to load different discovered equations from the results directory in a convenient form (e.g. giving SymPy/string expressions for different results along the Pareto frontier, as well as easy ways to evaluate them on test data)
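To make the proposal concrete, here is a rough sketch of what such a wrapper might look like. Everything in it is hypothetical: `AIFeynmanRegressor`, its parameters, `result_lines()`, and the injectable `runner` argument are names I made up for illustration, not part of `aifeynman`. It assumes the data file is space-separated with the target in the last column, that `run_aifeynman` takes the positional and keyword arguments shown in the README example, and that results end up in `results/solution_<filename>` relative to the working directory. I haven't verified the last two against the source.

```python
import tempfile
from pathlib import Path


class AIFeynmanRegressor:
    """Hypothetical scikit-learn-style wrapper; NOT part of aifeynman."""

    def __init__(self, bf_try_time=60, bf_ops_file="14ops.txt",
                 polyfit_deg=3, nn_epochs=500,
                 workdir=None, results_dir="results", runner=None):
        self.bf_try_time = bf_try_time
        self.bf_ops_file = bf_ops_file
        self.polyfit_deg = polyfit_deg
        self.nn_epochs = nn_epochs
        self.workdir = workdir
        # Assumption: aifeynman writes its output under ./results
        self.results_dir = results_dir
        # Injectable so the wrapper can be exercised without aifeynman installed
        self.runner = runner

    def fit(self, X, y, filename="data.txt"):
        workdir = Path(self.workdir or tempfile.mkdtemp(prefix="aifeynman_"))
        workdir.mkdir(parents=True, exist_ok=True)

        # One sample per line: inputs first, target last, space-separated.
        # Only iterates rows, so np.ndarray and nested lists both work.
        with open(workdir / filename, "w") as f:
            for row, target in zip(X, y):
                values = [*row, target]
                f.write(" ".join(repr(float(v)) for v in values) + "\n")

        runner = self.runner
        if runner is None:
            import aifeynman  # imported lazily; only needed for a real run
            runner = aifeynman.run_aifeynman

        # Assumption: call shape taken from the README example, with the
        # directory passed as a string ending in "/".
        runner(str(workdir) + "/", filename, self.bf_try_time,
               self.bf_ops_file, polyfit_deg=self.polyfit_deg,
               NN_epochs=self.nn_epochs)

        self.data_path_ = workdir / filename
        # Assumption: the solution file is named solution_<filename>
        self.solution_path_ = Path(self.results_dir) / f"solution_{filename}"
        return self

    def result_lines(self):
        """Return the raw, non-empty lines of the solution file, if present."""
        if not self.solution_path_.exists():
            return []
        text = self.solution_path_.read_text()
        return [line.strip() for line in text.splitlines() if line.strip()]
```

Usage would then be something like `AIFeynmanRegressor(nn_epochs=500).fit(X, y).result_lines()`. Parsing each line into a SymPy expression and evaluating it on test data could be layered on top once the column layout of the solution file is confirmed. I left that out rather than guess at it.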
Overall, I think creating an easy-to-use API for this method would do a ton to increase its impact. I was able to get other libraries like PySR and gplearn running in minutes, but this feels like it would take a good bit longer to figure out / set up.