heal-research / pyoperon

Python bindings and scikit-learn interface for the Operon library for symbolic regression.
MIT License

Feature request: Callbacks #18

Open romanovzky opened 3 months ago

romanovzky commented 3 months ago

Since SymbolicRegressor is an iterative process, akin to online learning algorithms (neural nets, etc.), it would be useful to have callbacks that are invoked at different stages of a run. For example, a monitoring callback could print the generation number, the best objective values, etc. Another example would be an early-stopping callback that ends the run under certain conditions. Yet another would be better integration with optuna.
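To make the request concrete, here is a minimal sketch of the callback pattern described above, written against a toy search loop rather than pyoperon itself. Every name in it (`GenerationInfo`, `Monitor`, `EarlyStopping`, `run_search`) is hypothetical and only illustrates the shape such an API could take; none of it is part of the current bindings.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class GenerationInfo:
    """Hypothetical snapshot passed to each callback after a generation."""
    generation: int
    best_fitness: float  # lower is better in this sketch


# A callback returns True to request that the run stops.
Callback = Callable[[GenerationInfo], bool]


class Monitor:
    """Records (generation, best_fitness) pairs; never requests a stop."""

    def __init__(self) -> None:
        self.log: List[Tuple[int, float]] = []

    def __call__(self, info: GenerationInfo) -> bool:
        self.log.append((info.generation, info.best_fitness))
        return False


class EarlyStopping:
    """Requests a stop after `patience` generations without improvement."""

    def __init__(self, patience: int, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale = 0

    def __call__(self, info: GenerationInfo) -> bool:
        if info.best_fitness < self.best - self.min_delta:
            self.best = info.best_fitness
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def run_search(fitness_per_generation: Sequence[float],
               callbacks: Sequence[Callback]) -> int:
    """Toy stand-in for the evolutionary loop; returns the last generation run."""
    last = -1
    for generation, fitness in enumerate(fitness_per_generation):
        last = generation
        info = GenerationInfo(generation, fitness)
        # Build the full list first so every callback runs, even if one asks to stop.
        if any([cb(info) for cb in callbacks]):
            break
    return last


monitor = Monitor()
last_generation = run_search(
    [1.0, 0.5, 0.5, 0.5, 0.5, 0.1],
    [monitor, EarlyStopping(patience=2)],
)
```

In this sketch the stop decision is a plain boolean return value, so the loop stays in control of termination and the callbacks never need to raise exceptions across the language boundary.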

foolnotion commented 3 months ago

A general monitoring callback is already supported by the bindings: https://github.com/heal-research/pyoperon/blob/5ddcd9de3dc5859167463c4a90c675980c4c5347/example/operon-bindings.py#L105

But calling Python lambdas from C++ can be unreliable due to the GIL. We could expose this as callback=... in the SymbolicRegressor. More detailed callbacks would require changes to the C++ library, including the implementation of some termination operators.

Regarding Optuna, is this similar to what you had in mind? https://gist.github.com/foolnotion/226fe764b9af79f219d63c8c10b0d497

romanovzky commented 3 months ago

Indeed, exposing callbacks=... is the customary API design for ML libraries. See for example:

Other examples of ML libraries that use a callback API: xgboost, lightgbm.

All of these provide the functionality required to use callbacks with TensorBoard, a leading ML experiment tracker, and I imagine that the same could be done for MLflow; see for example how one can use Keras callbacks.

Regarding optuna: I was thinking about using callbacks to extend optuna's functionality with pyoperon. For example, optuna has integration callbacks that use pruners to kill unpromising trials. This is an example application, not strictly part of the requested feature.
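As a rough illustration of the pruning use case, the sketch below shows a per-generation callback that forwards the best fitness to a trial object and asks whether the trial should be pruned. It relies only on two methods that Optuna's `Trial` provides, `report(value, step)` and `should_prune()`. Everything else is an assumption made for the example: `PruningCallback`, `StubTrial`, and `evolve` are hypothetical, and the stub stands in for a real trial so the snippet runs without Optuna installed.

```python
from typing import List, Sequence, Tuple


class PruningCallback:
    """Hypothetical callback: reports progress and relays the pruner's decision."""

    def __init__(self, trial) -> None:
        # `trial` is expected to behave like optuna.trial.Trial
        # (only report() and should_prune() are used).
        self.trial = trial
        self.pruned = False

    def __call__(self, generation: int, best_fitness: float) -> bool:
        self.trial.report(best_fitness, generation)
        self.pruned = self.trial.should_prune()
        return self.pruned  # True asks the search loop to stop


class StubTrial:
    """Stand-in for an Optuna trial, for demonstration only.

    Prunes once `warmup` steps have passed and the latest value is
    still above `threshold`. A real pruner compares against other trials.
    """

    def __init__(self, threshold: float, warmup: int) -> None:
        self.threshold = threshold
        self.warmup = warmup
        self.reported: List[Tuple[int, float]] = []

    def report(self, value: float, step: int) -> None:
        self.reported.append((step, value))

    def should_prune(self) -> bool:
        step, value = self.reported[-1]
        return step >= self.warmup and value > self.threshold


def evolve(fitness_per_generation: Sequence[float], callback) -> int:
    """Toy search loop; returns the generation at which it stopped."""
    for generation, fitness in enumerate(fitness_per_generation):
        if callback(generation, fitness):
            return generation
    return len(fitness_per_generation) - 1


trial = StubTrial(threshold=0.3, warmup=2)
callback = PruningCallback(trial)
stopped_at = evolve([0.9, 0.8, 0.7, 0.6], callback)
```

Inside a real Optuna objective, one would pass the actual `trial` to the callback and raise `optuna.TrialPruned` after the fit if `callback.pruned` is set, so that the study records the trial as pruned.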