michelbierlaire / biogeme

Biogeme is an open source freeware designed for the maximum likelihood estimation of parametric models in general, with a special emphasis on discrete choice models.

Callback system to the optimisation #21

Open tcapelle opened 1 year ago

tcapelle commented 1 year ago

Hello Michel,

I am curious how easy it would be to add a callback system to the optimization in biogeme. The idea would be to be able to add experiment tracking like W&B, so we can keep track of the different experiments in a central place. Basically, you would need to be able to call:

PS: I am trying to convert my friend Ricardo, so he stops staring at the terminal logs...
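
A minimal sketch of what such a callback hook might look like, assuming a toy one-dimensional gradient descent rather than biogeme's actual optimizer. The names `minimize`, `callbacks` and `record` are hypothetical and are not part of biogeme or biogeme_optimization; an experiment tracker would be plugged in by passing its logging function as one of the callbacks.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical callback type: iteration number and a dict of metrics.
Callback = Callable[[int, Dict[str, float]], None]


def minimize(
    function: Callable[[float], float],
    gradient: Callable[[float], float],
    x0: float,
    step: float = 0.1,
    max_iter: int = 100,
    tol: float = 1e-6,
    callbacks: Optional[List[Callback]] = None,
) -> float:
    """Toy gradient descent that notifies every callback at each iteration."""
    x = x0
    for k in range(max_iter):
        g = gradient(x)
        metrics = {"x": x, "function": function(x), "gradient_norm": abs(g)}
        # The optimizer knows nothing about the tracker: it only calls hooks.
        for callback in callbacks or []:
            callback(k, metrics)
        if abs(g) < tol:
            break
        x -= step * g
    return x


history: List[Dict[str, float]] = []


def record(k: int, metrics: Dict[str, float]) -> None:
    """Example callback: keep every iteration in memory."""
    history.append({"iteration": k, **metrics})


solution = minimize(
    function=lambda x: (x - 3.0) ** 2,
    gradient=lambda x: 2.0 * (x - 3.0),
    x0=0.0,
    callbacks=[record],
)
```

The design choice here is that the optimizer only receives a list of callables, so a tracker dependency never has to be imported by the optimization code itself.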

michelbierlaire commented 5 months ago

Sorry. I did not follow up back then. Are you still interested in this?

tcapelle commented 4 months ago

I am! I need this to convert Ricardo =P. The idea is having a dashboard with the metrics and inputs and outputs of the experiments that looks like this:

[Image: Screenshot 2024-04-29 at 16 30 20]

I am happy to discuss this over a call if you want =)

tcapelle commented 4 months ago

For

/docs/examples/latentbis > python plot_m01_latent_variable.py 

Workspace view

[image]

Config and overview

[image] [image]

tcapelle commented 4 months ago

I would need to raise PRs on both biogeme + biogeme_optimization

michelbierlaire commented 4 months ago

Well, I need to understand better what you would need. Currently, the object that gathers and processes the estimation results, as well as the reports about the iterations, is "bioResults". Would it make sense to enrich this object to contain additional data about the running of the algorithm?

tcapelle commented 4 months ago

There are 2 things:

michelbierlaire commented 4 months ago

Most functions have a logger. They start with statements such as logger = logging.getLogger(__name__). See simple_bounds.py for instance. If you identify the quantities that are missing in the logger, and in bioResults, it should be easy to add them.
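
As an illustration of the logger route, here is a minimal sketch of a standard `logging.Handler` that forwards structured metrics to an arbitrary sink. It assumes that the optimization code attaches a dictionary to each record through the `extra` argument under a hypothetical key `metrics`; nothing in this thread shows biogeme doing so today, and the logger name `demo.optimization` is made up for the example.

```python
import logging
from typing import Any, Callable, Dict, List


class MetricsHandler(logging.Handler):
    """Forward the `metrics` dict attached to a log record to a sink."""

    def __init__(self, sink: Callable[[Dict[str, Any]], None]) -> None:
        super().__init__()
        self.sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        # Keys passed through `extra` become attributes of the record.
        metrics = getattr(record, "metrics", None)
        if isinstance(metrics, dict):
            self.sink(metrics)


received: List[Dict[str, Any]] = []

logger = logging.getLogger("demo.optimization")
logger.setLevel(logging.INFO)
logger.addHandler(MetricsHandler(received.append))

# A record carrying metrics is forwarded; a plain message is ignored.
logger.info("iteration 0", extra={"metrics": {"relgrad": 0.5, "radius": 1.0}})
logger.info("plain message without metrics")
```

With W&B, the sink could be `wandb.log`, which accepts a dictionary of metrics, so the handler would be the only place where the tracker is referenced.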

tcapelle commented 4 months ago

Thanks, yeah, that's what I actually did for that example. I redefined logmessage:

        def logmessage() -> None:
            """Send messages to the logger and stream the metrics to W&B"""
            # Local import: wandb is only needed when this function is called
            import wandb

            # Row written to the text log: iteration number first
            values_to_report = [k]
            if variable_names is not None:
                values_to_report += list(iterate)
            values_to_report += [
                current_function.function,
                the_function.relative_gradient_norm,
                float(radius),
                rho,
                status,
            ]
            # Same quantities, keyed by name, sent to W&B at every iteration
            wandb.log({"current_function": current_function.function,
                       "relgrad": the_function.relative_gradient_norm,
                       "radius": float(radius),
                       "rho": rho})
            logger.info(the_formatter.formatted_row(values_to_report))

To stream the metrics in real time, we need to call wandb.log when each metric is computed, in this case the fitting metrics. If values_to_report were a dictionary, the logging would be extremely simple: wandb.log(values_to_report)
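
A minimal sketch of that dictionary-based variant, under stated assumptions: `report_iteration`, `format_row` and `sinks` are hypothetical names that do not exist in biogeme_optimization, `format_row` is only a stand-in for `the_formatter.formatted_row`, and the numbers are illustrative. The same dictionary feeds both the text log and any metric sink such as `wandb.log`.

```python
import logging
from typing import Callable, Dict, Iterable, List, Union

Value = Union[int, float, str]
logger = logging.getLogger(__name__)


def format_row(values: Dict[str, Value]) -> str:
    """Stand-in formatter: render the values as one aligned text row."""
    cells = []
    for v in values.values():
        cells.append(f"{v:>12.4g}" if isinstance(v, float) else f"{v!s:>12}")
    return " ".join(cells)


def report_iteration(
    values: Dict[str, Value],
    sinks: Iterable[Callable[[Dict[str, Value]], None]] = (),
) -> str:
    """Log one iteration as text and pass the same dict to every sink."""
    row = format_row(values)
    logger.info(row)
    for sink in sinks:
        sink(dict(values))  # copy, so that a sink cannot alter the report
    return row


captured: List[Dict[str, Value]] = []
row = report_iteration(
    {
        "iteration": 3,
        "current_function": -5331.25,
        "relgrad": 0.0021,
        "radius": 1.0,
        "rho": 0.9,
        "status": "++",
    },
    sinks=[captured.append],  # with W&B this could be [wandb.log]
)
```

Because the keys travel with the values, the column names of the text log and the metric names on the dashboard cannot drift apart.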