@lawrence-chillrud I think your solution of dealing with the experiment files, rather than with the object returned by `tuner.experiment`, is preferable. The files are persistent and can be revisited at any time, whereas the object is transient.
I did some investigation and wanted to mention that it's also possible to achieve this using the object. The `scope="all"` parameter of `ray.tune.ResultGrid.get_best_result` returns the single best epoch across all trials, not just the last epoch's result. It is also possible to iterate over the `Result`s in the `ResultGrid` object, calling `Result.best_checkpoints` on each to build a longer list (see the sketch below). Makes me feel a little better about Ray.
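A minimal sketch of that object-based route, assuming Ray 2.x, a `Tuner` already configured elsewhere, and a placeholder metric name (`balanced_accuracy`); the `scope="all"` keyword is used as described above:

```python
# `tuner` is the ray.tune.Tuner configured elsewhere for the experiment.
results = tuner.fit()  # ResultGrid (in-memory, transient)

# Single best epoch across all trials, not just each trial's last report.
best = results.get_best_result(metric="balanced_accuracy", mode="max", scope="all")
print(best.metrics["balanced_accuracy"], best.checkpoint)

# Or walk every trial's retained checkpoints to build a longer list.
all_best = []
for result in results:
    # Result.best_checkpoints is a list of (Checkpoint, metrics-dict) pairs;
    # it is only populated when CheckpointConfig keeps scored checkpoints.
    all_best.extend(result.best_checkpoints)

all_best.sort(key=lambda pair: pair[1].get("balanced_accuracy", float("-inf")),
              reverse=True)
```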
Should the outputs include the checkpoint directories? This information is not contained in the `results.json` files, but it can be inferred from the sorting index where you call `final_df.sort_values`.
Related to this, I don't see a way to report the best epoch of a trial in the reporter. I think it's hardwired to report the current trial state for running trials and the last epoch for completed trials. For the MNIST example, the last epoch is the best for most trials because of the stopping criteria, but that may not be true for all projects.
I think we can live with this if we have our own analysis tools. The top-k analysis can be run during an experiment, so for long-running experiments we can still monitor progress (see the sketch below).
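A rough sketch of that kind of file-based analysis, safe to run while trials are still going because it only reads the persisted per-trial results files; the experiment path, results-file pattern, and metric name are placeholders rather than the project's actual layout:

```python
import glob
import json
import os

import pandas as pd


def best_epoch_per_trial(experiment_dir: str, metric: str) -> pd.DataFrame:
    """Scan each trial's results file and keep the best epoch reported so far."""
    rows = []
    for path in glob.glob(os.path.join(experiment_dir, "*", "result*.json")):
        with open(path) as f:
            # Each line is one epoch's report (newline-delimited JSON).
            records = [json.loads(line) for line in f if line.strip()]
        if not records:
            continue  # trial has not reported anything yet
        best = max(records, key=lambda r: r.get(metric, float("-inf")))
        rows.append({
            "trial_dir": os.path.dirname(path),
            metric: best.get(metric),
            "training_iteration": best.get("training_iteration"),
        })
    if not rows:
        return pd.DataFrame()
    return pd.DataFrame(rows).sort_values(metric, ascending=False)


# Can be re-run periodically to monitor a long-running experiment.
print(best_epoch_per_trial(os.path.expanduser("~/ray_results/mnist_experiment"),
                           "balanced_accuracy").head(10))
```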
> Should the outputs include the checkpoint directories? This information is not contained in the `results.json` files, but it can be inferred from the sorting index where you call `final_df.sort_values`.
Included the `checkpoint_path`s of the trials in the `final_df` returned by the function -- see the `checkpoint_path` column.
Note: if the `metric` specified to `get_top_k_trials` is not the same as the metric that was passed to the initial Ray Tune experiment (i.e., the metric Ray Tune is optimizing for), then many (or possibly all) of the checkpoints for those trials will not exist, and `None` will be reported rather than a `checkpoint_path`. E.g., if Ray Tune was told to optimize for balanced accuracy, but the user then passes AUC as the metric for `get_top_k_trials` to sort by, trials could be returned in the `final_df` that have no saved checkpoint. However, if in this example the user passes in balanced accuracy as the metric, then a `checkpoint_path` exists for every trial (provided `drop_dups=True`) and will be returned accordingly.
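A hypothetical usage sketch of the behaviour described above; apart from `metric` and `drop_dups`, the argument names here (the experiment path and `k`) are assumptions, not necessarily the function's real signature:

```python
# Sort by the same metric Ray Tune optimized for, so every returned trial
# should carry a checkpoint_path (given drop_dups=True).
final_df = get_top_k_trials(
    "~/ray_results/mnist_experiment",  # assumed: path to the experiment
    metric="balanced_accuracy",        # matches the metric Ray Tune optimized
    k=10,                              # assumed: number of trials to return
    drop_dups=True,
)

# Sorting by a different metric (e.g. AUC) can return trials whose best epoch
# was never checkpointed; those rows report checkpoint_path as None.
missing = final_df[final_df["checkpoint_path"].isna()]
print(f"{len(missing)} of the returned trials have no saved checkpoint")
```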
Wrote a function to handle the issue that Ray Tune's experiment reports only show the last-epoch performance rather than the best. For the details, please see the function documentation!