automl / amltk

A build-it-yourself AutoML Framework
https://automl.github.io/amltk/
BSD 3-Clause "New" or "Revised" License

[JOSS REVIEW] Running 'Performing HPO with Post-Hoc Ensembling' example fails #274

Closed: gomezzz closed 2 months ago

gomezzz commented 3 months ago

After applying the swig workaround mentioned in #273, I am now trying the HPO examples. The first one works well, but for the second I receive:

Traceback (most recent call last):
  File ".../amltk/ex5.py", line 305, in <module>
    scheduler.run(timeout=30, wait=True)(9)
TypeError: 'ExitState' object is not callable

I tried increasing the timeout to 30 to see if that would fix it, but without success. Is there anything else I might have to change?

(opened as part of JOSS Review https://github.com/openjournals/joss-reviews/issues/6367 )

eddiebergman commented 3 months ago

I'm not sure why that (9) is uncommented in the last cell block of the example; I need to fix that!

The # (9)! comment is just an annotation marker for the generated documentation of the example page:

if __name__ == "__main__":
    scheduler.run(timeout=5, wait=True)  # (9)!

# 9. We use [`Scheduler.run()`][amltk.scheduling.Scheduler.run] to run the scheduler.
#  Here we set it to run briefly for 5 seconds and wait for remaining tasks to finish
#  before continuing.

The annotation seems to be identical in the other examples and works there, just not in this one.
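To illustrate why the stray (9) produces this exact error, here is a minimal sketch that does not use amltk at all. The `ExitState` class and `run` function below are hypothetical stand-ins (assumptions for illustration only), chosen so the failure mode matches the traceback: once the `#` is missing, Python parses `(9)` as a call on whatever `run()` returned.

```python
class ExitState:
    """Hypothetical stand-in for the object returned by Scheduler.run()."""


def run(timeout: float, wait: bool = True) -> ExitState:
    """Hypothetical stand-in for scheduler.run(); it does no real work."""
    return ExitState()


# Annotation kept as a comment: this is an ordinary call and works.
state = run(timeout=5, wait=True)  # (9)!

# Annotation uncommented: Python tries to call the returned object with 9.
try:
    run(timeout=5, wait=True)(9)
except TypeError as exc:
    message = str(exc)
    print(message)
```

Restoring the `#` in front of `(9)!` turns it back into a comment, so the line is once again a single call.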

Good issue!

gomezzz commented 2 months ago

Yes, commenting out the (9) fixed it for me :v:

gomezzz commented 2 months ago

A small suggestion for the HPO examples: for interpretability and user understanding, I would suggest printing only some of the columns at the end rather than the entire dataframe, e.g.

    print(
        trial_history.df()[
            [
                "config:Pipeline:RandomForestClassifier:criterion",
                "config:Pipeline:RandomForestClassifier:max_features",
                "config:Pipeline:RandomForestClassifier:n_estimators",
                "metric:accuracy [0.0, 1.0] (maximize)",
                "summary:train/acc",
                "summary:val/acc",
                "summary:test/acc",
            ]
        ]
    )

That makes it much easier to understand what is shown, as you get this :)

Trial history:
                                                   config:Pipeline:RandomForestClassifier:criterion  config:Pipeline:RandomForestClassifier:max_features  config:Pipeline:RandomForestClassifier:n_estimators  metric:accuracy [0.0, 1.0] (maximize)  summary:train/acc  summary:val/acc  summary:test/acc
name                                                                                                                                                                                                                                                                                                      
config_id=1_seed=1608637542_budget=None_instanc...                                         log_loss                                            0.05857                                                  100                                     0.75                1.0             0.75              0.76
config_id=2_seed=1608637542_budget=None_instanc...                                             gini                                           0.877733                                                   39                                     0.73                1.0             0.73              0.74
config_id=3_seed=1608637542_budget=None_instanc...                                          entropy                                            0.49934                                                   68                                    0.735                1.0            0.735             0.775
config_id=4_seed=1608637542_budget=None_instanc...                                          entropy                                           0.568753                                                   14                                     0.71           0.996667             0.71              0.73
config_id=5_seed=1608637542_budget=None_instanc...                                         log_loss                                           0.350444                                                   49                                     0.75                1.0             0.75             0.795
...                                                                                             ...                                                ...                                                  ...                                      ...                ...              ...               ...
config_id=60_seed=1608637542_budget=None_instan...                                         log_loss                                           0.494299                                                   91                                     0.76                1.0             0.76              0.78
config_id=61_seed=1608637542_budget=None_instan...                                         log_loss                                           0.549975                                                   90                                    0.745                1.0            0.745              0.79
config_id=62_seed=1608637542_budget=None_instan...                                         log_loss                                           0.606144                                                   91                                    0.735                1.0            0.735              0.77
config_id=63_seed=1608637542_budget=None_instan...                                         log_loss                                            0.57088                                                   90                                    0.745                1.0            0.745             0.795
config_id=64_seed=1608637542_budget=None_instan...                                             gini                                           0.659984                                                   84                                    0.755                1.0            0.755             0.765