openml / automlbenchmark

OpenML AutoML Benchmarking Framework
https://openml.github.io/automlbenchmark
MIT License

Oboe module missing? #496

Open alanwilter opened 1 year ago

alanwilter commented 1 year ago

In frameworks/oboe/exec.py I see:

sys.path.append(f"{os.path.realpath(os.path.dirname(__file__))}/lib/oboe/automl")
from auto_learner import AutoLearner

But running

yes | python3 runbenchmark.py oboe automl_config_docker 1h4c -m docker -i . -s force

fails because, well, there is no auto_learner module at that path to load.

I don't know whether something is missing from the setup, but it seems so.

So I changed the import from:

sys.path.append(f"{os.path.realpath(os.path.dirname(__file__))}/lib/oboe/automl")
from auto_learner import AutoLearner

to

# sys.path.append(f"{os.path.realpath(os.path.dirname(__file__))}/lib/oboe/automl")
from oboe.auto_learner import AutoLearner

Now it fails with this error:

**** Oboe [latest] ****

INFO:__main__:Running oboe with a maximum time of 200s on 4 cores.
WARNING:__main__:We completely ignore the advice to optimize towards metric: auc.
ERROR:frameworks.shared.callee:shape mismatch: value array of shape (35,) could not be broadcast to indexing result of shape (35,1)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/bench/frameworks/oboe/lib/oboe/oboe/model.py", line 102, in kfold_fit_validate
    y_predicted[test_idx] = model.predict(x_te)
ValueError: shape mismatch: value array of shape (35,) could not be broadcast to indexing result of shape (35,1)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/bench/frameworks/shared/callee.py", line 70, in call_run
    result = run_fn(ds, config)
  File "/bench/frameworks/oboe/exec.py", line 42, in run
    aml.fit(X_train, y_train)
  File "/bench/frameworks/oboe/lib/oboe/oboe/auto_learner.py", line 759, in fit
    doubling()
  File "/bench/frameworks/oboe/lib/oboe/oboe/auto_learner.py", line 716, in doubling
    self._fit(x_tr, y_tr, categorical=None, ranks=k, runtime_limit=t, t_predicted=t_predicted)
  File "/bench/frameworks/oboe/lib/oboe/oboe/auto_learner.py", line 403, in _fit
    cv_error, cv_predictions = error.get()
  File "/usr/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
ValueError: shape mismatch: value array of shape (35,) could not be broadcast to indexing result of shape (35,1)

shape mismatch: value array of shape (35,) could not be broadcast to indexing result of shape (35,1)
Traceback (most recent call last):
  File "/bench/amlb/benchmark.py", line 648, in run
    meta_result = self.benchmark.framework_module.run(self._dataset, task_config)
  File "/bench/frameworks/oboe/__init__.py", line 38, in run
    process_results=process_results)
  File "/bench/frameworks/shared/caller.py", line 150, in run_in_venv
    raise NoResultError(res.error_message)
amlb.results.NoResultError: shape mismatch: value array of shape (35,) could not be broadcast to indexing result of shape (35,1)

If you have any clue about what to do here, it would be much appreciated.
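For reference, the same ValueError can be triggered with plain NumPy, outside Oboe. This is only a minimal sketch: it assumes the prediction buffer ends up with a trailing axis of size 1 (for example because y was passed as a column vector) while predict() returns a 1-D array. The variable names and the buffer allocation are hypothetical and are not taken from Oboe's model.py.

```python
import numpy as np

n_samples = 100
test_idx = np.arange(35)        # integer indices of one validation fold
fold_predictions = np.ones(35)  # 1-D output, as predict() typically returns

# Assumption: the buffer has shape (n_samples, 1), not (n_samples,).
y_predicted = np.zeros((n_samples, 1))

try:
    # y_predicted[test_idx] has shape (35, 1); a (35,) value cannot broadcast into it.
    y_predicted[test_idx] = fold_predictions
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)

# Flattening the buffer to 1-D makes the same assignment valid.
y_flat = np.zeros((n_samples, 1)).ravel()
y_flat[test_idx] = fold_predictions
```

If this is indeed the cause, flattening y before it reaches the learner (for instance with ravel()) might be worth trying, though that is a guess and not a confirmed fix.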

alanwilter commented 1 year ago

Actually, I don't even need the automlbenchmark/frameworks/oboe/requirements.txt file: the line in frameworks/oboe/setup.sh that installs from it is not really needed, so I commented it out:

# cat ${HERE}/requirements.txt | sed '/^$/d' | while read -r i; do PIP install --no-cache-dir -U "$i"; done

The whole issue is in frameworks/oboe/exec.py:

from oboe.auto_learner import AutoLearner --> works.

sys.path.append(f"{os.path.realpath(os.path.dirname(__file__))}/lib/oboe/automl") is wrong since /bench/frameworks/oboe/lib/oboe/automl does not exist.

Anyway, I still get the same shape error.
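One way to check which import style applies to a given checkout is to test for the legacy directory before touching sys.path. The helper below is a hypothetical sketch using only the standard library. The function name and its return values are made up for illustration, and it assumes the packaged layout exposes a top-level oboe package.

```python
import importlib.util
import os


def pick_import_strategy(framework_dir):
    """Return 'legacy', 'package', or 'missing' (hypothetical helper)."""
    # Old layout: <framework_dir>/lib/oboe/automl/auto_learner.py
    legacy_dir = os.path.join(framework_dir, "lib", "oboe", "automl")
    if os.path.isfile(os.path.join(legacy_dir, "auto_learner.py")):
        return "legacy"
    # New layout (assumed): an importable top-level `oboe` package.
    if importlib.util.find_spec("oboe") is not None:
        return "package"
    return "missing"
```

In exec.py this could be called with os.path.realpath(os.path.dirname(__file__)): 'legacy' would keep the sys.path.append line, and 'package' would use from oboe.auto_learner import AutoLearner.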

sebhrusen commented 1 year ago

Oboe had been inactive for a while, so we kind of abandoned it for official benchmarks but kept it in the app, which may not have been a good idea... However, it seems there was an official 0.2.0 release at the end of 2021, so the setup may just be broken. I'll look into it; thanks for raising this.

alanwilter commented 1 year ago

Out of 20 frameworks, this is the only one failing with my model.

alanwilter commented 1 year ago

I looked into oboe again and kind of got it to work with your tests, but not with mine (they always hit the timeout).

You can see my branch https://github.com/alanwilter/oboe

Here are the changes I had to make in Oboe to get it to work:

https://github.com/udellgroup/oboe/compare/master...alanwilter:oboe:master

Then I had to edit https://github.com/openml/automlbenchmark/blob/master/frameworks/oboe/exec.py

diff --git a/frameworks/oboe/exec.py b/frameworks/oboe/exec.py
index 37feebe..0fe773f 100644
--- a/frameworks/oboe/exec.py
+++ b/frameworks/oboe/exec.py
@@ -2,8 +2,7 @@ import logging
 import os
 import sys

-sys.path.append("{}/lib/oboe/automl".format(os.path.realpath(os.path.dirname(__file__))))
-from auto_learner import AutoLearner
+from oboe.auto_learner import AutoLearner

 from frameworks.shared.callee import call_run, result
 from frameworks.shared.utils import Timer
@@ -47,7 +46,7 @@ def run(dataset, config):
     X_test = dataset.test.X
     y_test = dataset.test.y
     with Timer() as predict:
-        predictions = aml.predict(X_test)
+        predictions = aml.predict(X_test.squeeze())
     predictions = predictions.reshape(len(X_test))

     if is_classification:

And now:

yes | python runbenchmark.py oboe example test -f 0 -m docker -s force

works.