openml / automlbenchmark

OpenML AutoML Benchmarking Framework
https://openml.github.io/automlbenchmark
MIT License
391 stars 130 forks source link

Fix tail exception when from_line=None #615

Closed Innixma closed 2 months ago

Innixma commented 3 months ago

In the following code in amlb/runners/aws.py:

        def log_console():
            nonlocal last_console_line
            try:
                output = instance.console_output(Latest=True)
                if 'Output' in output:
                    output = output['Output']   # note that console_output only returns the last 64kB of console
                    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
                    if last_line is not None:
                        last_console_line = last_line['line']
                    if new_lines:
                        new_log = '\n'.join([f"[{job.ext.instance_id}]>{line}" for line in new_lines])
                        around = f"[{job.ext.instance_id}:{job.name}]"
                        log.info(f"{around}>>\n{new_log}\n<<{around}")
            except Exception as e:
                log.exception(e)

new_lines, last_line = tail(...) expects 2 outputs.

However, the implementation of tail previously only provides 2 outputs if from_line != None. Sometimes, from_line can be None, such as during the venv creation. This causes errors in the logs during runbenchmark.py when using AWS mode, preventing the logs of the venv setup to be shown.

Example code:

python runbenchmark.py constantpredictor test -f 0 -t iris -m aws

Mainline:

Starting job aws.test.test.iris.0.constantpredictor.
Starting new EC2 instance with params: constantpredictor test test -t iris -f 0 -Xseed=159423123
Started EC2 instance i-0d4d14e5e848cdeca
[2024-04-03T21:38:36] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: pending [0].
[2024-04-03T21:39:07] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
not enough values to unpack (expected 2, got 0)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: not enough values to unpack (expected 2, got 0)
Running EC2 instance i-0d4d14e5e848cdeca: {'instance_type': 'm6i.large', 'launch_time': '2024-04-03 21:38:36+00:00', 'public_dns_name': 'ec2-34-237-76-78.compute-1.amazonaws.com', 'public_ip': '34.237.76.78', 'private_dns_name': 'ip-172-31-56-210.ec2.internal', 'private_ip': '172.31.56.210', 'availability_zone': 'us-east-1f', 'subnet_id': 'subnet-4663a148', 'volumes': [{'type': 'gp3', 'size_gb': 100, 'id': 'vol-0fdfe229429237b7d'}]}
[2024-04-03T21:39:38] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
[2024-04-03T21:40:08] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
[MONITORING] [aws.test.test.iris.0.constantpredictor] CPU Utilization: 0.4%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Memory Usage: 6.0%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Disk Usage: 72.4%
[2024-04-03T21:40:39] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
[2024-04-03T21:41:10] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
[2024-04-03T21:41:41] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
[2024-04-03T21:42:12] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: running [16].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
[MONITORING] [aws.test.test.iris.0.constantpredictor] CPU Utilization: 1.3%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Memory Usage: 6.0%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Disk Usage: 72.4%
[2024-04-03T21:42:43] checking job aws.test.test.iris.0.constantpredictor on instance i-0d4d14e5e848cdeca: stopping [64].
too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/home/ubuntu/workspace/code/automlbenchmark/amlb/runners/aws.py", line 448, in log_console
    new_lines, last_line = tail(output, from_line=last_console_line, include_line=False, splitlines=True)
ValueError: too many values to unpack (expected 2)
EC2 instance i-0d4d14e5e848cdeca is stopping: Client.InstanceInitiatedShutdown: Instance initiated shutdown
Job `aws.test.test.iris.0.constantpredictor` executed in 250.958 seconds.

This PR:

Starting job aws.test.test.iris.0.constantpredictor.
Starting new EC2 instance with params: constantpredictor test test -t iris -f 0 -Xseed=2051243710
Started EC2 instance i-07b439ebab7ee87f6
[2024-04-03T21:32:04] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: pending [0].
[2024-04-03T21:32:34] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
Running EC2 instance i-07b439ebab7ee87f6: {'instance_type': 'm6i.large', 'launch_time': '2024-04-03 21:32:04+00:00', 'public_dns_name': 'ec2-34-230-28-9.compute-1.amazonaws.com', 'public_ip': '34.230.28.9', 'private_dns_name': 'ip-172-31-42-151.ec2.internal', 'private_ip': '172.31.42.151', 'availability_zone': 'us-east-1b', 'subnet_id': 'subnet-18584c44', 'volumes': [{'type': 'gp3', 'size_gb': 100, 'id': 'vol-07c5ce0558ec7b721'}]}
[2024-04-03T21:33:05] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]>[   21.090498] cloud-init[1291]: Get:33 http://us-east-1.ec2.archive.ubuntu.com/ubuntu focal-backports/universe amd64 c-n-f Metadata [880 B]
[i-07b439ebab7ee87f6]>[   21.102656] cloud-init[1291]: Get:34 http://us-east-1.ec2.archive.ubuntu.com/ubuntu focal-backports/multiverse amd64 c-n-f Metadata [116 B]
[i-07b439ebab7ee87f6]>[   21.114772] cloud-init[1291]: Get:35 http://security.ubuntu.com/ubuntu focal-security/restricted Translation-en [377 kB]
[i-07b439ebab7ee87f6]>[   21.118938] cloud-init[1291]: Get:36 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 c-n-f Metadata [552 B]
[i-07b439ebab7ee87f6]>[   21.125892] cloud-init[1291]: Get:37 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [952 kB]
[i-07b439ebab7ee87f6]>[   21.130337] cloud-init[1291]: Get:38 http://security.ubuntu.com/ubuntu focal-security/universe Translation-en [200 kB]
[i-07b439ebab7ee87f6]>[   21.135423] cloud-init[1291]: Get:39 http://security.ubuntu.com/ubuntu focal-security/universe amd64 c-n-f Metadata [19.2 kB]
[i-07b439ebab7ee87f6]>[   21.140181] cloud-init[1291]: Get:40 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [23.9 kB]
[i-07b439ebab7ee87f6]>[   21.144941] cloud-init[1291]: Get:41 http://security.ubuntu.com/ubuntu focal-security/multiverse Translation-en [5796 B]
[i-07b439ebab7ee87f6]>[   21.149327] cloud-init[1291]: Get:42 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 c-n-f Metadata [548 B]
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[2024-04-03T21:33:36] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]>[   52.792235] cloud-init[1291]: Reading package lists...
[i-07b439ebab7ee87f6]>         Starting Update APT News...
[i-07b439ebab7ee87f6]>         Starting Update the local ESM caches...
[i-07b439ebab7ee87f6]>[   52.933889] cloud-init[1291]: Hit:1 http://us-east-1.ec2.archive.ubuntu.com/ubuntu focal InRelease
[i-07b439ebab7ee87f6]>[   52.937794] cloud-init[1291]: Hit:2 http://us-east-1.ec2.archive.ubuntu.com/ubuntu focal-updates InRelease
[i-07b439ebab7ee87f6]>[   52.947920] cloud-init[1291]: Hit:3 http://us-east-1.ec2.archive.ubuntu.com/ubuntu focal-backports InRelease
[i-07b439ebab7ee87f6]>[   52.961085] cloud-init[1291]: Hit:4 http://security.ubuntu.com/ubuntu focal-security InRelease
[i-07b439ebab7ee87f6]>[   53.068705] cloud-init[1291]: Hit:5 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu focal InRelease
[i-07b439ebab7ee87f6]>[  OK  ] Finished Update APT News.
[i-07b439ebab7ee87f6]>[  OK  ] Finished Update the local ESM caches.
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[MONITORING] [aws.test.test.iris.0.constantpredictor] CPU Utilization: 1.4%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Memory Usage: 6.0%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Disk Usage: 72.4%
[2024-04-03T21:34:07] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]>[   80.301734] cloud-init[1291]: Setting up gcc (4:9.3.0-1ubuntu2) ...
[i-07b439ebab7ee87f6]>[   80.347933] cloud-init[1291]: Setting up g++-9 (9.4.0-1ubuntu1~20.04.2) ...
[i-07b439ebab7ee87f6]>[   80.363309] cloud-init[1291]: Setting up python3.8-dev (3.8.10-0ubuntu1~20.04.9) ...
[i-07b439ebab7ee87f6]>[   80.380614] cloud-init[1291]: Setting up g++ (4:9.3.0-1ubuntu2) ...
[i-07b439ebab7ee87f6]>[   80.424472] cloud-init[1291]: update-alternatives: using /usr/bin/g++ to provide /usr/bin/c++ (c++) in auto mode
[i-07b439ebab7ee87f6]>[   80.435436] cloud-init[1291]: Setting up python3.9-dev (3.9.18-1+focal1) ...
[i-07b439ebab7ee87f6]>[   80.468277] cloud-init[1291]: Setting up build-essential (12.8ubuntu1.1) ...
[i-07b439ebab7ee87f6]>[   80.487970] cloud-init[1291]: Setting up python3-dev (3.8.2-0ubuntu2) ...
[i-07b439ebab7ee87f6]>[   80.505327] cloud-init[1291]: Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
[i-07b439ebab7ee87f6]>[   80.532840] cloud-init[1291]: Processing triggers for man-db (2.9.1-1) ...
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[2024-04-03T21:34:38] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]>[  120.000624] cloud-init[1291]:   Preparing metadata (pyproject.toml): started
[i-07b439ebab7ee87f6]>[  120.120463] cloud-init[1291]:   Preparing metadata (pyproject.toml): finished with status 'done'
[i-07b439ebab7ee87f6]>[  120.126510] cloud-init[1291]: Building wheels for collected packages: liac-arff
[i-07b439ebab7ee87f6]>[  120.129519] cloud-init[1291]:   Building wheel for liac-arff (pyproject.toml): started
[i-07b439ebab7ee87f6]>[  120.265429] cloud-init[1291]:   Building wheel for liac-arff (pyproject.toml): finished with status 'done'
[i-07b439ebab7ee87f6]>[  120.269030] cloud-init[1291]:   Created wheel for liac-arff: filename=liac_arff-2.5.0-py3-none-any.whl size=11716 sha256=becb05a9ef28aef244139dd0eb4c4ab29b81ad8a39e405ca74af6e42bdabb8ab
[i-07b439ebab7ee87f6]>[  120.274170] cloud-init[1291]:   Stored in directory: /tmp/pip-ephem-wheel-cache-tbfzwus6/wheels/08/82/8b/5c514221984e88c059b94e36a71d4722e590acaae04deab22e
[i-07b439ebab7ee87f6]>[  120.280462] cloud-init[1291]: Successfully built liac-arff
[i-07b439ebab7ee87f6]>[  120.372555] cloud-init[1291]: Installing collected packages: liac-arff
[i-07b439ebab7ee87f6]>[  120.390147] cloud-init[1291]: Successfully installed liac-arff-2.5.0
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[2024-04-03T21:35:09] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]>[  150.025304] cloud-init[1291]: Requirement already satisfied: scipy>=1.3.2 in ./venv/lib/python3.9/site-packages (from scikit-learn==1.2.2) (1.13.0)
[i-07b439ebab7ee87f6]>[  150.030513] cloud-init[1291]: Requirement already satisfied: joblib>=1.1.1 in ./venv/lib/python3.9/site-packages (from scikit-learn==1.2.2) (1.2.0)
[i-07b439ebab7ee87f6]>[  150.035690] cloud-init[1291]: Requirement already satisfied: threadpoolctl>=2.0.0 in ./venv/lib/python3.9/site-packages (from scikit-learn==1.2.2) (3.4.0)
[i-07b439ebab7ee87f6]>[  150.055596] cloud-init[1291]: Downloading scikit_learn-1.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)
[i-07b439ebab7ee87f6]>[  150.163721] cloud-init[1291]:    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.6/9.6 MB 91.3 MB/s eta 0:00:00
[i-07b439ebab7ee87f6]>[  150.359274] cloud-init[1291]: Installing collected packages: scikit-learn
[i-07b439ebab7ee87f6]>[  150.362141] cloud-init[1291]:   Attempting uninstall: scikit-learn
[i-07b439ebab7ee87f6]>[  150.364730] cloud-init[1291]:     Found existing installation: scikit-learn 1.4.1.post1
[i-07b439ebab7ee87f6]>[  150.397901] cloud-init[1291]:     Uninstalling scikit-learn-1.4.1.post1:
[i-07b439ebab7ee87f6]>[  150.405598] cloud-init[1291]:       Successfully uninstalled scikit-learn-1.4.1.post1
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[2024-04-03T21:35:40] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]><14>Apr  3 21:35:01 cloud-init: #############################################################
[i-07b439ebab7ee87f6]>-----BEGIN SSH HOST KEY KEYS-----
[i-07b439ebab7ee87f6]>-----END SSH HOST KEY KEYS-----
[i-07b439ebab7ee87f6]>[  171.608139] cloud-init[1291]: Cloud-init v. 22.4.2-0ubuntu0~20.04.2 running 'modules:final' at Wed, 03 Apr 2024 21:32:30 +0000. Up 20.02 seconds.
[i-07b439ebab7ee87f6]>[  171.613325] cloud-init[1291]: AutoML benchmark aws.test.test.iris.0.constantpredictor completed after 171.60 s
[i-07b439ebab7ee87f6]>[  OK  ] Finished Execute cloud user/final scripts.
[i-07b439ebab7ee87f6]>[  OK  ] Reached target Cloud-init target.
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[MONITORING] [aws.test.test.iris.0.constantpredictor] CPU Utilization: 0.5%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Memory Usage: 6.0%
[MONITORING] [aws.test.test.iris.0.constantpredictor] Disk Usage: 72.4%
[2024-04-03T21:36:12] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: running [16].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]><14>Apr  3 21:35:01 cloud-init: #############################################################
[i-07b439ebab7ee87f6]>-----BEGIN SSH HOST KEY KEYS-----
[i-07b439ebab7ee87f6]>-----END SSH HOST KEY KEYS-----
[i-07b439ebab7ee87f6]>[  171.608139] cloud-init[1291]: Cloud-init v. 22.4.2-0ubuntu0~20.04.2 running 'modules:final' at Wed, 03 Apr 2024 21:32:30 +0000. Up 20.02 seconds.
[i-07b439ebab7ee87f6]>[  171.613325] cloud-init[1291]: AutoML benchmark aws.test.test.iris.0.constantpredictor completed after 171.60 s
[i-07b439ebab7ee87f6]>[  OK  ] Finished Execute cloud user/final scripts.
[i-07b439ebab7ee87f6]>[  OK  ] Reached target Cloud-init target.
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
[2024-04-03T21:36:43] checking job aws.test.test.iris.0.constantpredictor on instance i-07b439ebab7ee87f6: stopping [64].
[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]>>
[i-07b439ebab7ee87f6]>[  OK  ] Stopped Create System Users.
[i-07b439ebab7ee87f6]>[  OK  ] Stopped Remount Root and Kernel File Systems.
[i-07b439ebab7ee87f6]>[  OK  ] Stopped File System Check on Root Device.
[i-07b439ebab7ee87f6]>[  OK  ] Stopped Device-Mapper Multipath Device Controller.
[i-07b439ebab7ee87f6]>[  OK  ] Stopped Monitoring of LVM2… dmeventd or progress polling.
[i-07b439ebab7ee87f6]>[  OK  ] Reached target Shutdown.
[i-07b439ebab7ee87f6]>[  OK  ] Reached target Final Step.
[i-07b439ebab7ee87f6]>[  OK  ] Finished Power-Off.
[i-07b439ebab7ee87f6]>[  OK  ] Reached target Power-Off.
[i-07b439ebab7ee87f6]>[  240.740214] reboot: Power down
<<[i-07b439ebab7ee87f6:aws.test.test.iris.0.constantpredictor]
EC2 instance i-07b439ebab7ee87f6 is stopping: Client.InstanceInitiatedShutdown: Instance initiated shutdown
Job `aws.test.test.iris.0.constantpredictor` executed in 281.736 seconds.