All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More
https://all-hands.dev
MIT License

[Bug]: Docker Buildx fails on ARM Mac during SWE-bench evaluation #4401

Open AlexCuadron opened 1 day ago

AlexCuadron commented 1 day ago

Is there an existing issue for the same bug?

Describe the bug and reproduction steps

Running run_infer.sh from the swe_bench evaluation harness with this command:

./evaluation/swe_bench/scripts/run_infer.sh llm.mini HEAD CodeActAgent 100 30 1 "princeton-nlp/SWE-bench_Lite" test

fails while building the sandbox image, because the build tries to fetch a specific image from ghcr.io:

ERROR: ghcr.io/all-hands-ai/runtime:oh_v0.9.8_image_7c37a7ac_.astropy_s_astropy-12907_tag_latest: not found

The base image specified in the Dockerfile cannot be found in the container registry, so the build fails.

However, when I use the --platform=linux/amd64 flag, the build works successfully: docker buildx build --platform=linux/amd64 ...

This suggests that the image is published only for the amd64 architecture and not for arm64, which is the default architecture on Apple Silicon Macs and other ARM hosts.
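To make the workaround concrete, here is a minimal sketch of how a `docker buildx build` command line could be assembled with an optional `--platform` flag. The function name `buildx_command` and its parameters are hypothetical and are not taken from the OpenHands code base; only the Docker flags themselves (`--progress`, `--platform`, `--tag`, `--load`) are real.

```python
from typing import List, Optional


def buildx_command(
    context_dir: str,
    tags: List[str],
    platform: Optional[str] = None,
) -> List[str]:
    """Assemble a `docker buildx build` argv list (hypothetical helper).

    When `platform` is given (e.g. "linux/amd64"), the build is pinned to
    that platform instead of defaulting to the host architecture.
    """
    cmd = ["docker", "buildx", "build", "--progress=plain"]
    if platform:
        # Pin the target platform so an arm64 host pulls the amd64 base image.
        cmd.append(f"--platform={platform}")
    cmd.extend(f"--tag={tag}" for tag in tags)
    # --load exports the build result into the local Docker image store.
    cmd.extend(["--load", context_dir])
    return cmd
```

Passing `platform="linux/amd64"` on an Apple Silicon machine would reproduce the manual workaround described above; leaving it as `None` keeps the current behavior.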

OpenHands Installation

Development workflow

OpenHands Version

0.9.8

Operating System

MacOS

Logs, Errors, Screenshots, and Additional Context

14:44:56 - openhands:ERROR: shared.py:311 - ----------
Error in instance [astropy__astropy-12907]: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.9.8', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2024-10-15T14:44:56.592665', '--tag=ghcr.io/all-hands-ai/runtime:v0.9.8_6cfdaf6cb5db2ff0f842cb3e493bb767', '--load', '/var/folders/4p/6wz4pkv96qjgkmgjr7sq9hxc0000gn/T/tmp4c_5wxuz']' returned non-zero exit status 1.. Stacktrace:
Traceback (most recent call last):
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/evaluation/utils/shared.py", line 281, in _process_instance_wrapper
    result = process_instance_func(instance, metadata, use_mp)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/evaluation/swe_bench/run_infer.py", line 370, in process_instance
    runtime = create_runtime(config)
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/openhands/core/main.py", line 77, in create_runtime
    runtime: Runtime = runtime_cls(
                       ^^^^^^^^^^^^
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/openhands/runtime/client/runtime.py", line 162, in __init__
    self.runtime_container_image = build_runtime_image(
                                   ^^^^^^^^^^^^^^^^^^^^
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/openhands/runtime/utils/runtime_build.py", line 283, in build_runtime_image
    _build_sandbox_image(
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/openhands/runtime/utils/runtime_build.py", line 375, in _build_sandbox_image
    image_name = runtime_builder.build(path=docker_folder, tags=tags_to_add)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/acuadron/Documents/OpenDevin/OpenHands/openhands/runtime/builder/docker.py", line 113, in build
    raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command '['docker', 'buildx', 'build', '--progress=plain', '--build-arg=OPENHANDS_RUNTIME_VERSION=0.9.8', '--build-arg=OPENHANDS_RUNTIME_BUILD_TIME=2024-10-15T14:44:56.592665', '--tag=ghcr.io/all-hands-ai/runtime:v0.9.8_6cfdaf6cb5db2ff0f842cb3e493bb767', '--load', '/var/folders/4p/6wz4pkv96qjgkmgjr7sq9hxc0000gn/T/tmp4c_5wxuz']' returned non-zero exit status 1.

----------[The above error occurred. Retrying... (attempt 1 of 5)]----------

mamoodi commented 1 day ago

CC @xingyaoww, where do those Docker images get built?

xingyaoww commented 1 day ago

I build these SWE-Bench images manually and push them to Docker Hub, since the original authors didn't release them. I think we might need the fix in #4402 to specify the build architecture for SWE-Bench evaluation.
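As a rough illustration of what "specifying the build arch" could involve, the sketch below decides whether a platform override is needed by comparing the host architecture with the platform the prebuilt image was published for. This is not the change in #4402; the names `host_docker_platform` and `platform_override` and the mapping table are assumptions made for this example.

```python
import platform
from typing import Optional

# Assumed mapping from platform.machine() values to Docker platform strings.
_ARCH_TO_DOCKER = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "arm64": "linux/arm64",
    "aarch64": "linux/arm64",
}


def host_docker_platform(machine: Optional[str] = None) -> Optional[str]:
    """Return the Docker platform string for the host, or None if unknown."""
    machine = (machine or platform.machine()).lower()
    return _ARCH_TO_DOCKER.get(machine)


def platform_override(
    image_platform: str, machine: Optional[str] = None
) -> Optional[str]:
    """Return `image_platform` if the host differs from it, else None.

    None means no --platform flag is needed because the host already
    matches the platform the image was built for.
    """
    host = host_docker_platform(machine)
    return None if host == image_platform else image_platform
```

On an Apple Silicon host, `platform_override("linux/amd64", "arm64")` yields `"linux/amd64"`, which could then be passed as the `--platform` value; on an x86_64 host it yields `None`.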