temporalio / docker-builds


[Bug] 1.23.1 images don't work on linux/arm64 (Azure) #194

Closed robcao closed 5 months ago

robcao commented 6 months ago

### What are you really trying to do?

I'm trying to run the newest version of temporal server, 1.23.1, on an Azure linux/arm64 virtual machine. There are no issues with running temporal server 1.23.0.

There are no issues with running temporal server 1.23.1 on an Azure linux/amd64 virtual machine.

I have confirmed the issue is present on the following images for 1.23.1

### Describe the bug

Binaries in the 1.23.1 images, such as temporal-server and temporal-sql-tool, fail to start on linux/arm64 machines with the following error: /usr/local/bin/temporal-server: cannot execute binary file: Exec format error

azureuser@temporal:/temporal-test$ docker compose up 
WARN[0000] /temporal-test/docker-compose.yaml: `version` is obsolete 
[+] Running 5/3
 ✔ Network temporal-network        Created                                                                            0.0s 
 ✔ Container temporal-postgresql   Created                                                                            0.1s 
 ✔ Container temporal              Created                                                                            0.1s 
 ✔ Container temporal-ui           Created                                                                            0.1s 
 ✔ Container temporal-admin-tools  Created                                                                            0.1s 
Attaching to temporal, temporal-admin-tools, temporal-postgresql, temporal-ui
temporal-postgresql   | The files belonging to this database system will be owned by user "postgres".
temporal-postgresql   | This user must also own the server process.
temporal-postgresql   | 
temporal-postgresql   | The database cluster will be initialized with locale "en_US.utf8".
temporal-postgresql   | The default database encoding has accordingly been set to "UTF8".
temporal-postgresql   | The default text search configuration will be set to "english".
temporal-postgresql   | 
temporal-postgresql   | Data page checksums are disabled.
temporal-postgresql   | 
temporal-postgresql   | fixing permissions on existing directory /var/lib/postgresql/data ... ok
temporal-postgresql   | creating subdirectories ... ok
temporal-postgresql   | selecting dynamic shared memory implementation ... posix
temporal-postgresql   | selecting default max_connections ... 100
temporal-postgresql   | selecting default shared_buffers ... 128MB
temporal-postgresql   | selecting default time zone ... Etc/UTC
temporal-postgresql   | creating configuration files ... ok
temporal-postgresql   | running bootstrap script ... ok
temporal              | TEMPORAL_ADDRESS is not set, setting it to 172.22.0.3:7233
temporal              | Waiting for PostgreSQL to startup.
temporal-postgresql   | performing post-bootstrap initialization ... ok
temporal-ui           | 2024/05/01 04:55:16 Loading config; env=docker,configDir=config
temporal-ui           | 2024/05/01 04:55:16 Loading config files=[config/docker.yaml]
temporal-ui           | 2024/05/01 04:55:16 Loading config; env=docker,configDir=config
temporal-ui           | 2024/05/01 04:55:16 Loading config files=[config/docker.yaml]
temporal-ui           | 
temporal-ui           |    ____    __
temporal-ui           |   / __/___/ /  ___
temporal-ui           |  / _// __/ _ \/ _ \
temporal-ui           | /___/\__/_//_/\___/ v4.9.0
temporal-ui           | High performance, minimalist Go web framework
temporal-ui           | https://echo.labstack.com
temporal-ui           | ____________________________________O/_______
temporal-ui           |                                     O\
temporal-ui           | ⇨ http server started on [::]:8080
temporal-postgresql   | syncing data to disk ... ok
temporal-postgresql   | 
temporal-postgresql   | 
temporal-postgresql   | Success. You can now start the database server using:
temporal-postgresql   | 
temporal-postgresql   |     pg_ctl -D /var/lib/postgresql/data -l logfile start
temporal-postgresql   | 
temporal-postgresql   | initdb: warning: enabling "trust" authentication for local connections
temporal-postgresql   | You can change this by editing pg_hba.conf or using the option -A, or
temporal-postgresql   | --auth-local and --auth-host, the next time you run initdb.
temporal-postgresql   | waiting for server to start....2024-05-01 04:55:16.500 UTC [48] LOG:  starting PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
temporal-postgresql   | 2024-05-01 04:55:16.504 UTC [48] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
temporal-postgresql   | 2024-05-01 04:55:16.517 UTC [49] LOG:  database system was shut down at 2024-05-01 04:55:16 UTC
temporal-postgresql   | 2024-05-01 04:55:16.524 UTC [48] LOG:  database system is ready to accept connections
temporal-postgresql   |  done
temporal-postgresql   | server started
temporal-postgresql   | CREATE DATABASE
temporal-postgresql   | 
temporal-postgresql   | 
temporal-postgresql   | /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
temporal-postgresql   | 
temporal-postgresql   | 2024-05-01 04:55:16.788 UTC [48] LOG:  received fast shutdown request
temporal-postgresql   | waiting for server to shut down....2024-05-01 04:55:16.793 UTC [48] LOG:  aborting any active transactions
temporal-postgresql   | 2024-05-01 04:55:16.795 UTC [48] LOG:  background worker "logical replication launcher" (PID 55) exited with exit code 1
temporal-postgresql   | 2024-05-01 04:55:16.795 UTC [50] LOG:  shutting down
temporal              | Waiting for PostgreSQL to startup.
temporal-postgresql   | 2024-05-01 04:55:16.833 UTC [48] LOG:  database system is shut down
temporal-postgresql   |  done
temporal-postgresql   | server stopped
temporal-postgresql   | 
temporal-postgresql   | PostgreSQL init process complete; ready for start up.
temporal-postgresql   | 
temporal-postgresql   | 2024-05-01 04:55:16.925 UTC [1] LOG:  starting PostgreSQL 13.14 (Debian 13.14-1.pgdg120+2) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
temporal-postgresql   | 2024-05-01 04:55:16.925 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
temporal-postgresql   | 2024-05-01 04:55:16.925 UTC [1] LOG:  listening on IPv6 address "::", port 5432
temporal-postgresql   | 2024-05-01 04:55:16.932 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
temporal-postgresql   | 2024-05-01 04:55:16.944 UTC [63] LOG:  database system was shut down at 2024-05-01 04:55:16 UTC
temporal-postgresql   | 2024-05-01 04:55:16.952 UTC [1] LOG:  database system is ready to accept connections
temporal              | PostgreSQL started.
temporal              | Setup PostgreSQL schema.
temporal              | /etc/temporal/auto-setup.sh: line 234: /usr/local/bin/temporal-sql-tool: cannot execute binary file: Exec format error
temporal              | /etc/temporal/start-temporal.sh: line 16: /usr/local/bin/temporal-server: cannot execute binary file: Exec format error
temporal              | /etc/temporal/start-temporal.sh: line 16: /usr/local/bin/temporal-server: No error information
temporal exited with code 1
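
"Exec format error" is what the kernel reports when it is asked to run a binary it cannot execute, most commonly one built for a different CPU architecture. One way to check this is to read the target architecture from the binary's ELF header. The following is a minimal sketch, not part of the original report: the helper names `elf_machine` and `binary_architecture` are made up for illustration, and the sketch assumes the binary has first been copied out of the image (for example with `docker cp`) to a machine that has Python.

```python
import struct

# e_machine values from the ELF specification; only the two
# architectures discussed in this issue are listed here.
ELF_MACHINES = {0x3E: "x86-64 (amd64)", 0xB7: "AArch64 (arm64)"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture named in an ELF header (hypothetical helper)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both 32- and 64-bit ELF.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (0x{machine:x})")


def binary_architecture(path: str) -> str:
    """Read the first 20 bytes of a file and report its ELF architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Calling `binary_architecture` on a copy of temporal-server taken from the 1.23.1 image and comparing the result with the host architecture (aarch64 here) would confirm or rule out an architecture mismatch.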

### Minimal Reproduction

I am using an Azure linux/arm64 VM, but I believe this is not an Azure specific issue.

On a linux/arm64 machine, run docker-compose-postgres.yaml from the v.1.23.1 tag of the docker compose samples.

Observe that the auto-setup container fails to start, with the following logs:

TEMPORAL_ADDRESS is not set, setting it to 172.21.0.3:7233
Waiting for PostgreSQL to startup.
Waiting for PostgreSQL to startup.
PostgreSQL started.
Setup PostgreSQL schema.
/etc/temporal/auto-setup.sh: line 234: /usr/local/bin/temporal-sql-tool: cannot execute binary file: Exec format error
/etc/temporal/start-temporal.sh: line 16: /usr/local/bin/temporal-server: cannot execute binary file: Exec format error
/etc/temporal/start-temporal.sh: line 16: /usr/local/bin/temporal-server: No error information

### Environment/Versions

- Architecture: aarch64
- CPU op-mode(s): 32-bit, 64-bit
- Byte Order: Little Endian
- CPU(s): 4
- On-line CPU(s) list: 0-3
- Vendor ID: ARM
- Model name: Neoverse-N1
- Model: 1
- Thread(s) per core: 1
- Core(s) per socket: 4
- Socket(s): 1
- Stepping: r3p1
- BogoMIPS: 50.00
- Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
- Caches (sum of all):
  - L1d: 256 KiB (4 instances)
  - L1i: 256 KiB (4 instances)
  - L2: 4 MiB (4 instances)
  - L3: 32 MiB (1 instance)
- NUMA:
  - NUMA node(s): 1
  - NUMA node0 CPU(s): 0-3
- Vulnerabilities:
  - Gather data sampling: Not affected
  - Itlb multihit: Not affected
  - L1tf: Not affected
  - Mds: Not affected
  - Meltdown: Mitigation; PTI
  - Mmio stale data: Not affected
  - Retbleed: Not affected
  - Spec rstack overflow: Not affected
  - Spec store bypass: Not affected
  - Spectre v1: Mitigation; __user pointer sanitization
  - Spectre v2: Mitigation; CSV2, BHB
  - Srbds: Not affected
  - Tsx async abort: Not affected


- Temporal Version: 1.23.1
- Are you using Docker or Kubernetes or building Temporal from source? Docker

### Additional context

Inspecting the docker images of temporalio/server:1.23.0 and temporalio/server:1.23.1, it looks like the build was changed as part of this pull request: https://github.com/temporalio/docker-builds/pull/190

#### 1.23.0
![image](https://github.com/temporalio/docker-builds/assets/11479329/a6529100-3cd0-4a59-8b72-265e25227c94)

#### 1.23.1

![image](https://github.com/temporalio/docker-builds/assets/11479329/c5a19c9c-078b-4104-b9b6-00cd4a354cfb)
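
For anyone who wants to compare the two tags without screenshots, the platforms each tag advertises can be listed from the JSON printed by `docker manifest inspect <image>`. The sketch below is an illustration only: `advertised_platforms` is a hypothetical helper, and it assumes the JSON has the manifest-list shape (a top-level `manifests` array whose entries carry a `platform` object with `os` and `architecture`). A single-platform manifest has no such array and yields an empty list.

```python
import json


def advertised_platforms(manifest_json: str) -> list[str]:
    """List "os/architecture" for each entry of a manifest list (hypothetical helper)."""
    doc = json.loads(manifest_json)
    platforms = []
    # Single-platform manifests have no "manifests" key, so fall back to [].
    for entry in doc.get("manifests", []):
        platform = entry.get("platform", {})
        os_name = platform.get("os", "?")
        arch = platform.get("architecture", "?")
        platforms.append(f"{os_name}/{arch}")
    return platforms
```

Note that a tag advertising linux/arm64 only says which variant Docker will pull; it does not by itself prove that the binaries inside that variant were compiled for arm64, so the result is best read alongside a check of the binaries themselves.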

Julien4218 commented 6 months ago

I'm seeing the same issue with 1.23.1 on AWS Graviton; 1.23.0 was working fine.