Open seanmor5 opened 1 year ago
Could you please paste the output of df -h?
I've seen similar behavior when VMs report running out of space (even when technically they haven't).
BTW, I'd recommend using /usr/lib/jvm/java-11-openjdk as opposed to /usr/lib/jvm/default-jvm. Bazel has trouble building with OpenJDK 17+ on some operating systems.
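To rule out disk space before starting a long build, a check along these lines might help. It is only a minimal sketch: the function names and the threshold in the usage comment are assumptions for illustration, not anything Bazel provides.

```shell
#!/bin/sh
# Print the free space, in KiB, of the filesystem that holds the given path.
# -P selects the POSIX output format, where "Available" is the fourth field.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 if at least min_kib KiB are free under path, non-zero otherwise.
# (Hypothetical helper; the names are not part of any existing tool.)
require_free_kib() {
    path="$1"
    min_kib="$2"
    avail="$(free_kib "$path")"
    if [ -z "$avail" ]; then
        echo "could not read free space for ${path}" >&2
        return 2
    fi
    if [ "$avail" -lt "$min_kib" ]; then
        echo "only ${avail} KiB free under ${path}, need ${min_kib} KiB" >&2
        return 1
    fi
    return 0
}

# Example (the threshold is an arbitrary assumption, roughly 10 GiB):
#   require_free_kib /tmp 10485760 && bash ./compile.sh
```

Checking /tmp as well as / may be worthwhile, since the bootstrap build writes its temporary output there.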
@yesudeep I re-ran in a Docker container on my Mac with the Java path updated to /usr/lib/jvm/java-11-openjdk and with more swap, available memory, and disk space. Here's the result of df -h:
#5 [2/5] RUN df -h
#5 sha256:b69533de15570400c2b99b68d306cb1123a95f090c0badae699f1b9a2302e60a
#5 0.197 Filesystem Size Used Available Use% Mounted on
#5 0.198 overlay 495.8G 49.2G 421.4G 10% /
#5 0.198 tmpfs 64.0M 0 64.0M 0% /dev
#5 0.198 shm 64.0M 0 64.0M 0% /dev/shm
#5 0.198 /dev/vda1 495.8G 49.2G 421.4G 10% /etc/resolv.conf
#5 0.198 /dev/vda1 495.8G 49.2G 421.4G 10% /etc/hosts
#5 0.198 tmpfs 64.0M 0 64.0M 0% /proc/kcore
#5 0.198 tmpfs 64.0M 0 64.0M 0% /proc/keys
#5 0.198 tmpfs 64.0M 0 64.0M 0% /proc/timer_list
#5 0.198 tmpfs 64.0M 0 64.0M 0% /proc/sched_debug
#5 0.198 tmpfs 15.7G 0 15.7G 0% /sys/firmware
#5 DONE 0.2s
Still gets stuck on "Patching repository....". I will check one of the EC2 instances as well
Namaste @seanmor5
Thank you for responding. I don't have an aarch64 instance or machine handy, but I built a QEMU VM image using the quick-and-dirty script below, in case someone else wants to test. I was able to reproduce the problem while building Bazel on the VM in emulation mode on Linux. I'll look at it in more detail soon.
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
BINARY="$(basename "$0")"
BINARY_VERSION="0.0.0"
OS_TYPE="$(echo "$OSTYPE" | tr '[:upper:]' '[:lower:]')"
OS_DISTRO="alpine"
#OS_CHANNEL="edge"
OS_CHANNEL="latest-stable"
# Version
OS_VERSION="3.17.1"
#------------------------------------------------------------------------------
# VM configuration
#------------------------------------------------------------------------------
# VM directory.
VM_DIR="$HOME/.vm"
# Number of CPU cores and threads.
DEFAULT_VM_CORES=2
DEFAULT_VM_THREADS=4
# Storage and memory
DEFAULT_VM_RAM_SIZE=4G
DEFAULT_VM_DISK_SIZE=54G
#------------------------------------------------------------------------------
# Host system based configuration
#------------------------------------------------------------------------------
HOST_CPU_ARCH="$(uname -m)"
# SSH port forwarding.
SSHD_HOST_ADDR=0.0.0.0
SSHD_HOST_PORT=2222
SSHD_GUEST_PORT=22
SSHD_GUEST_ADDR=10.0.2.15
SSHD_HOST_FWD="tcp:${SSHD_HOST_ADDR}:${SSHD_HOST_PORT}-${SSHD_GUEST_ADDR}:${SSHD_GUEST_PORT}"
#------------------------------------------------------------------------------
# Begin setup
#------------------------------------------------------------------------------
ARGS="$*"
function extract_kernel() {
TMP_DIR="$1"
VM_DISK_DIR="$2"
}
function sha_256_digest() {
SHA256SUM=sha256sum
case "$OS_TYPE" in
freebsd*)
SHA256SUM=gsha256sum
;;
*dragonfly*)
SHA256SUM=gsha256sum
;;
*) ;;
esac
echo -n "$1" | $SHA256SUM | head -c 10
}
export -f sha_256_digest
function setup_vm() {
local os_cpu_arch="$HOST_CPU_ARCH"
local os_version="$OS_VERSION"
local threads=$DEFAULT_VM_THREADS
local cores=$DEFAULT_VM_CORES
local ram_size="$DEFAULT_VM_RAM_SIZE"
local disk_size="$DEFAULT_VM_DISK_SIZE"
local extract_only=0
while (("$#")); do
case "$1" in
-a | --arch)
echo "arch: $2"
os_cpu_arch="$2"
shift 2
;;
-r | --ram-size | --ram_size)
echo "ram_size: $2"
ram_size="$2"
shift 2
;;
-d | --disk-size | --disk_size)
echo "disk_size: $2"
disk_size="$2"
shift 2
;;
-t | --threads)
echo "threads: $2"
threads="$2"
shift 2
;;
-c | --cores)
echo "cores: $2"
cores="$2"
shift 2
;;
-v | --version)
echo "version: $2"
os_version="$2"
shift 2
;;
-e | --extract-only)
extract_only=1
shift
;;
-h | --help)
echo "help"
exit 0
;;
*)
# Unknown option: report it on stderr and fail with a non-zero status.
echo "error: unknown argument: $1" >&2
echo "help"
exit 1
;;
esac
done
# Example: https://dl-cdn.alpinelinux.org/alpine/latest-stable/releases/x86_64/
OS_ISO_URL="https://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/alpine-standard-${os_version}-${os_cpu_arch}.iso"
OS_ISO_NAME="alpine-standard-${os_version}-${os_cpu_arch}.iso"
OS_REPO="http://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/main/"
OS_MODLOOP_URL="http://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/netboot/modloop-lts"
VMLINUZ_URL="https://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/netboot/vmlinuz-lts"
INITRAMFS_URL="https://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/netboot/initramfs-lts"
VM_DIR_NAME="${OS_DISTRO}-${OS_CHANNEL}-${os_version}-${os_cpu_arch}"
VM_DIR_NAME_HASH="$(sha_256_digest $OS_ISO_URL)"
VM_DISK_DIR="$VM_DIR/$VM_DIR_NAME_HASH-$VM_DIR_NAME"
VM_DISK=$VM_DISK_DIR/disk.qcow2
# Prepare disk.
mkdir -p $VM_DISK_DIR
if [ ! -f $VM_DISK ]; then
qemu-img create -f qcow2 $VM_DISK $disk_size
fi
#TMP_DIR=$(mktemp -d 2>/dev/null || mktemp -d -t '${OS_DISTRO}-${OS_CHANNEL}-${os_cpu_arch}')
TMP_DIR="/tmp/$VM_DIR_NAME_HASH"
mkdir -p $TMP_DIR
# Now build the command line to run qemu.
QEMU=qemu-system-${os_cpu_arch}
QEMU_ARGS=(
-smp "cores=$cores,threads=$threads"
-m $ram_size
-hda $VM_DISK
)
if [ "y$HOST_CPU_ARCH" == "y$os_cpu_arch" ]; then
QEMU_ARGS+=(
#-cpu host
#-accel hvf
-enable-kvm
# -hda is already set in the base QEMU_ARGS above, so it is not repeated here.
-nic user
-boot d
-cdrom $TMP_DIR/$OS_ISO_NAME
)
wget -P $TMP_DIR/ -c $OS_ISO_URL
$QEMU "${QEMU_ARGS[@]}"
elif [ "yaarch64" == "y$os_cpu_arch" ]; then
QEMU_ARGS+=(
-M virt
-cpu cortex-a72
-initrd $TMP_DIR/initramfs-lts
-kernel $TMP_DIR/vmlinuz-lts
--append "console=ttyAMA0 ip=dhcp alpine_repo=$OS_REPO modloop=$OS_MODLOOP_URL"
-netdev user,id=unet
-device virtio-net-device,netdev=unet
-net user
-nographic
)
if [ $extract_only == 0 ]; then
wget -P $TMP_DIR/ -c $VMLINUZ_URL
wget -P $TMP_DIR/ -c $INITRAMFS_URL
$QEMU "${QEMU_ARGS[@]}"
fi
sudo modprobe nbd max_part=8
sudo qemu-nbd --connect=/dev/nbd0 $VM_DISK
mkdir -p $TMP_DIR/mnt/
sudo mount /dev/nbd0p1 $TMP_DIR/mnt/
sudo chmod a+r $TMP_DIR/mnt/initramfs-lts
sudo chmod a+r $TMP_DIR/mnt/vmlinuz-lts
cp $TMP_DIR/mnt/vmlinuz-lts $VM_DISK_DIR/vmlinuz-lts.img
cp $TMP_DIR/mnt/initramfs-lts $VM_DISK_DIR/initramfs-lts.img
sudo umount /dev/nbd0p1
# The device was attached with qemu-nbd above, so detach it with qemu-nbd as well.
sudo qemu-nbd --disconnect /dev/nbd0
sudo modprobe -r nbd
fi
# Clean up.
# Disable this while testing to prevent getting throttled by the CDN.
#rm -rf $TMP_DIR
}
function start_vm() {
local os_cpu_arch="$HOST_CPU_ARCH"
local os_version="$OS_VERSION"
local threads=$DEFAULT_VM_THREADS
local cores=$DEFAULT_VM_CORES
local ram_size="$DEFAULT_VM_RAM_SIZE"
while (("$#")); do
case "$1" in
-a | --arch)
echo "arch: $2"
os_cpu_arch="$2"
shift 2
;;
-r | --ram-size | --ram_size)
echo "ram_size: $2"
ram_size="$2"
shift 2
;;
-t | --threads)
echo "threads: $2"
threads="$2"
shift 2
;;
-c | --cores)
echo "cores: $2"
cores="$2"
shift 2
;;
-h | --help)
echo "help"
exit 0
;;
*)
# Unknown option: report it on stderr and fail with a non-zero status.
echo "error: unknown argument: $1" >&2
echo "help"
exit 1
;;
esac
done
# Example: https://dl-cdn.alpinelinux.org/alpine/latest-stable/releases/x86_64/
OS_ISO_URL="https://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/alpine-standard-${os_version}-${os_cpu_arch}.iso"
OS_ISO_NAME="alpine-standard-${os_version}-${os_cpu_arch}.iso"
OS_REPO="http://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/main/"
OS_MODLOOP_URL="http://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/netboot/modloop-lts"
VMLINUZ_URL="https://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/netboot/vmlinuz-lts"
INITRAMFS_URL="https://dl-cdn.alpinelinux.org/alpine/${OS_CHANNEL}/releases/${os_cpu_arch}/netboot/initramfs-lts"
VM_DIR_NAME="${OS_DISTRO}-${OS_CHANNEL}-${os_version}-${os_cpu_arch}"
VM_DIR_NAME_HASH="$(sha_256_digest $OS_ISO_URL)"
VM_DISK_DIR="$VM_DIR/$VM_DIR_NAME_HASH-$VM_DIR_NAME"
VM_DISK=$VM_DISK_DIR/disk.qcow2
# Now build the command line to run qemu.
QEMU=qemu-system-${os_cpu_arch}
QEMU_ARGS=(
-smp "cores=$cores,threads=$threads"
-m $ram_size
-hda $VM_DISK
-boot c
# A single user-mode NIC that also carries the SSH port forward.
-nic user,hostfwd=$SSHD_HOST_FWD
)
if [ "y$HOST_CPU_ARCH" == "y$os_cpu_arch" ]; then
QEMU_ARGS+=(
-enable-kvm
)
elif [ "yaarch64" == "y$os_cpu_arch" ]; then
# See: https://qemu-project.gitlab.io/qemu/system/linuxboot.html
QEMU_ARGS+=(
-M virt
-cpu cortex-a72
-initrd $VM_DISK_DIR/initramfs-lts.img
-kernel $VM_DISK_DIR/vmlinuz-lts.img
--append "console=ttyAMA0 root=/dev/vda3 rw rootfstype=ext4"
-nographic
)
fi
$QEMU "${QEMU_ARGS[@]}"
}
show_usage() {
echo "usage"
}
if [ $# -eq 0 ]; then
show_usage
exit 1
fi
while (("$#")); do
case "$1" in
setup)
shift
setup_vm "$@"
exit
;;
start)
shift
start_vm "$@"
exit
;;
-h | --help | help)
shift
show_usage
exit 0
;;
*)
echo
show_usage
exit 1
;;
esac
done
exit 0
This appears to be reproducible on Clear Linux running on an x86_64 machine (a Framework laptop) as well.
Earlier, on the same system running Fedora 37, Bazel built fine. The file system was btrfs then and is ext4 now; that is probably the main difference in configuration.
❯ uname -a
Linux ghostname 6.1.7-1247.native #1 SMP Wed Jan 18 08:32:41 PST 2023 x86_64 GNU/Linux
❯ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 916G 42G 828G 5% /
devtmpfs 32G 0 32G 0% /dev
tmpfs 32G 106M 32G 1% /dev/shm
tmpfs 13G 2.7M 13G 1% /run
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
tmpfs 32G 957M 31G 3% /tmp
clr_debug_fuse 916G 42G 828G 5% /usr/lib/debug
clr_debug_fuse 916G 42G 828G 5% /usr/src/debug
tmpfs 6.3G 6.9M 6.3G 1% /run/user/1000
❯ fdisk -l
Disk /dev/nvme0n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 1048576 bytes
Disklabel type: gpt
Disk identifier: 49CB04F8-1A89-48DB-8811-40A543D4865A
Device Start End Sectors Size Type
/dev/nvme0n1p1 2048 307199 305152 149M EFI System
/dev/nvme0n1p2 307200 1953523711 1953216512 931.4G Linux root (x86-64)
The build process is blocked waiting to read from something that is perhaps unavailable or stuck in an indefinite loop. I'm building this in tmpfs. Would that affect the process?
❯ env JAVA_HOME="$JAVA_HOME" \
EXTRA_BAZEL_ARGS="--tool_java_runtime_version=local_jdk" \
strace bash ./compile.sh
pipe2([3, 4], 0) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [INT TERM CHLD], [], 8) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x55ebf1e45a10) = 165935
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=0x55ebf20e7a02, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x55ebf1e82a60}, {sa_handler=0x55ebf20e7a02, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x55ebf1e82a60}, 8) = 0
close(4) = 0
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
read(3,
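The trace above only covers the parent shell, which appears to be blocked reading a pipe from the child it just cloned. To see what the child is doing, strace can follow forks and write everything to a file. This is a minimal sketch; the helper name and the log path are assumptions.

```shell
#!/bin/sh
# Run a command under strace, following forked children (-f), prefixing each
# line with a timestamp (-tt), and writing the log to a file (-o).
# (Hypothetical wrapper; only the strace flags themselves are real.)
trace_with_children() {
    log="$1"
    shift
    strace -f -tt -o "$log" "$@"
}

# Example, reusing the invocation from this thread (log path is an assumption):
#   trace_with_children /tmp/compile.strace \
#       env JAVA_HOME="$JAVA_HOME" \
#           EXTRA_BAZEL_ARGS="--tool_java_runtime_version=local_jdk" \
#           bash ./compile.sh
#   tail -n 40 /tmp/compile.strace
```

With -f, each log line starts with the PID that made the call, so the last lines for the child PID should show where it is waiting.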
Update:
On my machine the compile.sh script was blocked waiting, as in the strace log above. As soon as I set JAVA_HOME to (Clear Linux-specific):
/usr/lib/jvm/java-1.11.0-openjdk
it worked and started to build. The /usr/bin/java binary on the OS appears to just hang when invoked at the command line, so I'm not sure whether you have the same problem.
❯ java -version
(not responding)
❯ strace java -version
... looks like an indefinite loop ...
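To check whether a particular java binary hangs without tying up the terminal, it can be run under a time limit. This is a minimal sketch, assuming the timeout utility is available (GNU coreutils, or a recent BusyBox that takes the duration as its first argument); the helper name is hypothetical.

```shell
#!/bin/sh
# Run a command with a time limit. Return 0 if it exits successfully within
# the limit, and 1 if it fails or has to be killed by timeout.
responds_within() {
    limit="$1"
    shift
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "ok: '$*' finished within ${limit}s"
    else
        status=$?
        echo "failed or hung: '$*' (exit status ${status})" >&2
        return 1
    fi
}

# Example: compare the default java with the one under JAVA_HOME.
#   responds_within 10 java -version
#   responds_within 10 "$JAVA_HOME/bin/java" -version
```

The exit status reported on a timeout differs between timeout implementations, so the sketch only distinguishes success from everything else.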
On an aarch64 VM, an strace log shows:
munmap(0xffff077e8000, 24576) = 0
mmap(NULL, 45056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff077d7000
munmap(0xffff077e2000, 24576) = 0
madvise(0xffff077d8000, 16384, MADV_FREE) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [INT TERM CHLD], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [INT TERM CHLD], 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[], ~[KILL STOP RTMIN RT_1 RT_2], 8) = 0
clone(child_stack=NULL, flags=SIGCHLD) = 9780
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1 RT_2], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
madvise(0xffff077dd000, 16384, MADV_FREE) = 0
munmap(0xffff077d7000, 45056) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaace15d5b50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffff079a5a90}, {sa_handler=0xaaace15f4110, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffff079a5a90}, 8) = 0
wait4(-1,
Still gets stuck on "Patching repository....". I will check one of the EC2 instances as well
Could you run the build under strace (example below) and paste the tail of the log here to determine where the compile process is blocked waiting? (You may have to install strace.)
❯ doas apk add strace
❯ env JAVA_HOME="<your java home>" \
EXTRA_BAZEL_ARGS="--tool_java_runtime_version=local_jdk" \
strace bash ./compile.sh
❯ strace java -version
@yesudeep Hello, I just got around to this.
strace java -version output:
mprotect(0xffffb63ce000, 634880, PROT_READ) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0
membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0) = -1 EPERM (Operation not permitted)
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], ~[KILL STOP RTMIN RT_1 RT_2], 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
rt_sigaction(SIGRT_2, {sa_handler=0xffffb657a0e4, sa_mask=~[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0xffffb65a2e64}, NULL, 8) = 0
rt_sigaction(SIGRT_2, {sa_handler=SIG_IGN, sa_mask=~[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0xffffb65a2e64}, NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1 RT_2], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
getpid() = 3275
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffb65fc000
mmap(NULL, 73728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffb555b000
getpid() = 3275
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) = 0
mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffb535a000
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0
clone(child_stack=0xffffb555aab0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|0x400000, parent_tid=[3276], tls=0xffffb555aba8, child_tidptr=0xffffb6603230) = 3276
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0xffffb555ab08, FUTEX_WAIT_PRIVATE, 2, NULLopenjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-alpine-r0)
OpenJDK 64-Bit Server VM (build 11.0.18+10-alpine-r0, mixed mode)
) = 0
munmap(0xffffb535a000, 2101248) = 0
exit_group(0) = ?
+++ exited with 0 +++
Here is the tail of the output of running the compilation where it gets stuck:
munmap(0xffffb12e6000, 4096) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [INT TERM CHLD], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [INT TERM CHLD], 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [INT TERM CHLD], 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[], ~[KILL STOP RTMIN RT_1 RT_2], 8) = 0
clone(child_stack=NULL, flags=SIGCHLD) = 4325
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1 RT_2], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
munmap(0xffffb12e7000, 8192) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaaad1d94b1c, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffb1406e64}, {sa_handler=0xaaaad1db0698, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffb1406e64}, 8) = 0
I'm also experiencing this error sporadically while trying to build Bazel 6.0.0 for arm64 in Alpine for arm64 under QEMU, although it strangely succeeded once for me in CI. Locally it seems to fail at a slightly different place each time. The following Dockerfile hangs for me:
FROM alpine:3.17
RUN apk update && \
apk add --no-cache \
bash \
build-base \
curl \
linux-headers \
openjdk11-jdk \
python3 \
strace \
unzip \
zip
# Build Bazel
# TODO: Remove when Bazel 5.2.0+ is available in Alpine
# https://github.com/bazelbuild/bazel/pull/14391
ARG BAZEL_VERSION=6.0.0
RUN mkdir -p /tmp/bazel-release
WORKDIR /tmp/bazel-release
RUN curl -sSLO https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-dist.zip && unzip -q bazel-${BAZEL_VERSION}-dist.zip
RUN env JAVA_HOME="/usr/lib/jvm/java-11-openjdk" EXTRA_BAZEL_ARGS="--tool_java_runtime_version=local_jdk --curses=no" bash ./compile.sh
RUN install -D output/bazel /usr/local/bin/bazel
After some time, CPU and network usage go to zero but the command never exits. I can reproduce this in QEMU on x86_64 and on native arm64 AWS Graviton2 CPUs. I tried starting the Alpine image from scratch and adding strace to a few commands (this doesn't work when using QEMU, it works on native arm64 Docker only):
/tmp/bazel-release # strace java --version
execve("/usr/bin/java", ["java", "--version"], 0xffffc069d098 /* 7 vars */) = 0
set_tid_address(0xffff9b754230) = 31
brk(NULL) = 0xaaaae81f2000
brk(0xaaaae81f4000) = 0xaaaae81f4000
mmap(0xaaaae81f2000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xaaaae81f2000
readlinkat(AT_FDCWD, "/proc/self/exe", "/usr/lib/jvm/java-11-openjdk/bin"..., 512) = 37
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/libjli.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=67264, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 135168, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xffff9b681000
mmap(0xffff9b6a0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xf000) = 0xffff9b6a0000
close(3) = 0
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/../libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=132880, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 200704, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xffff9b650000
mmap(0xffff9b67f000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1f000) = 0xffff9b67f000
close(3) = 0
mprotect(0xffff9b6a0000, 4096, PROT_READ) = 0
mprotect(0xffff9b67f000, 4096, PROT_READ) = 0
mprotect(0xaaaad9f4f000, 4096, PROT_READ) = 0
readlinkat(AT_FDCWD, "/proc/self/exe", "/usr/lib/jvm/java-11-openjdk/bin"..., 4096) = 37
faccessat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/libjava.so", F_OK) = 0
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/jvm.cfg", O_RDONLY|O_LARGEFILE) = 3
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9b74d000
read(3, "-server KNOWN\n-client IGNORE\n", 1024) = 29
read(3, "", 1024) = 0
close(3) = 0
munmap(0xffff9b74d000, 4096) = 0
newfstatat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/server/libjvm.so", {st_mode=S_IFREG|0644, st_size=15872104, ...}, 0) = 0
execve("/usr/lib/jvm/java-11-openjdk/bin/java", ["java", "--version"], 0xaaaad9f50a80 /* 8 vars */) = 0
set_tid_address(0xffff9ccb5230) = 31
brk(NULL) = 0xaaaad7a36000
brk(0xaaaad7a38000) = 0xaaaad7a38000
mmap(0xaaaad7a36000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xaaaad7a36000
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/server/libjli.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/libjli.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/../lib/libjli.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
readlinkat(AT_FDCWD, "/proc/self/exe", "/usr/lib/jvm/java-11-openjdk/bin"..., 512) = 37
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/libjli.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=67264, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 135168, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xffff9cbe2000
mmap(0xffff9cc01000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xf000) = 0xffff9cc01000
close(3) = 0
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/server/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/../lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/../libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/jli/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/bin/../lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld-musl-aarch64.path", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/libz.so.1", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0755, st_size=132880, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 200704, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xffff9cbb1000
mmap(0xffff9cbe0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1f000) = 0xffff9cbe0000
close(3) = 0
mprotect(0xffff9cc01000, 4096, PROT_READ) = 0
mprotect(0xffff9cbe0000, 4096, PROT_READ) = 0
mprotect(0xaaaace7df000, 4096, PROT_READ) = 0
readlinkat(AT_FDCWD, "/proc/self/exe", "/usr/lib/jvm/java-11-openjdk/bin"..., 4096) = 37
faccessat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/libjava.so", F_OK) = 0
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/jvm.cfg", O_RDONLY|O_LARGEFILE) = 3
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9ccae000
read(3, "-server KNOWN\n-client IGNORE\n", 1024) = 29
read(3, "", 1024) = 0
close(3) = 0
munmap(0xffff9ccae000, 4096) = 0
newfstatat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/server/libjvm.so", {st_mode=S_IFREG|0644, st_size=15872104, ...}, 0) = 0
openat(AT_FDCWD, "/usr/lib/jvm/java-11-openjdk/lib/server/libjvm.so", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=15872104, ...}) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 960) = 960
mmap(NULL, 16343040, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xffff9bc1b000
mmap(0xffff9ca70000, 1314816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xe55000) = 0xffff9ca70000
mmap(0xffff9cb3e000, 471040, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xffff9cb3e000
close(3) = 0
mprotect(0xffff9ca70000, 634880, PROT_READ) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0
membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0) = -1 EPERM (Operation not permitted)
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], ~[KILL STOP RTMIN RT_1 RT_2], 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
rt_sigaction(SIGRT_2, {sa_handler=0xffff9cc273c4, sa_mask=~[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0xffff9cc52a90}, NULL, 8) = 0
rt_sigaction(SIGRT_2, {sa_handler=SIG_IGN, sa_mask=~[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0xffff9cc52a90}, NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1 RT_2], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
getpid() = 31
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9ccae000
mmap(NULL, 73728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9bc09000
getpid() = 31
rt_sigprocmask(SIG_UNBLOCK, [RT_1 RT_2], NULL, 8) = 0
membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) = 0
mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffff9ba08000
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0
clone(child_stack=0xffff9bc08ab0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|0x400000, parent_tid=[32], tls=0xffff9bc08ba8, child_tidptr=0xffff9ccb5230) = 32
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0xffff9bc08b08, FUTEX_WAIT_PRIVATE, 2, NULLopenjdk 11.0.18 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-alpine-r0)
OpenJDK 64-Bit Server VM (build 11.0.18+10-alpine-r0, mixed mode)
) = 0
munmap(0xffff9ba08000, 2101248) = 0
exit_group(0) = ?
+++ exited with 0 +++
Tail of env JAVA_HOME="/usr/lib/jvm/java-11-openjdk" EXTRA_BAZEL_ARGS="--tool_java_runtime_version=local_jdk --curses=no" strace bash ./compile.sh
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2212, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 2212
wait4(-1, 0xffffcd31a0d0, WNOHANG, NULL) = -1 ECHILD (No child process)
rt_sigreturn({mask=[INT]}) = 19
read(3, "", 4096) = 0
close(3) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaaaacc75b50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, {sa_handler=0xaaaaacc94110, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaaaacc94110, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, {sa_handler=0xaaaaacc75b50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
pipe2([3, 4], 0) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [INT TERM CHLD], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [INT TERM CHLD], 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[], ~[KILL STOP RTMIN RT_1 RT_2], 8) = 0
clone(child_stack=NULL, flags=SIGCHLD) = 2213
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1 RT_2], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=0xaaaaacc78d34, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0xffffa9c4ca90}, {sa_handler=0xaaaaacc78d34, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0xffffa9c4ca90}, 8) = 0
close(4) = 0
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2213, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, NULL) = 2213
wait4(-1, 0xffffcd31b1f0, WNOHANG, NULL) = -1 ECHILD (No child process)
rt_sigreturn({mask=[INT]}) = 0
read(3, "/tmp/bazel-release\n", 4096) = 19
read(3, "", 4096) = 0
close(3) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaaaacc75b50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, {sa_handler=0xaaaaacc94110, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaaaacc94110, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, {sa_handler=0xaaaaacc75b50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [INT TERM CHLD], [], 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [INT TERM CHLD], 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [INT TERM CHLD], 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[], ~[KILL STOP RTMIN RT_1 RT_2], 8) = 0
clone(child_stack=NULL, flags=SIGCHLD) = 2214
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1 RT_2], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [INT TERM CHLD], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {sa_handler=0xaaaaacc75b50, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, {sa_handler=0xaaaaacc94110, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0xffffa9c4ca90}, 8) = 0
wait4(-1,
I don't really understand any of this, but I noticed child processes are exiting in a sequence like 2212, 2213, etc. When this output was displayed, the container was running the following processes, including 2214 (which I think is hung):
/ # ps aux | grep jvm
998 root 0:18 /usr/lib/jvm/java-11-openjdk/bin/java -XX:+HeapDumpOnOutOfMemoryError -Xverify:none -Dfile.encoding=ISO-8859-1 -XX:HeapDumpPath=/tmp/bazel_XXgEemBN -Djava.util.logging.config.file=/tmp/bazel_XXgEemBN/javalog.properties -jar /tmp/bazel_XXgEemBN/archive/libblaze.jar --batch --install_base=/tmp/bazel_XXgEemBN/archive --output_base=/tmp/bazel_XXgEemBN/out --failure_detail_out=/tmp/bazel_XXgEemBN/failure_detail.rawproto --output_user_root=/tmp/bazel_XXgEemBN/user_root --install_md5= --default_system_javabase=/usr/lib/jvm/java-11-openjdk --workspace_directory=/tmp/bazel-release --nofatal_event_bus_exceptions build --ignore_unsupported_sandboxing --startup_time=329 --extract_data_time=523 --rc_source=/dev/null --isatty=1 --build_python_zip --client_env=PWD=/tmp/bazel-release --client_env=JAVA_HOME=/usr/lib/jvm/java-11-openjdk --client_env=SHLVL=2 --client_env=HOME=/root --client_env=HOSTNAME=418b04cc98ce --client_env=TERM=xterm --client_env=OLDPWD=/tmp/bazel-release --client_env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --client_env=EXTRA_BAZEL_ARGS=--tool_java_runtime_version=local_jdk --curses=no --client_cwd=/tmp/bazel-release --spawn_strategy=standalone --nojava_header_compilation --strategy=Javac=worker --worker_quit_after_build --ignore_unsupported_sandboxing --compilation_mode=opt --distdir=derived/distdir --extra_toolchains=//scripts/bootstrap:bootstrap_toolchain_definition --tool_java_runtime_version=local_jdk --curses=no --verbose_failures --javacopt=-g -source 11 -target 11 --stamp --embed_label 6.0.0- (@non-git) src:bazel_nojdk --action_env=PATH --host_platform=@local_config_platform//:host --platforms=@local_config_platform//:host
1257 root 0:00 {skyframe-evalua} /usr/lib/jvm/java-11-openjdk/bin/java -XX:+HeapDumpOnOutOfMemoryError -Xverify:none -Dfile.encoding=ISO-8859-1 -XX:HeapDumpPath=/tmp/bazel_XXgEemBN -Djava.util.logging.config.file=/tmp/bazel_XXgEemBN/javalog.properties -jar /tmp/bazel_XXgEemBN/archive/libblaze.jar --batch --install_base=/tmp/bazel_XXgEemBN/archive --output_base=/tmp/bazel_XXgEemBN/out --failure_detail_out=/tmp/bazel_XXgEemBN/failure_detail.rawproto --output_user_root=/tmp/bazel_XXgEemBN/user_root --install_md5= --default_system_javabase=/usr/lib/jvm/java-11-openjdk --workspace_directory=/tmp/bazel-release --nofatal_event_bus_exceptions build --ignore_unsupported_sandboxing --startup_time=329 --extract_data_time=523 --rc_source=/dev/null --isatty=1 --build_python_zip --client_env=PWD=/tmp/bazel-release --client_env=JAVA_HOME=/usr/lib/jvm/java-11-openjdk --client_env=SHLVL=2 --client_env=HOME=/root --client_env=HOSTNAME=418b04cc98ce --client_env=TERM=xterm --client_env=OLDPWD=/tmp/bazel-release --client_env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --client_env=EXTRA_BAZEL_ARGS=--tool_java_runtime_version=local_jdk --curses=no --client_cwd=/tmp/bazel-release --spawn_strategy=standalone --nojava_header_compilation --strategy=Javac=worker --worker_quit_after_build --ignore_unsupported_sandboxing --compilation_mode=opt --distdir=derived/distdir --extra_toolchains=//scripts/bootstrap:bootstrap_toolchain_definition --tool_java_runtime_version=local_jdk --curses=no --verbose_failures --javacopt=-g -source 11 -target 11 --stamp --embed_label 6.0.0- (@non-git) src:bazel_nojdk --action_env=PATH --host_platform=@local_config_platform//:host --platforms=@local_config_platform//:host
2214 root 0:08 /usr/lib/jvm/java-11-openjdk/bin/java -XX:+HeapDumpOnOutOfMemoryError -Xverify:none -Dfile.encoding=ISO-8859-1 -XX:HeapDumpPath=/tmp/bazel_XXDODcEg -Djava.util.logging.config.file=/tmp/bazel_XXDODcEg/javalog.properties -jar /tmp/bazel_XXDODcEg/archive/libblaze.jar --batch --install_base=/tmp/bazel_XXDODcEg/archive --output_base=/tmp/bazel_XXDODcEg/out --failure_detail_out=/tmp/bazel_XXDODcEg/failure_detail.rawproto --output_user_root=/tmp/bazel_XXDODcEg/user_root --install_md5= --default_system_javabase=/usr/lib/jvm/java-11-openjdk --workspace_directory=/tmp/bazel-release --nofatal_event_bus_exceptions build --ignore_unsupported_sandboxing --startup_time=329 --extract_data_time=523 --rc_source=/dev/null --isatty=1 --build_python_zip --client_env=PWD=/tmp/bazel-release --client_env=JAVA_HOME=/usr/lib/jvm/java-11-openjdk --client_env=SHLVL=2 --client_env=HOME=/root --client_env=HOSTNAME=418b04cc98ce --client_env=TERM=xterm --client_env=OLDPWD=/tmp/bazel-release --client_env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --client_env=EXTRA_BAZEL_ARGS=--tool_java_runtime_version=local_jdk --curses=no --client_cwd=/tmp/bazel-release --spawn_strategy=standalone --nojava_header_compilation --strategy=Javac=worker --worker_quit_after_build --ignore_unsupported_sandboxing --compilation_mode=opt --distdir=derived/distdir --extra_toolchains=//scripts/bootstrap:bootstrap_toolchain_definition --tool_java_runtime_version=local_jdk --curses=no --verbose_failures --javacopt=-g -source 11 -target 11 --stamp --embed_label 6.0.0- (@non-git) src:bazel_nojdk --action_env=PATH --host_platform=@local_config_platform//:host --platforms=@local_config_platform//:host
2461 root 0:00 {skyframe-evalua} /usr/lib/jvm/java-11-openjdk/bin/java -XX:+HeapDumpOnOutOfMemoryError -Xverify:none -Dfile.encoding=ISO-8859-1 -XX:HeapDumpPath=/tmp/bazel_XXDODcEg -Djava.util.logging.config.file=/tmp/bazel_XXDODcEg/javalog.properties -jar /tmp/bazel_XXDODcEg/archive/libblaze.jar --batch --install_base=/tmp/bazel_XXDODcEg/archive --output_base=/tmp/bazel_XXDODcEg/out --failure_detail_out=/tmp/bazel_XXDODcEg/failure_detail.rawproto --output_user_root=/tmp/bazel_XXDODcEg/user_root --install_md5= --default_system_javabase=/usr/lib/jvm/java-11-openjdk --workspace_directory=/tmp/bazel-release --nofatal_event_bus_exceptions build --ignore_unsupported_sandboxing --startup_time=329 --extract_data_time=523 --rc_source=/dev/null --isatty=1 --build_python_zip --client_env=PWD=/tmp/bazel-release --client_env=JAVA_HOME=/usr/lib/jvm/java-11-openjdk --client_env=SHLVL=2 --client_env=HOME=/root --client_env=HOSTNAME=418b04cc98ce --client_env=TERM=xterm --client_env=OLDPWD=/tmp/bazel-release --client_env=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin --client_env=EXTRA_BAZEL_ARGS=--tool_java_runtime_version=local_jdk --curses=no --client_cwd=/tmp/bazel-release --spawn_strategy=standalone --nojava_header_compilation --strategy=Javac=worker --worker_quit_after_build --ignore_unsupported_sandboxing --compilation_mode=opt --distdir=derived/distdir --extra_toolchains=//scripts/bootstrap:bootstrap_toolchain_definition --tool_java_runtime_version=local_jdk --curses=no --verbose_failures --javacopt=-g -source 11 -target 11 --stamp --embed_label 6.0.0- (@non-git) src:bazel_nojdk --action_env=PATH --host_platform=@local_config_platform//:host --platforms=@local_config_platform//:host
Is the space in --embed_label 6.0.0- (@non-git) valid syntax?
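One thing worth keeping in mind when reading the process listing above: a process listing prints the argument vector joined with spaces, so any quoting used on the original command line is no longer visible. The sketch below illustrates that with two hypothetical helper functions (count_args and show_joined are made up for this example and are not part of Bazel or compile.sh). It does not confirm how the bootstrap script actually quotes the label.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Hypothetical helper: joins its arguments with spaces, the way a
# process listing displays a command line.
show_joined() {
  echo "$*"
}

# Quoted: the label reaches the program as a single argument after the flag.
count_args --embed_label "6.0.0- (@non-git)"

# Unquoted (parentheses escaped so the shell accepts them): the label is
# split into two separate arguments.
count_args --embed_label 6.0.0- \(@non-git\)

# Once joined for display, both forms look identical.
show_joined --embed_label "6.0.0- (@non-git)"
show_joined --embed_label 6.0.0- \(@non-git\)
```

If the label was passed quoted, the space in the listing is only a display artifact; the listing alone cannot tell the two cases apart.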
@yesudeep I can currently reproduce this hang reliably while building Bazel 6.0.0 under Alpine in QEMU on my local Fedora x86_64 machine with btrfs, but the exact same build succeeds (albeit very slowly) under GitHub Actions. Is there anything I can check to help narrow things down?
I can reproduce this with Bazel 6.1.0 on native ARM as well. I've uploaded the last 4000 lines of strace output here: https://pastebin.com/abN6i5qa
Stepping inside the hanging container and viewing the contents of /tmp/bazel_08JYbuZU/phase shows "Building output/bazel". The output to the user shows "Fetching repository @bazelci_rules; Patching repository 95s" and the timer will keep increasing forever.
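For anyone who wants to repeat that check, here is a minimal sketch of reading the phase file from inside the container. It assumes the bootstrap temp directories match /tmp/bazel_*, as in the paths quoted in this thread; show_phases is a hypothetical helper name, not something shipped with Bazel.

```shell
#!/usr/bin/env bash
# Print the contents of every bootstrap "phase" file under a temp root.
# Assumption: temp directories are named <root>/bazel_*, as seen in this thread.
show_phases() {
  local root="${1:-/tmp}"
  local f
  for f in "$root"/bazel_*/phase; do
    [ -f "$f" ] || continue   # skip when the glob matched nothing
    printf '%s: %s\n' "$f" "$(cat "$f")"
  done
}

show_phases /tmp
```

Running it a few times while the build appears stuck should show whether the phase ever changes.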
Full strace output including "Building Bazel from scratch", then hanging immediately after "Building Bazel with Bazel" (1.7MB txt file) can be downloaded here: https://drive.google.com/file/d/1m_dPN3xYRvNT8_f-k6KtKgt7K8swCHBx
@yesudeep @meteorcloudy @seanmor5 @strophy @sgowroji I built it on mips64le and got the following error:
root@ed7d09768525:~# env EXTRA_BAZEL_ARGS="--tool_java_runtime_version=local_jdk" BAZEL_JAVAC_OPTS="-J-Xms1g -J-Xmx64g" bash ./compile.sh --jobs=10
🍃 Building Bazel from scratch.. ....
🍃 Building Bazel with Bazel.
.OpenJDK 64-Bit Zero VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
Loading:
Fetching repository @bazelci_rules; Patching repository
INFO: Repository bazelci_rules instantiated at:
/root/WORKSPACE:258:18: in <toplevel>
/root/distdir.bzl:94:17: in dist_http_archive
Repository rule http_archive defined at:
/root/tools/build_defs/repo/http.bzl:372:31: in <toplevel>
ERROR: An error occurred during the fetch of repository 'bazelci_rules':
Traceback (most recent call last):
File "/root/tools/build_defs/repo/http.bzl", line 143, column 10, in _http_archive_impl
patch(ctx, auth = auth)
File "/root/tools/build_defs/repo/utils.bzl", line 193, column 21, in patch
fail("Error applying patch command %s:\n%s%s" %
Error in fail: Error applying patch command test -f BUILD && chmod u+w BUILD || true:
java.io.IOException: Cannot run program "bash" (in directory "/tmp/bazel_3r6NM4mZ/out/external/bazelci_rules"): error=0, Failed to exec spawn helper: pid: 6613, exit value: 1
ERROR: /root/WORKSPACE:258:18: fetching http_archive rule //external:bazelci_rules: Traceback (most recent call last):
File "/root/tools/build_defs/repo/http.bzl", line 143, column 10, in _http_archive_impl
patch(ctx, auth = auth)
File "/root/tools/build_defs/repo/utils.bzl", line 193, column 21, in patch
fail("Error applying patch command %s:\n%s%s" %
Error in fail: Error applying patch command test -f BUILD && chmod u+w BUILD || true:
java.io.IOException: Cannot run program "bash" (in directory "/tmp/bazel_3r6NM4mZ/out/external/bazelci_rules"): error=0, Failed to exec spawn helper: pid: 6613, exit value: 1
ERROR: Error computing the main repository mapping: no such package '@bazelci_rules//': Error applying patch command test -f BUILD && chmod u+w BUILD || true:
java.io.IOException: Cannot run program "bash" (in directory "/tmp/bazel_3r6NM4mZ/out/external/bazelci_rules"): error=0, Failed to exec spawn helper: pid: 6613, exit value: 1
Loading:
ERROR: Could not build Bazel
I tried building again with Bazel 7.2.1 and did not encounter this error anymore, allowing me to package Bazel for Alpine here: https://pkgs.alpinelinux.org/package/edge/testing/aarch64/bazel7
Can anyone else confirm this is no longer an issue?
Description of the bug:
Hi there, a while ago I opened: https://github.com/bazelbuild/bazel/issues/16484
I was able to get around the issue by running the container on an x86 Linux machine. I am now trying to do the same thing with aarch64. I assumed the issue was exclusive to Docker just being bad on Mac; however, I am running into the same issue on EC2. I've tried various versions of Alpine and Bazel (5.3, 6.0) with no success. This is what I run to bootstrap:
I've tried this on these EC2 AMIs, as well as on a Raspberry Pi with Alpine 3.16 installed:
alpine-ami-3.14.2-aarch64-r0 ami-00604621aea32b1f5
alpine-3.16.0-x86_64-bios-cloudinit-r0 ami-0c9f21a3f1772d2d8
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
This is what I've been using to bootstrap:
Which operating system are you running Bazel on?
Alpine
What is the output of bazel info release?
n/a
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
See above
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
Most of the time it just hangs at:
Or something like "patching repository" for one of the first few packages.