Closed maxice8 closed 2 years ago
This is not good.
Can you tell me what version of bc you are testing, along with the version of Alpine and musl? Oh, I forgot. Could you also tell me the compiler, its version, and the commands you are using to build bc?
Version: 5.0.1
Alpine: Edge
musl: 1.2.2-r5
gcc:
gcc (Alpine 10.3.1_git20210625) 10.3.1 20210625
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
commands:
build() {
PREFIX=/usr DESTDIR="$pkgdir" EXECSUFFIX=-howard ./configure.sh -GN
make
}
check() {
make test
}
package() {
make install
}
Full log of the build and failure here: https://build.alpinelinux.org/buildlogs/build-edge-aarch64/testing/howard-bc/howard-bc-5.0.1-r0.log
Thank you. I'm trying to get a cross-compilation toolchain working and will debug the issue.
I have bad news: I can't reproduce. I can't reproduce on x86_64 with gcc and glibc. I can't reproduce on the same platform with the same musl. I finally got a cross-compilation toolchain working, and I can't reproduce it under QEMU.
I think that in order to reproduce this, I need to buy an aarch64 machine. This may take me a while.
Is there an imminent release for Alpine coming up?
Looking at the Alpine CI for aarch64, it passed completely; it seems like only the builders have this failure.
There is no release imminent, but it is always nice to keep things working and available.
@ikke, is there something different on the builders that could cause the crash? Maybe a blacklisted syscall?
I can reproduce it in my aarch64 lxc container. It reaches this point:
Running dc error file tests/dc/errors/33.txt...pass
Running dc error file tests/dc/errors/33.txt through cat...
and then starts to use an unbounded amount of memory, and eventually gets killed by the oom-killer:
oom-kill:constraint=CONSTRAINT_CPUSET,nodemask=(null),cpuset=lxc.payload.build-edge-aarch64,mems_allowed=1,global_oom,task_memcg=/lxc.payload.ikke-edge-aarch64,task=dc,pid=49558,uid=1000
Out of memory: Killed process 49558 (dc) total-vm:819831256kB, anon-rss:162755232kB, file-rss:556kB, shmem-rss:0kB, UID:1000 pgtables:318760kB oom_score_adj:0
oom_reaper: reaped process 49558 (dc), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Hmm... I had this problem with FreeBSD too. What happens is that this test is meant to make sure that bc does not crash when it can't allocate memory. Obviously, this test will fail when the OS lies too much about the memory it can give bc.
I haven't had this problem on glibc, though, on a machine with 32GB of memory, so I don't know why it decided to have problems here.
I have a solution, though: I can accept a SIGKILL as a passing result. I don't think that doing this will be a problem because SIGKILL always has to come from outside, not from some problem in bc itself (besides allocating too much memory, but that's a problem with the OS if it lies too much).
I will have a release out within the day with that change, if that is an acceptable solution to you both.
What is then most likely the issue is that even though the host has lots of memory, the containers are limited to the memory of just a single NUMA domain, so half the memory that is available on the host.
I have changed the test to not cause OOM conditions. I'm going to run my release regimen and release 5.0.2 for you.
5.0.2 is out. I hope this one works for you all! Please reopen if it does not.
Thanks, can confirm that 5.0.2 is no longer failing.
The referenced file tests/dc/errors/33.txt contains binary data.