Closed Vogtinator closed 1 year ago
Consider using ToyBox compiled with musl-cross-make.
master jart@nightmare:~/blink$ o//blink/blink ./toybox bc
>>> 2 + 2
4
>>> master jart@nightmare:~/blink$
master jart@nightmare:~/blink$ o//blink/blink ./toybox stat .
File: .
Size: 4096 Blocks: 8 IO Blocks: 512 directory
Device: 803h/2051d Inode: 57571047 Links: 10 Device type: 0,0
Access: (0755/drwxr-xr-x) Uid: ( 1000/ jart) Gid: ( 1000/ jart)
Access: 2023-01-17 10:22:59.000000000 -0800
Modify: 2023-01-17 10:22:41.000000000 -0800
Change: 2023-01-17 10:22:41.000000000 -0800
master jart@nightmare:~/blink$ PS1='>: ' o//blink/blink ./toybox sh
>: ls
HTAGS LICENSE Makefile README.md TAGS blink build o perf.data test third_party tool toybox
>: master jart@nightmare:~/blink$
master jart@nightmare:~/blink$
ToyBox was created by the guy who made BusyBox, as a noble public service, because he felt guilty about the GPL. So you could think of it as a second-generation solution that learns from BusyBox's mistakes. It's real nice and I think Android uses it too. Like Blink and Cosmopolitan, Android doesn't support a lot of the weird kernel features Glibc requires apps to depend on.
Prebuilt binary here: toybox.zip
For the record, I intend to have Blink support the features that Glibc needs. So I'm going to leave this open until that can happen. It's just going to take more time for that to happen for BusyBox, whereas ToyBox works great today.
Yeah, this is mostly a bug report for incomplete or inaccurate emulation in blink.
Which features are missing for this in particular? I wouldn't expect missing features to cause a SEGV in memcpy.
I'm currently investigating it to learn more. The instructions that are faulting are supported and used by Cosmopolitan. Thanks for posting the binary. I'll report back when I learn more.
I had a look at the glibc code and played around in blinkenlights a bit. It looks like variables like __x86_shared_non_temporal_threshold, which are based on CPU cache sizes, are set to 0, and memcpy doesn't like that because it loops until the size is less than that:
https://github.com/bminor/glibc/blob/93967a2a7bbdcedb73e0b246713580c7c84d001e/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S#L745
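To illustrate the general failure mode described above, here is a toy model in C. It is not glibc's code: the function name, the block size, and the iteration cap are all made up for the illustration. It only shows that a loop which stops once the remaining size drops below a threshold can never stop when that threshold is 0 and the size is unsigned.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only, NOT glibc's memcpy: consume `block` bytes per iteration
   and stop once fewer than `threshold` bytes remain. `cap` is a safety
   limit so that the demonstration itself always terminates. */
size_t iterations_until_below(size_t size, size_t threshold,
                              size_t block, size_t cap) {
  assert(block > 0);
  size_t remaining = size;
  size_t n = 0;
  while (remaining >= threshold && n < cap) {
    remaining -= block; /* unsigned: wraps around once remaining < block */
    n++;
  }
  return n;
}
```

With a nonzero threshold the loop ends after a handful of iterations. With a threshold of 0 the comparison `remaining >= 0` is always true for an unsigned value, so only the artificial cap stops it. In a real copy loop there is no such cap, and the writes would continue past the end of the destination.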
That makes a lot of sense. I'm so happy that Blinkenlights' TUI helped make that easy for you to find. It looks like our CPUID implementation advertises a max leaf of 7, but CPUID leaf 4 (which reveals cache information) isn't implemented. I'll start implementing that now.
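For readers who want to see what CPUID leaf 4 reports on their own machine (or under an emulator), here is a minimal sketch using the `<cpuid.h>` helpers that ship with GCC and Clang on x86. The struct and function names are hypothetical and chosen for this example; the bit layout follows the documented format of leaf 4 (deterministic cache parameters). Note that some vendors report cache topology through other leaves, so an empty result is possible.

```c
#include <assert.h>
#include <cpuid.h> /* GCC/Clang helper header, x86 only */
#include <stdio.h>

/* Hypothetical container for one decoded leaf 4 subleaf. */
struct cache_info {
  unsigned type;  /* 0 = none, 1 = data, 2 = instruction, 3 = unified */
  unsigned level; /* 1 = L1, 2 = L2, ... */
  unsigned long long bytes;
};

/* Decode raw EAX/EBX/ECX values of a leaf 4 subleaf. Every size field
   is stored minus one, so 1 is added back before multiplying. */
void decode_leaf4(unsigned eax, unsigned ebx, unsigned ecx,
                  struct cache_info *out) {
  assert(out != NULL);
  unsigned long long ways = ((ebx >> 22) & 0x3ff) + 1;
  unsigned long long partitions = ((ebx >> 12) & 0x3ff) + 1;
  unsigned long long line_size = (ebx & 0xfff) + 1;
  unsigned long long sets = (unsigned long long)ecx + 1;
  out->type = eax & 0x1f;
  out->level = (eax >> 5) & 0x7;
  out->bytes = ways * partitions * line_size * sets;
}

/* Print every cache that leaf 4 reports; returns how many were listed. */
int print_leaf4_caches(void) {
  unsigned eax, ebx, ecx, edx;
  int count = 0;
  if (__get_cpuid_max(0, NULL) < 4) return 0; /* leaf 4 not advertised */
  for (unsigned sub = 0; sub < 32; sub++) {
    struct cache_info c;
    if (!__get_cpuid_count(4, sub, &eax, &ebx, &ecx, &edx)) break;
    decode_leaf4(eax, ebx, ecx, &c);
    if (c.type == 0) break; /* type 0 marks the end of the list */
    printf("L%u (type %u): %llu bytes\n", c.level, c.type, c.bytes);
    count++;
  }
  return count;
}
```

Calling `print_leaf4_caches()` from `main` natively and then under the emulator is one way to compare what the guest program sees. A return value of 0 means leaf 4 gave the program no cache information at all.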
I added the CPUID leafs but it's still having issues. I'm having fun rewinding from the point at which it goes past the end of allocated memory.
$ o//blink/blink -m ./busybox-static stat .
I2023-01-17T12:21:38.009117:blink/syscall.c:3309:127090 missing syscall 0x14e
I2023-01-17T12:21:38.011297:blink/throw.c:93:127090 SEGMENTATION FAULT AT ADDRESS 6bd440
PC 404820 movntdq %xmm4,0x1000(%rdi)
AX 000000000069c430 CX 0000000000000040 DX 00000000000000aa BX 0000000000636a68
SP 00004fffffffedc8 BP 00004ffffffff373 SI 0000000000656ab8 DI 00000000006bc440
R8 fffffffffffffff0 R9 0000000000000000 R10 fffffffffffffff8 R11 0000000000000000
R12 00004fffffffee30 R13 00004ffffffff1a8 R14 0000000000000001 R15 0000000000000001
FS 000000000069b3c0 GS 0000000000000000 OPS 4603 JIT 0
./busybox-static
4ffffffff373 000000404820 UNKNOWN 1451 bytes
3d4c4c454853002e 7361622f6e69622f UNKNOWN [MISALIGN] [CORRUPT FRAME POINTER]
000000400000-000000400fff 4096 100% r
000000401000-0000005f5fff 2004k 100% rx
0000005f6000-000000684fff 572k 100% r
000000685000-00000069afff 88k 50% rw
00000069b000-0000006bcfff 136k 100% rwx
4fffff800000-4fffffffffff 8192k 1% rw
I2023-01-17T12:21:38.011335:blink/blink.c:67:127090 terminating due to signal SIGSEGV
Segmentation fault
Apparently the cache info is vendor specific and glibc doesn't know what to do with GenuineBlink. By using GenuineIntel it works :-/
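As a rough illustration of why the vendor string matters, here is a small C sketch that reads the 12-byte vendor string from CPUID leaf 0 and runs it through a vendor-keyed check. The function names are hypothetical, and the list of recognized vendors is an assumption made for the example, not glibc's actual table. The point is only the shape of the problem: code that selects its cache-detection path by vendor name has nothing to fall back on for a name it has never seen.

```c
#include <assert.h>
#include <cpuid.h> /* GCC/Clang helper header, x86 only */
#include <stdio.h>
#include <string.h>

/* Assemble the vendor string; leaf 0 returns it in EBX, EDX, ECX order. */
void vendor_from_regs(unsigned ebx, unsigned edx, unsigned ecx,
                      char out[13]) {
  assert(out != NULL);
  memcpy(out + 0, &ebx, 4);
  memcpy(out + 4, &edx, 4);
  memcpy(out + 8, &ecx, 4);
  out[12] = '\0';
}

/* Hypothetical vendor-keyed dispatch. The vendor list here is an
   assumption for illustration and is not taken from glibc. */
int vendor_has_known_cache_path(const char *vendor) {
  return strcmp(vendor, "GenuineIntel") == 0 ||
         strcmp(vendor, "AuthenticAMD") == 0;
}

/* Print the vendor string of the CPU (or emulator) we are running on.
   Returns 1 on success, 0 if CPUID leaf 0 is unavailable. */
int print_cpu_vendor(void) {
  unsigned eax, ebx, ecx, edx;
  char vendor[13];
  if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return 0;
  vendor_from_regs(ebx, edx, ecx, vendor);
  printf("vendor=%s known=%d\n", vendor,
         vendor_has_known_cache_path(vendor));
  return 1;
}
```

Under this sketch, a string such as GenuineBlink falls through the check, and whatever values the vendor-specific path would have filled in keep their defaults.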
Wow. That's almost as bad as the uname("Blink 4.0") thing I needed to do. I'm going to need to rebuild a lot of Cosmo programs but I'm glad we spotted this sooner rather than later. Thanks for troubleshooting this.
Looks like this issue was reported two weeks ago and is already fixed in glibc git: https://sourceware.org/bugzilla/show_bug.cgi?id=29953
There are easy-to-reproduce segmentation faults when using a statically linked busybox with glibc: busybox.zip
bc
stat
sh (tab completion, press tab twice on the prompt)
They happen both on x86_64 Linux and in the WASM/emscripten version.