kenballus opened this issue 2 weeks ago
Another unlikely possibility: 2 bits are wasted randomizing 4k-aligned addresses instead of 16k.
No, the section alignment is 64k anyway, and the addresses are aligned to that.
I can replicate this result. As a sanity check of the experimental setup, I tried replacing the subprocess call with `addr = secrets.randbits(18)` and confirmed that it produced the expected result of ~0.5.

Here's my optimised version of the experiment, which spawns fewer subprocesses and therefore runs about 5x faster (not hugely important, but it makes testing things quicker):
```python
import multiprocessing
import subprocess

# Number of times we run the experiment
ITERATIONS: int = 1000

# Number of addresses we collect per experiment
SAMPLE_SIZE: int = 603


def experiment(sample_size: int) -> bool:
    """Grabs the libc base address sample_size times; returns whether any are duplicates."""
    seen: set[int] = set()
    for _ in range(sample_size):
        lines = subprocess.run(
            ["cat", "/proc/self/maps"],
            capture_output=True,
        ).stdout.decode().split("\n")
        addr = [int(l.split("-")[0], 16) for l in lines if "libc" in l][0]
        if addr in seen:
            return True
        seen.add(addr)
    return False


with multiprocessing.Pool(multiprocessing.cpu_count() // 2) as pool:
    print(sum(pool.map(experiment, [SAMPLE_SIZE] * ITERATIONS)) / ITERATIONS)
```
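The `secrets.randbits(18)` sanity check described above can also be sketched without spawning any processes. This is just a simulation, not part of the original experiment; I've used a seeded `random.Random` in place of `secrets` so the run is reproducible:

```python
import random

RND_BITS = 18      # matches vm.mmap_rnd_bits on the affected system
SAMPLE_SIZE = 603  # ~50% collision chance for 18 bits of entropy
ITERATIONS = 500

rng = random.Random(0)  # seeded for reproducibility; the real check used secrets.randbits


def experiment() -> bool:
    """Draw SAMPLE_SIZE fake 'addresses'; return whether any repeat."""
    seen: set[int] = set()
    for _ in range(SAMPLE_SIZE):
        addr = rng.getrandbits(RND_BITS)
        if addr in seen:
            return True
        seen.add(addr)
    return False


rate = sum(experiment() for _ in range(ITERATIONS)) / ITERATIONS
print(rate)  # should land near 0.5
```

This takes well under a second, which makes it handy for checking the statistics of the setup before blaming the kernel.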
I noticed that checking the addresses of `[stack]` and `/usr/bin/cat` gives similar results, whereas `[heap]` actually does produce the expected value of ~0.5. By observation, `[heap]` (which I think gets mmapped by libc?) appears to be 16k-aligned rather than 64k-aligned like the various LOAD segments. It does look like alignment might have something to do with it.
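A quick way to check those alignments in one pass is to compute the largest power of two dividing each base address. This is a sketch (the `mapping_alignments` helper is a name I made up, and the parsing assumes the usual `/proc/<pid>/maps` layout):

```python
def mapping_alignments(maps_text: str) -> dict[str, int]:
    """Map each region name to the largest power of two dividing its base address."""
    out: dict[str, int] = {}
    for line in maps_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        start = int(parts[0].split("-")[0], 16)
        name = parts[5] if len(parts) > 5 else "(anon)"
        # start & -start isolates the lowest set bit, i.e. the alignment
        out.setdefault(name, start & -start if start else 0)
    return out


if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        for name, align in mapping_alignments(f.read()).items():
            print(f"{name}: 0x{align:x}")
```

If the observation above holds, `[heap]` should report at most 0x4000 while the libc LOAD segments report 0x10000 or larger.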
Unproven hypothesis: if you have a 16k kernel with 64k-aligned userland binaries, you throw away 2 bits of entropy relative to whatever `vm.mmap_rnd_bits` is supposed to provide.
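That hypothesis is easy to simulate exhaustively: randomize at 16k granularity across 18 bits, force each base to 64k alignment as a 64k-aligned binary would, and count how many distinct values survive. A sketch, with the constants taken from the setup discussed above:

```python
import math

RND_BITS = 18    # vm.mmap_rnd_bits
PAGE_SHIFT = 14  # 16k pages: the randomization granularity

# Every possible 16k-granular random offset...
bases = {r << PAGE_SHIFT for r in range(1 << RND_BITS)}
# ...collapsed to 64k alignment, as a 64k-aligned mapping would force
aligned = {b & ~0xFFFF for b in bases}

lost = math.log2(len(bases) / len(aligned))
print(len(bases), len(aligned), lost)  # 262144 65536 2.0
```

Four consecutive 16k-granular offsets collapse onto each 64k-aligned value, which is exactly the 2 lost bits.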
Thanks for the feedback.
I tried running the script on some other aarch64 platforms.
QEMU aarch64 (virt-9.0)
i. OS: Armbian aarch64, Linux 6.6.47
ii. `vm.mmap_rnd_bits`: 18
iii. Page size: 4k
iv. Collision rate for sample size of 151 and 100 iterations: 0.5

Raspberry Pi 3 Model B v1.2
i. OS: Raspberry Pi OS 11 aarch64, Linux 6.1.0
ii. `vm.mmap_rnd_bits`: 18
iii. Page size: 4k
iv. Collision rate for sample size of 151 and 1000 iterations: 0.517

Raspberry Pi 3 Model B v1.2
i. OS: Raspberry Pi OS 10 aarch64, Linux 5.4.51
ii. `vm.mmap_rnd_bits`: 18
iii. Page size: 4k
iv. Collision rate for sample size of 151 and 100 iterations: 0.46
The takeaway here is that systems with 4k pages have this same problem, but even worse by another ~2 bits. I'm going to keep going back in time on the rpi to see if this has always been the behavior, or if this is a regression.
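The measured collision rates can be turned into effective-entropy estimates by inverting the same birthday approximation used to pick the sample sizes. A sketch (`effective_bits` is just a name I made up, not anything the kernel exposes):

```python
import math


def effective_bits(sample_size: int, collision_rate: float) -> float:
    """Estimate entropy bits from an observed duplicate rate.

    Inverts P(dup) ~= 1 - exp(-k**2 / (2*n)) for n, then takes log2(n).
    """
    n = sample_size ** 2 / (-2 * math.log(1 - collision_rate))
    return math.log2(n)


print(round(effective_bits(603, 0.9), 1))  # ~16.3: the original M1 measurement
print(round(effective_bits(151, 0.5), 1))  # ~14.0: the 4k-page measurements above
```

A rate of 0.5 at the chosen sample size recovers the nominal bit count, so deviations from 0.5 translate directly into missing bits.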
@kenballus did the binaries you tested against on those systems have 4k-aligned segments, too?
If my hypothesis is correct, loading a 64k-aligned binary on a 4k kernel with vm.mmap_rnd_bits=18 will result in only ~14 bits of entropy.
(If you want a test binary, `/usr/sbin/busybox` from Asahi Fedora should work.)
(This might be expected behavior. Apologies in advance.)
I saw this blog post about ASLR behaving weirdly on recent x86 Linux kernels, and was curious if Asahi was also affected.
By default, `vm.mmap_rnd_bits` is 18 on Asahi Fedora 40. Thus, we should expect something like $2^{18}$ equally likely possible values for the base address of libc. Under that assumption, if we collect 603 libc base addresses, we should expect about a 50% chance of there being at least one duplicate among them[^1].
[^1]: This follows from the birthday problem: set $n = 2^{18}$ and solve for the number of samples $k$ at which the collision probability $1 - e^{-k(k-1)/2n}$ reaches 0.5.
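All of the sample sizes used in this thread come from that same birthday-problem approximation; a small sketch for checking them:

```python
import math


def sample_size_for_half_collision(bits: int) -> int:
    """Smallest k giving ~50% chance of a repeat among k draws from 2**bits values."""
    n = 2 ** bits
    # Solve k*(k-1)/(2*n) = ln 2 for k (the standard birthday approximation).
    return round(0.5 + math.sqrt(0.25 + 2 * n * math.log(2)))


print(sample_size_for_half_collision(18))  # 603
print(sample_size_for_half_collision(16))  # 302
print(sample_size_for_half_collision(14))  # 151
```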
I wrote this little Python script to collect a bunch of libc base addresses and check for duplicates, then report how often it hits at least one duplicate:
The math would lead us to expect this script to spit out something around 0.5, but it reports something around 0.9 on my M1 MBA. (Feel free to run it for yourself; it only takes a minute or two.) This would suggest that the distribution of libc base addresses has less than 18 bits of entropy.

If you change `SAMPLE_SIZE` to 302, then the script will report a value close to 0.5. By the same math as before, this would suggest about 16 bits of effective entropy, instead of 18.

I'm puzzled as to why we seem to observe only 25% as many possible addresses as we should. Some potential reasons:
I'd really appreciate it if anyone has any ideas about the cause of this, or could suggest a more appropriate place to bring this up.
Thanks!