Note that ASan does not instrument mmap, so catching the signal and aborting is the
best it can do here. Clang's segfault is unexpected, of course.
Original comment by tetra2...@gmail.com
on 29 Sep 2014 at 10:52
Thanks for replying.
I've noticed there is an mmap interceptor in compiler-rt/lib/tsan/ which could
probably be shared with ASan; that way such errors could be
detected more robustly.
Thanks again,
Kirill
Original comment by kirill.s...@gmail.com
on 29 Sep 2014 at 11:01
Can you print the page addresses to make sure the second access doesn't fall
into the first page (or any other mapped page)?
Regarding the mmap() interceptor, I think this is a half-measure that won't
help, for example, when mmap() is called from inside libc via a local call, or when
the raw syscall is invoked.
IMO the best thing to do is to intercept __NR_mmap (together with other
syscalls) using seccomp-bpf.
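To make that concrete, here is a minimal sketch (my illustration, not ASan code) of a seccomp-bpf filter that traps `__NR_mmap` with `SECCOMP_RET_TRAP`, so a SIGSYS handler could observe every mmap no matter how it is issued; the architecture check and the handler itself are omitted for brevity:

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

static int install_mmap_trap(void)
{
    struct sock_filter filter[] = {
        /* Load the syscall number (a real filter would check the arch first). */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* If it is mmap, raise SIGSYS so a handler can inspect the call. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_mmap, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
        /* Everything else proceeds normally. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(filter) / sizeof(filter[0]),
        .filter = filter,
    };

    /* Required to install a filter without CAP_SYS_ADMIN. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```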
Original comment by ramosian.glider@gmail.com
on 29 Sep 2014 at 11:30
Yes, the pages are adjacent to each other, because the OS kernel tries to merge
mappings if possible, so that there is only 1 VMA internally:
(print patch)
```diff
--- a/page-access-beyond.c.kirr
+++ b/page-access-beyond.c
@@ -27,6 +27,9 @@ int main()
 	if (page2 == MAP_FAILED)
 		xabort("mmap2");
 
+	printf("p1: %p\n", page1);
+	printf("p2: %p\n", page2);
+
 	page1[0] = 0;
 	page1[pagesize-1] = 1;
```
(output)
```
p1: 0x7f05a5dd8000
p2: 0x7f05a5dd7000
```
and that's why ASan does not complain.
But the point here is that it is still an error, because the OS could put page1 and page2 anywhere: the mmaps specify addr=NULL, which means "map anywhere".
Merging them is just a kernel optimisation which does not work 100% of the time (e.g. if page1 was allocated in an address-space hole of only 1 page, there would be no chance for page2 to be allocated adjacent).
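As a standalone illustration (my sketch of the scenario, not the original page-access-beyond.c, which is not shown in full here), an out-of-bounds access can land silently in the neighbouring page whenever the kernel happens to place the two mappings back to back:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static void xabort(const char *msg) { perror(msg); abort(); }

int main(void)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);

    char *page1 = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page1 == MAP_FAILED)
        xabort("mmap1");
    char *page2 = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page2 == MAP_FAILED)
        xabort("mmap2");

    printf("p1: %p\n", (void *)page1);
    printf("p2: %p\n", (void *)page2);

    /* In the output above, page2 landed one page below page1, so writing
     * one byte past page2 silently hits page1 instead of faulting. */
    if (page2 + pagesize == page1)
        page2[pagesize] = 2;  /* out of bounds, lands in page1 */
    else
        page1[pagesize] = 2;  /* out of bounds; faults if nothing is mapped there */
    return 0;
}
```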
I think the solution here would be to intercept mmap one way or another; in the interceptor, first reserve a 2-pages-larger address-space region with
mmap(NULL, PROT_NONE, MAP_NORESERVE, len=orig_len + 2*pagesize)
and then perform the original mmap inside this region with MAP_FIXED.
This way there will be protective pages before and after the valid mapping, and detecting access beyond it would work.
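A rough sketch of that interceptor (an illustration under the assumptions above, not a proposed ASan patch; the name mmap_with_redzones is made up):

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: reserve PROT_NONE guard pages around a fresh mapping. */
static void *mmap_with_redzones(size_t len, int prot, int flags, int fd, off_t off)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);

    /* Reserve address space for the mapping plus one guard page on each side. */
    void *region = mmap(NULL, len + 2 * pagesize, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (region == MAP_FAILED)
        return MAP_FAILED;

    /* Place the real mapping inside the reservation; the first and last
     * page stay PROT_NONE and fault on any out-of-bounds access. */
    void *inner = mmap((char *)region + pagesize, len, prot,
                       flags | MAP_FIXED, fd, off);
    if (inner == MAP_FAILED) {
        munmap(region, len + 2 * pagesize);
        return MAP_FAILED;
    }
    return inner;
}
```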
Thanks again, Kirill
Original comment by `kirill.s...@gmail.com` on 29 Sep 2014 at 11:56
Tampering with mmap is risky.
If a user mmaps 32K small regions we will create 32K (or 64K?) redzone pages
and will exhaust the kernel limit on the number of mappings.
This can be done under a flag though -- a patch is welcome.
Most of the code we care about does not use mmap directly, so this
has never been a priority for us.
(glider, seccomp-bpf is a completely different story, let's not mix it here)
Original comment by konstant...@gmail.com
on 29 Sep 2014 at 9:15
Yeah, the limit on the number of mappings with different permissions appears to
be quite strict:
http://lxr.free-electrons.com/source/include/linux/sched/sysctl.h#L34 (that's
2^16-5).
IMO we'd better not change the behavior of mmap(), because there are a number of
corner cases where users don't want that. If we do that solely for libc
(i.e. if __NR_mmap remains unintercepted) we'll have all sorts of mixed
{mmap,munmap}_{with,without}_redzone() calls for the same pages that'll end up
corrupting something.
Original comment by ramosian.glider@gmail.com
on 30 Sep 2014 at 9:53
Thanks for commenting. I understand about priorities etc. Just a few notes about
the limits and other points:
I've measured my running firefox:
```
$ cat /proc/`pidof iceweasel`/maps | wc -l
852
```
For 850 mappings, adding 2 red-zone pages each would be 1700 pages = < 7MB of address
space and 1700 additional VMAs, which I think is rather low on 64-bit, given
that it is only address space, not memory. It also fits comfortably within the
default per-process VMA limit.
Even for all running processes _combined_ on my machine (x86_64, Debian
testing, a usual desktop), if such a combined process could run:
```
# cat /proc/*/maps |wc -l
15053
```
that would result in only < 120MB of address space for red-zone pages and
~30000 additional VMAs, which to me qualifies as low overhead on 64-bit and
also fits within the default per-process VMA limit.
So if a user writes a special program that creates a lot of mappings, he/she
should probably already be aware of the
/proc/sys/vm/max_map_count
knob, and _that_ is the flag to tune to run such a program with additional mmap
sanitization.
So in ASan I'd do it unconditionally.
> IMO we'd better not change the behavior of mmap(), because there's a number of corner cases when the users don't want that
Could you please provide an example?
To me it looks like, if mmap was called without MAP_FIXED, the kernel is free to place the mapping anywhere, so discovering corner cases where clients incorrectly rely on placement would only be good.
And if the original mmap was with MAP_FIXED, just don't add a red zone around it.
> If we do that solely for libc (i.e. if __NR_mmap remains unintercepted) we'll have all sorts of mixed {mmap,munmap}_{with,without}_redzone() calls for the same pages that'll end up corrupting something.
On munmap, see which region the address belongs to; if the original region (which could later be changed by a MAP_FIXED mapping into it, but anyway) was originally mmapped with a red zone, and we are unmapping adjacent to the red zone (on either side), just adjust the red zone.
And if the kernel's __NR_mmap is called with correct args directly, bypassing the interceptor, it will just unmap some region inside the original larger mapping with red zones; the red zones will stay, and since they are mapped with PROT_NONE, they will catch later incorrect read/write accesses to them. The only problem here is that red-zone mappings leak if unmap is called directly.
Wouldn't that work?
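For illustration, a hedged sketch of that munmap side (the registry and names are my invention; only the exact-match case is shown, and the red-zone adjustment for partial unmaps described above is left out). It assumes regions were recorded by a wrapper like the mmap_with_redzones sketch earlier:

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

struct redzoned_region {
    void *inner;                 /* start of the user-visible mapping */
    size_t inner_len;            /* its length                        */
    struct redzoned_region *next;
};

static struct redzoned_region *regions; /* unsynchronized, for brevity */

static int munmap_with_redzones(void *addr, size_t len)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);

    for (struct redzoned_region **pp = &regions; *pp; pp = &(*pp)->next) {
        struct redzoned_region *r = *pp;
        if (r->inner == addr && r->inner_len == len) {
            /* Exact match: drop the region together with both red zones. */
            *pp = r->next;
            int rc = munmap((char *)addr - pagesize, len + 2 * pagesize);
            free(r);
            return rc;
        }
    }
    /* Not one of ours, or a partial unmap: fall through to plain munmap;
     * partial unmaps would need the red-zone adjustment described above. */
    return munmap(addr, len);
}
```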
Original comment by `kirill.s...@gmail.com` on 30 Sep 2014 at 11:41
> I've measured my running firefox:
> $ cat /proc/`pidof iceweasel`/maps | wc -l
I think this stat is low exactly because the kernel consolidates adjacent
mappings. For example, Chrome calls mmap() 600+ times during the first seconds
of startup, and many of those calls would've been merged into a single region.
https://bugzilla.kernel.org/show_bug.cgi?id=67651 gives an idea of how the
number of mappings changes when adjacent mappings with similar permissions
aren't combined.
> So if a user creates special program which creates a lot of mappings he/she
> should probably be already aware of
> /proc/sys/vm/max_map_count
> knob and _that_ is the flag to tune to run such a program with additional
> mmap sanitizations.
Thanks, I wasn't aware of this. Indeed, sysctl vm.max_map_count=16777216 lets
the program create far more mappings (although this probably affects the
performance).
> Could you please provide an example?
MAP_FIXED was my first thought. I was also thinking about doing something like:
```
ptr = mmap(0, size, ...)
ptr2 = mmap(ptr + size, ...)
```
but this isn't guaranteed to work (i.e. it is one of the cases you want to look
for), so probably MAP_FIXED is the only interesting case.
> On munmap see to which region it belongs, and if original region (which could
> be later changed with MAP_FIXED mapping into it, but anyway) was originally
> mmaped with red-zone, and we are unmapping adjacent to red-zone (either part),
> just adjust the red zone.
Sounds good
> And if kernel __NR_mmap is called with correct args directly overcoming
> interceptor, it will just unmap some region inside original larger mapping with
> red-zones, so red-zones will stay, and since they are mapped with PROT_NONE,
> will catch later incorrect read/write access to them. The only problem here is
> that red-zone mapping are leaking if unmap is called directly.
We'll also need to remove the left part of the original larger mapping from
ASan's mappings registry so that further munmap() calls do not attempt to unmap
that part's redzone (I'm talking about the right redzone, although we can
possibly create both).
I now agree mmap() interception won't break much, although it may increase lookup
time in the kernel structures handling the mappings.
Original comment by ramosian.glider@gmail.com
on 30 Sep 2014 at 12:29
> >I've measured my running firefox:
> > $ cat /proc/`pidof iceweasel`/maps | wc -l
>
> I think this stat is low exactly because the kernel consolidates adjacent
> mappings. For example, Chrome calls mmap() 600+ times during the first
> seconds of startup, and many of those calls would've been merged into a
> single region.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=67651 gives an idea of how the
> number of mappings changes when adjacent mappings with similar permissions
> aren't combined.
Thanks for the pointers. Yes, merging VMAs is important for performance,
because the kernel does a binary-tree lookup `addr -> vma` on every
page fault, and minimizing #vma minimizes that tree and the lookup time.
I can't say about Chrome, but for Firefox and Chromium the number of mmap
_calls_ looks like this:
for Firefox:
```
$ mmaps() {
> grep mmap $1 |grep -v MAP_FIXED |grep -v "mmap resumed"
> }
$ strace -tt -f -o x.log iceweasel
# go to yandex.ru, google.ru, lwn.net; exit
$ mmaps x.log |wc -l
744
```
for Chromium:
```
# as root
$ strace -tt -f -o y.log chromium --user-data-dir=`pwd`/userdata
# go to yandex.ru, google.ru, lwn.net; exit
$ mmaps y.log |wc -l
3523
```
It looks like Chromium does more mmap calls, but let's look deeper.
```
# Chromium: N(mmap calls) distribution with child pids
$ mmaps y.log |awk '{print $1}' |sort |uniq -c | sort -nr | head -20
   1115 14292
    909 14297
    323 14317
    287 14324
    131 14365
    118 14346
     88 14321
     83 14342
     72 14389
     39 14349
     37 14401
     35 14309
     29 14313
     26 14327
     17 14316
     17 14306
     14 14415
     14 14398
     14 14376
      9 14407
```
A-ha, N(mmap calls) per process is not that high. Judging by the pids, there are 1
or 2 processes which are maybe some managers (the first two pids), and the others
are workers with lower N(mmap calls).
Anyway, N(mmap calls) is limited to roughly < 1000 per process, which I think is
rather low and of the same order as the resulting number of mappings. I.e. my
point about the low mapping count also stands for the low number of mmap calls.
For completeness, here is the same distribution for Firefox:
```
# Firefox: N(mmaps) distribution with child pids
$ mmaps x.log |awk '{print $1}' |sort |uniq -c | sort -nr | head -20
    517 18401
     38 18436
     31 18406
     31 18403
     29 18453
     29 18452
     11 18421
      9 18425
      7 18446
      7 18419
      6 18467
      5 18424
      4 18431
      3 18443
      3 18437
      3 18428
      3 18422
      2 18450
      2 18402
      1 18427
```
> >So if a user creates special program which creates a lot of mappings
> >he/she should probably be already aware of
> > /proc/sys/vm/max_map_count
> >knob and _that_ is the flag to tune to run such a program with additional
> >mmap sanitizations.
>
> Thanks, I wasn't aware of this. Indeed, sysctl vm.max_map_count=16777216
> lets the program create far more mappings (although this probably affects
> the performance).
Yes, as said above, increasing N(mmaps) increases page-fault lookup time and maybe
something else.
Please also note that if we create red zones before and after each region, then
for two such larger regions with red zones, chances are high the kernel will
allocate them adjacent, i.e.

    red-before1 region1 red-after1 red-before2 region2 red-after2

so adjacent {after,before} pairs will be merged into one PROT_NONE mapping by
the kernel.
That means the overhead will be ~1 PROT_NONE mapping per real mmap without
MAP_FIXED, not 2 PROT_NONE mappings per mmap as I mistakenly thought before.
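A quick experiment (mine, not from the thread) to observe this: create two guard-padded regions in a row and dump /proc/self/maps; when the kernel places them back to back, the trailing and leading PROT_NONE red zones appear as a single merged mapping:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);

    for (int i = 0; i < 2; i++) {
        /* Reserve guard + payload + guard as PROT_NONE, then open up the
         * middle page, leaving one guard page on each side. */
        char *region = mmap(NULL, 3 * pagesize, PROT_NONE,
                            MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (region == MAP_FAILED)
            return 1;
        if (mprotect(region + pagesize, pagesize, PROT_READ | PROT_WRITE))
            return 1;
    }

    /* If the two regions ended up adjacent, the PROT_NONE guard between
     * them is reported as one merged "---p" line, not two. */
    char cmd[64];
    snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", getpid());
    return system(cmd);
}
```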
> >Could you please provide an example?
> MAP_FIXED was my first thought. I was also thinking about doing something
> like:
> ptr = mmap(0, size, ...)
> ptr2 = mmap(ptr + size, ...)
>
> , but this isn't guaranteed to work (i.e. is one of the cases you want to
> look for), so probably MAP_FIXED is the only interesting case.
And we do not add red zones to MAP_FIXED mappings for this reason.
> >On munmap see to which region it belongs, and if original region (which
> >could be later changed with MAP_FIXED mapping into it, but anyway) was
> >originally mmaped with red-zone, and we are unmapping adjacent to red-zone
> >(either part), just adjust the red zone.
> Sounds good
>
> >And if kernel __NR_mmap is called with correct args directly overcoming
> >interceptor, it will just unmap some region inside original larger mapping
> >with red-zones, so red-zones will stay, and since they are mapped with
> >PROT_NONE, will catch later incorrect read/write access to them. The only
> >problem here is that red-zone mapping are leaking if unmap is called
> >directly.
> We'll also need to remove the left part of the original larger mapping from
> ASan's mappings registry so that further munmap() calls do not attempt to
> unmap that part's redzone (I'm talking about the right redzone, although we
> can possibly create both).
>
> I now agree mmap() interception won't break much, although may increase
> lookup time in the kernel structures handling the mappings.
That was my point. And the increase, if it is 1 protective VMA per real
mapping, should IMHO not be high.
Thanks,
Kirill
Original comment by kirill.s...@gmail.com
on 30 Sep 2014 at 3:08
I'll be glad to review a patch that adds redzones around pages mapped with
mmap(NULL, ...) under a flag. We can decide whether it's worth having it by
default once we try it on some big apps.
Original comment by ramosian.glider@gmail.com
on 30 Sep 2014 at 3:35
I see, thanks.
Original comment by kirill.s...@gmail.com
on 30 Sep 2014 at 6:40
Original issue reported on code.google.com by
kirill.s...@gmail.com
on 29 Sep 2014 at 8:24