Open mato opened 5 years ago
The warning for "if a PT_LOAD PHDR requests both PF_X and PF_W." was because of IncludeOS which creates a single XWR segment (at least when using the solo5 platform). Don't know if that is still happening. In any case, I think the IncludeOS guys are interested in W^X code, so this is worth fixing on that side.
@alfred-bratterud ^
@mato OpenBSD will not allow an mprotect call with both write and execute enabled, W^X has been enforced at OS level since September 2016. I initially hit this in porting efforts.
See OpenBSD Innovations W^X
@mato OpenBSD will not allow an mprotect call with both write and execute enabled, W^X has been enforced at OS level since September 2016. I initially hit this in porting efforts.
@adamsteen I know that. What I don't know is, does this subsequent
mprotect()
for a PHDR with PF_X | PF_R
set (i.e. .text
)
https://github.com/mato/solo5/blob/enforce-nox/tenders/hvt/hvt_elf.c#L215
called on the guest memory range initially set up at
https://github.com/mato/solo5/blob/enforce-nox/tenders/hvt/hvt_openbsd.c#L117
have the intended effect on the underlying EPT mapping actually used by vmm
to back that memory, i.e. disallowing PROT_WRITE
. I'm guessing it does as
otherwise the OpenBSD port would probably not work at all since the initial
mapping does not include PROT_EXEC
, but given that the FreeBSD vmm has a
separate interface for this (VM_MMAP_MEMSEG) I'm not 100% sure.
To confirm, could you build the branch at
https://github.com/mato/solo5/tree/enforce-nox, manually run the test_nox
case, and post the output? While you're at it, can you confirm that all the
other new tests there pass?
@mato output from test_nox run manually.
doas ./solo5-hvt test_nox.hvt
| ___|
__| _ \ | _ \ __ \
\__ \ ( | | ( | ) |
____/\___/ _|\___/____/
Solo5: Memory map: 512 MB addressable:
Solo5: reserved @ (0x0 - 0xfffff)
Solo5: text @ (0x100000 - 0x104fff)
Solo5: rodata @ (0x105000 - 0x105fff)
Solo5: data @ (0x106000 - 0x10afff)
Solo5: heap >= 0x10b000 < stack < 0x20000000
**** Solo5 standalone test_nox ****
Solo5: solo5_exit(1) called
All the other tests succeed when run with doas tests/run-tests.sh
@adamsteen:
Thanks. So it looks like vmm is not updating EPT to match the prot
from mprotect()
. You know the OpenBSD vmm folks, could you please ping them and and ask how to get this behaviour (or equivalent)? Pointing them to https://github.com/Solo5/solo5/issues/303#issuecomment-446503933 or this issue in general should be enough for them to understand what we need.
I've been experimenting on FreeBSD vmm and have yet to figure out how to get the intended behaviour with VM_MMAP_MEMSEG
. It looks like there is no straightforward solution, will have to ask some of the FreeBSD vmm folks as well (cc @hannesm, who/where? freebsd-virtualization@
? or, if you have a direct contact please let me know privately).
All the other tests succeed when run with doas tests/run-tests.sh
That's odd, I would have expected test_ssp
to fail, since #294 does not enable it on OpenBSD toolchains. What does that say when run manually?
Edit: Updated description with summary of status across back-ends and more information.
@mato yeah test_ssp fails with an abort.
| ___|
__| _ \ | _ \ __ \
\__ \ ( | | ( | ) |
____/\___/ _|\___/____/
Solo5: Memory map: 512 MB addressable:
Solo5: reserved @ (0x0 - 0xfffff)
Solo5: text @ (0x100000 - 0x104fff)
Solo5: rodata @ (0x105000 - 0x105fff)
Solo5: data @ (0x106000 - 0x10afff)
Solo5: heap >= 0x10b000 < stack < 0x20000000
**** Solo5 standalone test_ssp ****
1234567890123456789012345678901234567890
Solo5: trap: type=#UD ec=0x0 rip=0x10000a rsp=0x1fffffe0 rflags=0x10006
Solo5: ABORT: cpu_x86_64.c:171: Fatal trap
Regarding the OpenBSD vmm(4) part, I'll comment that there are actually 2 separate views of the VM's memory. One is maintained by the userland part (vmd(8) in base OpenBSD or whatever you're using in Solo5 in userland to setup and launch the VM). The userland view is what receives mmap/mprotect updates. The maps involved are shared internally by a UVM API called uvm_share which handles mapping the same pages into the EPT view as well as the userland part.
Digging around in the depths of UVM isn't for the faint of heart; what I proposed on the OpenBSD misc@ mailing list to @adamsteen was to implement a new API that does an mprotect on the EPT side via an ioctl to vmm(4). Would that work? We could make the API flexible (and likely make it just look like mprotect to some degree). It would be two calls though, one "regular" mprotect and one "ept-mprotect", so there would be a short time when the maps might be out of sync.
Thoughts?
@mato a note on OpenBSD and test_ssp, OpenBSD switched to using lld (LLVM/Clang 6.0.0) as of 2018-10-22 so its in the same boat as FreeBSD with regard to -mstack-protector-guard=global
@mlarkin2015:
whatever you're using in Solo5 in userland to setup and launch the VM
Just FYI, we don't use vmd(8), the userland part is the "solo5-hvt" tender which @adamsteen ported to use the OpenBSD vmm APIs directly. However, we (well, I, when reviewing) just assumed that the userland mprotect() was sufficient to update the EPT mappings.
what I proposed on the OpenBSD misc@ mailing list to @adamsteen was to implement a new API that does an mprotect on the EPT side via an ioctl to vmm(4). Would that work? We could make the API flexible (and likely make it just look like mprotect to some degree).
That would work for us. Having it as a separate ioctl means we can test for its presence at run time, and print a warning but still run on older systems without this feature until some time after an OpenBSD with it has been released. I'll also try and coordinate something similar/figure out the situation for our FreeBSD/vmm backend.
Regarding flexibility, a useful feature to expose in an "ept-mprotect" would be the x86 ability to have execute-only EPT mappings, if possible. This would allow us at some future point to enforce that guest code is execute-only, which is interesting from the PoV of improving defences against ROP attacks.
It would be two calls though, one "regular" mprotect and one "ept-mprotect", so there would be a short time when the maps might be out of sync.
This is not an issue for us, Solo5 imposes a "static" memory layout so we only need to set up the mappings once at initialisation time and never touch them again.
@adamsteen Moving the SSP discussion to #293 which I've re-opened.
@hannesm I've started a thread on freebsd-virtualization@
about this, archives here: https://marc.info/?l=freebsd-virtualization&m=154470008622609&w=2
With Mike Larkin's description here , i should be able to implement this over the break, i will report back when its done
@adamsteen Thanks, keep me up to date on how it goes. For the purposes of testing, you can use the test_nox at https://github.com/mato/solo5/tree/enforce-nox, and temporarily hack in the call to your vmm ioctl directly in hvt_elf_load()
.
In the mean time, I will address the other sub-points in this issue, and also work out some scaffolding to be able to call a hvt_mprotect()
or similar from the ELF loader.
@mato I will continue to investigate this issue #303 and #293 as i have time, life is very busy at the moment, but i should figure it all out in the end.
@mato looking back into this, your enforce-nox
branch (including the test) disappeared from mato/solo5 -- would be great to have the test in solo5 master imho (and maybe mark as failing on FreeBSD/OpenBSD atm)
This would be greatly appreciated, I have a patch half done in the works.
looking back here, I'm wondering why hvt_init
allocates (&mmaps) a single memory segment (with RWX protection)? an alternative I can think of (after talking wth some FreeBSD folks) would be to have for each ELF segment one vmm memory segment with the corresponding protection bits (the VM_MMAP_MEMSEG
ioctl / int vm_mmap_memseg
get prot
as argument). This would require some coordination (a function provided by the platform-specific code) used in hvt_elf_load
.
@hannesm Sorry about the dissapearance of that branch. Take a look at enforce-nox-v2 which I just pushed, this only includes the tests but does not hook them up yet.
Interestingly, things are a bit more subtle on KVM than I thought, on this branch I added test_wnox
that complements test_xnow
in the other direction, i.e. tests that .data
is not executable, and it turns out that KVM on my system is not enforcing this. So some more investigation is needed here.
looking back here, I'm wondering why hvt_init allocates (&mmaps) a single memory segment (with RWX protection)? an alternative I can think of (after talking wth some FreeBSD folks) would be to have for each ELF segment one vmm memory segment with the corresponding protection bits (the VM_MMAP_MEMSEG ioctl / int vm_mmap_memseg get prot as argument).
The reason there's a single segment is that when I originally wrote the FreeBSD implementation I assumed things work the same way as under KVM, i.e. MEMSEG is the equivalent of a KVM memslot, and host-side mprotect()
would be used to enforce the additional protections. At some later point I did try using multiple MEMSEGs but could not get it to work -- I've forgotten the details now, but it seemed that MEMSEGs were not intended for "fine-grained" protection boundaries such as this.
Regarding coordination with the ELF loader, my rough plan was to do something like this:
typedef void (*hvt_guest_mprotect_fn_t)(void *addr, size_t size, int prot);
void hvt_elf_load( const char *file, uint8_t *mem, size_t mem_size,
hvt_guest_mprotect_fn_t guest_mprotect_fn,
hvt_gpa_t *p_entry, hvt_gpa_t *p_end);
... and then have the OS-specific backend implement a hvt_guest_mprotect(...)
, pass it to hvt_elf_load()
when called and replace/augment the mprotect()
call in the loader with a call to the backend-specific function passed in by the caller.
@adamsteen Would the above work for you?
@mato Yeah that should work for me!
@hannesm I've merged the W^X tests (and enabled some combinations) in #363.
Regarding the KVM behaviour, it turns out that KVM does not support marking pages as NX from the host via mprotect(), so W^X is also only "partially" possible there. (Source: mailing list thread).
@mato with this commit, OpenBSD can support this correctly and completely.
This will be available in the next release of OpenBSD 6.7, which has just gone into beta, so in about a month.
ps OpenBSD from 6.6 has a really nice update feature, sysupgrade(8), making updating releases as simple as doas sysupgrade
@mato Looks like we can remove the OpenBSD
label from this one now, after #447 being merged.
Done. (Note to self: Update the status on this issue to make it clear where we're at with the various hvt/host combinations)
As part of general hardening, we should enforce both in the ELF loader and at the guest-physical to host-virtual translation layer that any executable code in the guest is not also writable (W^X).
Summary of status for hvt back-ends:
There are several sub-issues to this:
hvt_elf_load()
currently warns if aPT_LOAD
PHDR requests bothPF_X
andPF_W
. This should be changed to a fatal error instead.hvt_elf_load()
should enforce thatPT_LOAD
PHDRs cannot overlap, otherwise an attacker could construct a malicious ELF defeating the logic in the previous point. While overlappingPT_LOAD
PHDRs are not explicitly mentioned in the ELF specification, there is no reason to support this behaviour. This check can be further simplified by enforcing thatPT_LOAD
PHDRs must be sorted byp_vaddr
in the ELF file, which is within the specification.hvt_elf_load()
currently appliesPF_x
flags from aPT_LOAD
PHDR by callingmprotect()
on the (host-side) guest memory region. This results in the correct behaviour (enforcement) only on Linux/KVM which updates the guest-physical to host-virtual translation (i.e. EPT on x86_64) as expected. A solution will need to be found to get the same behaviour on FreeBSD/vmm and OpenBSD/vmm..text
as seen by the guest is not writable.