LunarG / gfxreconstruct

Graphics API Capture and Replay Tools for Reconstructing Graphics Application Behavior
https://vulkan.lunarg.com/doc/sdk/latest/linux/capture_tools.html
MIT License

userfaultfd async wp mode in page guard manager #1460

Open ishitatsuyuki opened 6 months ago

ishitatsuyuki commented 6 months ago

Congratulations on shipping userfaultfd support in the page guard manager!

It looks like overhead was noted as a concern when using the userfaultfd option. I would like to point out that a recently added "Async Write-Protect" mode for userfaultfd is intended to address change-tracking use cases like this.

The feature is documented at https://docs.kernel.org/admin-guide/mm/userfaultfd.html#write-protect-notifications, and is used in Valve's Wine fork (https://github.com/ValveSoftware/wine/blob/bleeding-edge/dlls/ntdll/unix/virtual.c) to emulate Win32's write watches. What gfxr is doing should be similar to write watches, so perhaps the Wine source can be used as a reference when implementing async wp support.

panos-lunarg commented 6 months ago

Hi @ishitatsuyuki

Thank you for taking the time to suggest possible improvements.

From what I understand, you are referring to UFFDIO_WRITEPROTECT_MODE_WP and the UFFD_FEATURE_PAGEFAULT_FLAG_WP feature flag. We are aware of it, but it has some limitations and turns out not to be very useful for our case.

The first reason is that this feature is relatively new and is not supported by all kernels currently found on Android phones (at least not on every phone that we have tested internally).

The second and more important reason is that it doesn't do exactly what we need. From what I understand, UFFDIO_WRITEPROTECT_MODE_WP should make it possible to track writes even to already allocated pages. Unfortunately, this is not enough in our case, as we also want to track reads, not only writes.

I will take a look at the source file you pointed to. Perhaps there is something clever there that we can also implement.

ishitatsuyuki commented 6 months ago

You are right that async WP doesn't provide any support for detecting read-only page faults.

My understanding is that async WP should eliminate the need for shadow memory and hence it will be unnecessary to track read-only faults, like how the D3D12 backend works when using GetWriteWatch. I haven't looked at the implementation yet, but please let me know in case the Vulkan backend works in some different way such that skipping tracking reads would not be viable.

panos-lunarg commented 6 months ago

To be honest we haven't tried skipping shadow memory completely with userfaultfd (or maybe we did but that was some time ago and I can't remember exactly). We have tried that with the mprotect mechanism (apply the mprotect + SIGSEGV trick directly on the mapped memory returned by the driver) but it turned out that it didn't work out well. IIRC there were random crashes. I guess it wouldn't hurt to try it with userfaultfd.
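For readers unfamiliar with the mprotect + SIGSEGV trick mentioned above, here is a minimal sketch of write tracking applied directly to a region. It assumes ordinary private anonymous memory rather than a driver mapping, tracks writes only, and handles a single region; every name in it (`track_region`, `g_dirty`, and so on) is hypothetical and not taken from GFXR.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_PAGES 64

/* Hypothetical single-region tracker state (illustrative only). */
static uint8_t *g_base;
static size_t g_len;
static size_t g_page_size;
static volatile sig_atomic_t g_dirty[MAX_PAGES]; /* one flag per page */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uint8_t *addr = (uint8_t *)info->si_addr;
    if (addr < g_base || addr >= g_base + g_len) {
        /* Not our region: restore the default action so the retried
         * access terminates the process as a genuine crash. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    size_t page = (size_t)(addr - g_base) / g_page_size;
    g_dirty[page] = 1;
    /* Unprotect the page; the faulting write is retried on return.
     * mprotect is not on the POSIX async-signal-safe list, but this
     * pattern is commonly relied upon on Linux. */
    mprotect(g_base + page * g_page_size, g_page_size, PROT_READ | PROT_WRITE);
}

/* Maps `pages` read-only pages and installs the handler.
 * Returns the region, or NULL on failure. */
static volatile uint8_t *track_region(size_t pages)
{
    if (pages == 0 || pages > MAX_PAGES)
        return NULL;
    g_page_size = (size_t)sysconf(_SC_PAGESIZE);
    g_len = pages * g_page_size;
    void *p = mmap(NULL, g_len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    g_base = p;
    for (size_t i = 0; i < MAX_PAGES; ++i)
        g_dirty[i] = 0;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        munmap(p, g_len);
        return NULL;
    }
    return (volatile uint8_t *)g_base;
}
```

After `track_region(2)`, a store to the second page faults once, sets `g_dirty[1]`, and then completes normally. Re-arming a page means calling `mprotect` with `PROT_READ` again after the dirty data has been consumed.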

panos-lunarg commented 4 months ago

I am afraid that this is not possible. If I understand the uffd documentation correctly, both UFFDIO_REGISTER_MODE_MISSING and UFFDIO_REGISTER_MODE_WP must be applied to private anonymous memory regions. The memory regions returned from the driver are not expected to meet this criterion. I ran some very brief tests without allocating shadow memory, registering the mapped memory directly with uffd, and the registration fails:

E gfxrecon: ioctl/uffdio_register: Invalid argument
E gfxrecon: uffdio_register.range.start: 0x7700769000
E gfxrecon: uffdio_register.range.len: 524288

Checking the /proc/pid/maps file for this region:

7700769000-77007e9000 rw-s 10e10e000 00:10 950                           /dev/dri/renderD128
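As an illustration of how such a line can be checked programmatically, the following sketch parses one /proc/pid/maps line and reports whether it describes a private anonymous mapping. The helper names (`parse_maps_line`, `is_private_anonymous`) are hypothetical, and treating only an empty path as anonymous is a simplification (bracketed pseudo-paths such as [heap] are ignored).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Fields of interest from one /proc/<pid>/maps line (illustrative). */
struct map_entry {
    unsigned long long start;
    unsigned long long end;
    char perms[5];  /* e.g. "rw-s": last char is 'p' (private) or 's' (shared) */
    char path[256]; /* empty when the mapping has no backing path */
};

/* Parses "start-end perms offset dev inode [path]".
 * Returns true if all mandatory fields were read. */
static bool parse_maps_line(const char *line, struct map_entry *out)
{
    unsigned long long offset;
    unsigned int dev_major, dev_minor;
    unsigned long inode;

    out->path[0] = '\0';
    int n = sscanf(line, "%llx-%llx %4s %llx %x:%x %lu %255[^\n]",
                   &out->start, &out->end, out->perms, &offset,
                   &dev_major, &dev_minor, &inode, out->path);
    return n >= 7; /* the path field is optional */
}

/* Simplified check: private ('p') and no backing path. */
static bool is_private_anonymous(const struct map_entry *e)
{
    return e->perms[3] == 'p' && e->path[0] == '\0';
}
```

For the line above, the parsed range is 524288 bytes long, the permissions are `rw-s` (shared), and the path is the DRM render node, so `is_private_anonymous` returns false.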

ishitatsuyuki commented 4 months ago

Unfortunately this is poorly documented, but any memory region is allowed if you use async WP. Could this help?

https://github.com/torvalds/linux/blob/6d69b6c12fce479fde7bc06f686212451688a102/include/linux/userfaultfd_k.h#L225

panos-lunarg commented 4 months ago

Interesting. This doesn't agree with the documentation found online. I only tried applying UFFDIO_REGISTER_MODE_MISSING, which is what is currently used. I'll look into it a bit more.

panos-lunarg commented 4 months ago

No, it's still failing. I'm initializing with UFFD_FEATURE_PAGEFAULT_FLAG_WP, registering only with UFFDIO_REGISTER_MODE_WP, and registrations fail with EINVAL. I tried this on both Android and desktop. Maybe I am missing something? Have you ever done something similar?

Edit: It looks like this also requires UFFD_FEATURE_WP_ASYNC, not just UFFD_FEATURE_PAGEFAULT_FLAG_WP. I can't find this flag even in the userfaultfd.h on Ubuntu with kernel 6.5. This looks like a cutting-edge feature introduced in kernel 6.7 that is not broadly available even on desktop.
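Whether a running kernel offers the feature can be probed at runtime instead of relying on the installed headers. The sketch below is an assumption about how such a probe could look, not GFXR code: the function name `uffd_supports_wp_async` is hypothetical, and the fallback value for `UFFD_FEATURE_WP_ASYNC` is the bit the kernel UAPI header assigns to it, supplied here only for older headers that lack the macro.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Older headers (e.g. those shipped with a 6.5 kernel) lack these macros. */
#ifndef UFFD_FEATURE_WP_ASYNC
#define UFFD_FEATURE_WP_ASYNC (1 << 15)
#endif
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/* Returns 1 if the kernel reports async WP support, 0 if it does not,
 * and -1 if userfaultfd itself is unavailable to this process. */
static int uffd_supports_wp_async(void)
{
    /* Try the unprivileged-friendly flag first, then fall back for
     * kernels that predate UFFD_USER_MODE_ONLY. */
    int fd = (int)syscall(SYS_userfaultfd,
                          O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
    if (fd < 0)
        fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* Requesting no features makes the kernel fill in the full set
     * of features it supports. */
    struct uffdio_api api = { .api = UFFD_API, .features = 0 };
    int rc = ioctl(fd, UFFDIO_API, &api);
    close(fd);
    if (rc != 0)
        return -1;

    return (api.features & UFFD_FEATURE_WP_ASYNC) ? 1 : 0;
}
```

A capture layer could call this once at startup and keep the existing shadow-memory path whenever the result is not 1.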

Edit 2: Note for future reference: UFFD_FEATURE_WP_ASYNC solves the problem in an asynchronous manner: the faults are not sent to the application as messages over the created fd to be resolved; instead, they are handled by the kernel. The user app can collect the “written/dirty” status by looking up the uffd-wp bit for the pages of interest in /proc/pid/pagemap. This basically boils down to two things:

Possible problems: Multithreaded applications might still cause problems if threads touch these pages while GFXR is parsing /proc/pid/pagemap to detect dirty pages.

Since there will be no need for shadow memory, it looks promising as a better and faster alternative for memory tracking on Linux and Android. But this should wait until 6.7 becomes standard on Android kernels.
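The pagemap lookup described in Edit 2 could be sketched roughly as follows. This is an untested assumption about the approach, not GFXR code: the helper names are hypothetical, bit 57 is the "pte is uffd-wp write-protected" flag from the kernel's pagemap documentation, and the result is only meaningful for pages that were previously write-protected through a userfaultfd registered with async WP.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

#define PM_UFFD_WP_BIT 57 /* pagemap: pte is uffd-wp write-protected */

/* True if the uffd-wp bit is set in a 64-bit pagemap entry. */
static bool pagemap_uffd_wp(uint64_t entry)
{
    return ((entry >> PM_UFFD_WP_BIT) & 1u) != 0;
}

/* Reads the pagemap entry covering `addr` in the current process.
 * Returns 0 on success, -1 on failure. */
static int read_pagemap_entry(const void *addr, uint64_t *entry)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    int fd = open("/proc/self/pagemap", O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    /* One 64-bit entry per virtual page, indexed by page number. */
    off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
                   (off_t)sizeof(uint64_t);
    ssize_t n = pread(fd, entry, sizeof *entry, offset);
    close(fd);
    return n == (ssize_t)sizeof *entry ? 0 : -1;
}

/* For a page that was write-protected under async WP: returns 1 if the
 * kernel has since cleared the bit (the page was written), 0 if it is
 * still protected, -1 on error. */
static int page_written_since_protect(const void *addr)
{
    uint64_t entry;
    if (read_pagemap_entry(addr, &entry) != 0)
        return -1;
    return pagemap_uffd_wp(entry) ? 0 : 1;
}
```

Opening the file once and reading a contiguous run of entries per mapped range would be cheaper than one `pread` per page. The race noted under "Possible problems" still applies between reading an entry and re-protecting the page.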