switchbrew / libnx

Library for Switch Homebrew
https://switchbrew.github.io/libnx/
ISC License
1.26k stars 167 forks source link

tmem: Add tmemCloseHandle(), tmemWaitForPermission(), use those functions in nv.c to fix a race condition #606

Closed masagrator closed 1 year ago

masagrator commented 1 year ago

When TransferMemory has different permission than RW, we need to check if src_addr permission was restored to original state as we currently don't have any assurance this happened when __libnx_free is called.

This is fixing an issue with Tesla overlays crashing randomly (mainly in docked mode) related to this bug.

Thanks to ~SciresM for providing an idea to solve this and ~HE from ReSwitched for help with shaping out code.

fincs commented 1 year ago

Since this is a problem caused by a race condition bug in nvservices (not libnx's fault), it seems a bad idea to special case the workaround in tmemClose. An alternative solution would be to introduce a new tmemWaitForPermission function that would wait for a given tmem object to match the specified permission mask/value (as parameters); and use that in nv.c just before calling tmemClose.

masagrator commented 1 year ago

This cannot be done before tmemUnmap is used, otherwise it will stuck in infinite loop. Confirmed via tests.

My proposition is introducing tmemWaitForPermissionAndClose that would do what currently my PR is doing, but as separate function.

fincs commented 1 year ago

nvservices' tmem object is never mapped into the user process - tmemMap is never called; in fact the tmem is created as Perm_None, so it is not possible to map it into the user process at all.

masagrator commented 1 year ago

Then maybe it's a handle that must be closed first? If not, then I have no idea why it's not working.

My fork I was testing https://github.com/switchbrew/libnx/compare/master...masagrator:libnx:patch-2

yellows8 commented 1 year ago

Impl the tmemWaitForPermissionAndClose you mentioned.

Then maybe it's a handle that must be closed first? If not, then I have no idea why it's not working. <- The handle must be closed for that to do anything.

yellows8 commented 1 year ago

Actually you can keep tmemWaitForPermission if tmemCloseHandle is added. tmemCloseHandle would have the handle-closing code from tmemClose, and you'd call this following the _nvCmdInitialize call.

masagrator commented 1 year ago

Checked this in overlays with 2 people and it works fine. I guess this will need grammar fixes as I'm not great at this.

yellows8 commented 1 year ago

Replace the duplicate code in tmemClose with a call to tmemCloseHandle.

ghost commented 1 year ago

After this changes I started to have rare hangs on framebuferClose call. Home button works and exits to homescren.

masagrator commented 1 year ago

If such hang happens, you should attach GDB if it would allow it and check if it's not stuck in loop. For example in tmemWaitForPermission(). But if this is true, then it would mean that without this function your app would crash with 0x2A8 error code.

ghost commented 1 year ago

I checked as you say - it do wait in the tmemWaitForPermission(). For test I commented out that call and now have neither hang, nor crash. Am I right that rw permission needs for free() call?

masagrator commented 1 year ago

I'm not sure if only RW or RW and R since this is something platform specific.

You can try to write Perm_R instead of Perm_Rw and see what will happen, it should catch any Read permission.