rust3ds / pthread-3ds

PThread implementation for Nintendo 3DS Horizon OS targets. Keep in mind that Horizon OS uses a cooperative, and not preemptive, threading model.
Apache License 2.0
12 stars 7 forks source link

Update to upstream pthread #27

Closed Meziu closed 2 weeks ago

Meziu commented 1 year ago

Closes #26

This is an update to base our implementation on top of the upstread pthread that has been recently added. Most calls have been either removed (if they were fully re-implemented by newlib, like those for Mutexes and Condvars) or have been morphed into "syscalls" (as they call them), which are downstream implementations of the pthread functions.

There are a few things to note here:

The last part is absolutely horrendous, since it would mean a BIG step back on our progress to "normalize" priority and affinity. However, we can't ignore the massive importance of these new impls, and that probably libctru itself will one day support them directly.

As such, I made this PR to show you what a "new" pthread-3ds may look like, and to discuss its future.

I'll start the discussion on 2 questions:

  1. If we can set the priority when a thread is already running, maybe we don't need to pass the priority during creation?
  2. Since, to support cpu affinity (especially on the sys_core), we would need a custom thread creation process regardless, can we simply avoid passing the special arguments to pthread-3ds and instead use a different solution? (something like a temporary thread local variable which we can set from ctru-rs and read in pthread-3ds).

I know these solutions are pretty ugly and unstable, but I honestly don't think we can stop the changes made in newlib or libctru simply because "we are against that". There is progress being made on those fronts and perhaps we should just adapt. However, we should try to PR some changes regarding to “sched_yield” and “thread_create”, because as it stands we just lose functionality.

Edit: This PR has been thoroughly tested with std::thread::spawn and std::sync::Mutex, so I am 100% certain it works. However, it requires the latest dependencies from dkARM and this ctru-sys.

ian-h-chamberlain commented 1 year ago

The new __syscall_thread_create function CANNOT receive priority or cpu affinity as its parameters, since it doesn't receive a pthread_attr_t as argument (very infurianting)

Ugh, this is really unfortunate, especially since the pthread_create etc. are not weakly linked so we can't even override those implementations. I wonder if we can set priority / affinity just after thread creation instead? One of the comments on the new proposal I opened suggested an API like that on the Builder, so maybe that would be a good direction for us to go, if we can't upstream changes to newlib?

I am glad we will have less to maintain here in favor of the "canonical" upstream implementation, but the loss of some functionality is definitely unfortunate. I think we'd also need to implement stuff like pthread_setschedparam (as opposed to pthread_attr_*) pthread_setaffinity_np for the above to work as well.

Edit: closed PR by mistake

Meziu commented 1 year ago

I think we'd also need to implement stuff like pthread_setschedparam (as opposed to pthread_attr_*) pthread_setaffinity_np for the above to work as well.

  1. Is there any way to test if we can actually change the priority during runtime? We already have an implementation of pthread_setschedparam, but I remember that the OS specifics (around svcSetThreadPriority) are still a bit undocumented.
  2. The same goes for svcSetThreadIdealProcessor: does it do anything?

If we are sure we're able to set priority and ideal processor after thread creation, it wouldn't hurt as much to just leave to set afterwards, like most Thread implementations do.

I could try to run some tests, but I'm very low on time.

ian-h-chamberlain commented 1 year ago

Is there any way to test if we can actually change the priority during runtime? We already have an implementation of pthread_setschedparam, but I remember that the OS specifics (around svcSetThreadPriority) are still a bit undocumented. ... I could try to run some tests, but I'm very low on time.

I can try to do some testing today, and see what all the various functions actually do. My guess is that those functions wouldn't exist if they didn't do something, but it's also possible they are just backports or something from Switch (or they only work in kernelspace and not userspace, require privileged app etc).

ian-h-chamberlain commented 1 year ago

Ok, got some preliminary results based on the thread-info example we already had. It looks like (on my New 3DS XL anyway) the affinity and processor ID calls are not supported, but priority setting is:

test app setting all the thread properties ``` FIRM version: 11.16.0-49U main thread ID: 0x16df main thread running on processor: 0 main thread ideal processor: 0 ctru_sys::svcSetThreadIdealProcessor error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented main thread running on processor: 0 main thread ideal processor: 0 main thread affinity: 0b0001 ctru_sys::svcSetThreadAffinityMask error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented main thread affinity: 0b0001 main thread priority: 0x30 ctru_sys::svcSetThreadPriority success main thread priority: 0x2f ctru_sys::svcSetThreadPriority success Rust thread ID: 0x16e4 Rust thread running on processor: 0 Rust thread ideal processor: 0 ctru_sys::svcSetThreadIdealProcessor error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented Rust thread running on processor: 0 Rust thread ideal processor: 0 Rust thread affinity: 0b0001 ctru_sys::svcSetThreadAffinityMask error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented Rust thread affinity: 0b0001 Rust thread priority: 0x30 ctru_sys::svcSetThreadPriority success Rust thread priority: 0x2f Rust thread exited libctru thread ID: 0x16e5 libctru thread running on processor: 0 libctru thread ideal processor: -1 ctru_sys::svcSetThreadIdealProcessor error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented libctru thread running on processor: 0 libctru thread ideal processor: -1 libctru thread affinity: 0b1111 ctru_sys::svcSetThreadAffinityMask error: libctru result code 0xF8C007F4: [fatal kernel] not_supported: not_implemented libctru thread affinity: 0b1111 libctru thread priority: 0x2b ctru_sys::svcSetThreadPriority success libctru thread priority: 0x2f libctru thread exited ```

(I spawned the libctru thread using ctru_sys::threadCreate just to show that its affinity / ideal processor could be set).

I found some references indicating that this has probably been stubbed out for a long time in the kernel: https://www.3dbrew.org/wiki/SVC#System_calls https://www.3dbrew.org/wiki/8.0.0-18

So, this leaves us in a tough position. I guess the best way would probably be to ask for some kind of parameter support in the __syscall_thread_create upstream (maybe by passing _attr directly instead of separate stack size?). This would most likely invalidate the switchbrew code that seems to be the only user of this syscall https://github.com/switchbrew/libnx/blob/master/nx/source/runtime/newlib.c#L172 so getting that support might be tricky.

Edit: actually, since switchbrew is devkitA64 maybe it doesn't matter as much... I believe the patch may have originated from there but devkitARM could in theory have different signatures for pthread support. Still, if they are using the same sources it might become a pain to maintain different version (although some well placed #defines might solve the issue).

As far as the rust-lang thread proposal is concerned... we could probably try to continue down the road of a callback-type API to set stuff just after thread creation (instead of in the thread syscall itself), but obviously that would only help us with priority. I'm also not sure how well this type of stuff translates to other OSes, but maybe the callback type API is more flexible for full blown operating systems like Linux etc.

Meziu commented 1 year ago

Yeah, after some rudimentary tests I got to the same conclusions. Priority can and should be handled by std. The callback model seems to work fine so that's not a concern.

About the processorId: the first thing we should do is try to contact (via an issue or whatever) dkArm. If we manage to change the syscall signature we are back on having our original functionality. Otherwise, we should look in ways to circumvent the walls via ctru-rs, but that's our last solution as it would involve some pretty ugly and unstable code.

Regardless, I wasn't able to find the changes to pthread in their devkitARM branch, did they publish them anywhere?

ian-h-chamberlain commented 1 year ago

Regardless, I wasn't able to find the changes to pthread in their devkitARM branch, did they publish them anywhere?

I have been looking into that too, the only place I can find existence of it is https://github.com/devkitPro/buildscripts/blob/cd5e224d6a14f7d32712ab10cfc08e0c6a2daea3/dkarm-eabi/patches/newlib-4.3.0.20230120.patch so far, but perhaps the intention is deliver the same patch to https://github.com/devkitPro/newlib/tree/devkitARM eventually (I see an older commit from when they updated to 4.2.0, and the patch to buildscripts is newer than any commit on the newlib/devkitARM branch).

I think probably what we should do is open an issue against https://github.com/devkitPro/newlib, asking what the general status / expectations of pthreads on 3DS/devkitARM are, without asking for anything just yet. Depending on how that goes we can try to figure out the best path forward, but I don't want to rush into asking for features from the devkitPro team especially since we are trying to do some pretty nonstandard / unsupported stuff as far as they are concerned.

Meziu commented 1 year ago

I think probably what we should do is open an issue against https://github.com/devkitPro/newlib, asking what the general status / expectations of pthreads on 3DS/devkitARM are, without asking for anything just yet. Depending on how that goes we can try to figure out the best path forward, but I don't want to rush into asking for features from the devkitPro team especially since we are trying to do some pretty nonstandard / unsupported stuff as far as they are concerned.

Can I leave the diplomacy to you? You seem pretty experienced on this side of things, lol.

ian-h-chamberlain commented 1 year ago

Can I leave the diplomacy to you? You seem pretty experienced on this side of things, lol.

Haha, sure. I don't know that I'm so experienced, just know they have discouraged people from trying to use unsupported languages etc. in the past. I'll try to write something up in the next few days.

Meziu commented 1 year ago

@ian-h-chamberlain Well, it seems like this issue got where we predicted. I've just noticed that @fincs has officially started supporting the new signature in libctru via this commit https://github.com/devkitPro/libctru/commit/e9fa000e941398838490ebeda032b99a23d23c47. It hasn't made its way into a release yet, but it's only a matter of days.

In their implementation the cpu-id issue got "ignored" (obviously, there isn't much to do about it), and instead they just default to spawning threads on the main core. Those changes also invalidate the need for even more functions which are still present in this PR, so my guess is that this repo as a whole will end up exactly like shim-3ds.

From one point of view it's great, we need to maintain less code thanks to the advancements in the upstream toolchain, which is exactly what I wanted when I wrote this crate. However, unlike shim-3ds, where we basically got a free pass on the whole crate, I feel like pthread wasn't handled in the best way possible. I immensely respect the effort put into the whole project, but I believe the merge between dkA64 and dkARM made the support for the 3DS a tiny bit less precise. I just wished there was a way to pass pthread_attr to the underlying implementation (in a standard way).

At the end of the day, I can't really complain. Our best bet for now is to just adapt and stabilize our tools as much as possible. I believe we should port what remains of this repo over to ctru-rs and maybe work on system-specific functionality in there as well.

AzureMarker commented 1 year ago

Maybe I missed a thread, but I don't see much follow-up from our side besides this issue: https://github.com/devkitPro/newlib/issues/26

We can reply again and ask for this feature request to be considered, and/or open an PR ourselves.

Meziu commented 1 year ago

We can reply again and ask for this feature request to be considered, and/or open an PR ourselves.

I was thinking about re-discussing the issue in that thread/possible PRs and to generally work on it, but there is some complexity regarding the cross-support with Nintendo Switch code (which doesn't need our changes but would get impacted by them) and the newly added implementations in libctru which is making me desist from getting involved. It's still a very dynamic situation and I'm personally prioritizing working on our packages while I can (stuff such as the gazillion lines of documentation I'm currently writing for ctru-rs after the cargo-3ds release, in the hopes of closing https://github.com/rust3ds/ctru-rs/issues/32).

fincs commented 1 year ago

Fwiw, the current impl in libctru is still experimental/unfinished, do not expect it to ship in the near future. As mentioned previously, specifying custom thread priorities/affinity masks is currently not supported. The 3DS's version of the Horizon kernel has some limitations that are not a good match for the pthreads interface, and do not play nice with typical pthread-using code that expects a mainstream Unix-like kernel. For example:

With this said, we are nonetheless open to improving the configurability of standard threading APIs in order to better exploit Horizon's threading system and facilitate compatibility with ported software. Any suggestions or proposals are of course welcome.

ian-h-chamberlain commented 10 months ago

As an update, it looks like an initial implementation has landed in libctru: https://github.com/devkitPro/libctru/releases/tag/v2.3.0

It might be a good time to revive this work — we could probably eliminate even more of this library now that the syscalls are implemented upstream in https://github.com/devkitPro/libctru/blob/master/libctru/source/system/syscalls.c

Meziu commented 10 months ago

Alright, it happened. I will look into the changes once I get a break from university exams. Thanks for the heads up @ian-h-chamberlain.

Meziu commented 2 weeks ago

After looking into it a bit more, I've chosen not to merge the changes in the current state, since we are able to avoid using libctru's definitions (which still lack functionality compared to our bindings).

If we ever need to use them in the future, we should work on those upstream C definitions to regain compatibility, rather than change stuff here.