rust3ds / pthread-3ds

PThread implementation for Nintendo 3DS Horizon OS targets. Keep in mind that Horizon OS uses a cooperative, and not preemptive, threading model.
Apache License 2.0

Implement TLS Destructors #28

Closed Meziu closed 4 months ago

Meziu commented 1 year ago

Closes #19

This PR aims to completely implement TLS destructors.

WIP Issues

While the small changes I made seem to correctly yield control to the run_dtors function in std, some internal issues compromise the TLS system.

The first Key (which is always the THREAD_INFO key) has a properly set destructor which runs to completion without any problems. However, in my testing environment, the Key right after has a corrupted pointer for its destructor (and sometimes, value).

Correctly run dtor (seen via gdb): image

Incorrectly run dtor: image

After a couple of re-runs, I've noticed that the __malloc_av function (with some sort of offset) often appears as the dtor, and that the ptr and dtor values are often equal, though this behaviour is most likely the result of some other problem.

It's been hard to understand this issue, since it appears within std's TLS system. Is there, perhaps, something wrong with our TLS var creation?


I may not be able to help with the whole toolchain in the following month, so I'm trying to put out as much info as I can gather, even when it's not much 😅.

AzureMarker commented 1 year ago

Looks good code-wise!

Meziu commented 4 months ago

Well, there have been some changes since the last time I checked this issue.

Running my tests on the latest nightly, using the implementation in this PR, I can say it no longer triggers an ARM panic (in debug mode, at least). Instead, it fatally panics on this debug assert, where it checks the "state" of the destructor. It seems to run the same dtor twice, since the failing assert has a left value of 2 (which corresponds to a dtor that has already run).

The error log is:

thread <unnamed> panicked at .../std/src/thread/mod.rs:700:1:
assertion 'left == right' failed
left: 2
right: 1

fatal runtime error: thread local panicked on drop

And this is my testing example:

use ctru::prelude::*;

thread_local! { static FOO: Foo = Foo; }

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("This came from a destructor!");
    }
}

fn main() {
    let apt = Apt::new().unwrap();
    let mut hid = Hid::new().unwrap();
    let gfx = Gfx::new().unwrap();
    let _top_screen = Console::new(gfx.top_screen.borrow_mut());

    println!("Running a new thread with thread-local variables...");

    std::thread::spawn(|| {
        // FOO must be accessed for its destructor to be registered (TLS vars are lazy).
        FOO.with(|v| {
            println!(
                "FOO exists in the new thread, here's its size: {}",
                std::mem::size_of_val(v)
            )
        });
    });

    println!("\x1b[29;16HPress Start to exit");

    while apt.main_loop() {
        gfx.wait_for_vblank();

        hid.scan_input();
        if hid.keys_down().contains(KeyPad::START) {
            break;
        }
    }
}

I also seem to have found the reason why this double run happens. The destructors are called at the end of the thread by pthread_3ds::thread_keys::run_local_destructors(), which (since only one destructor is ever registered) runs the function std::sys_common::thread_local_dtor::register_dtor_fallback::run_dtors().
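To make the thread-exit step above concrete, here is a minimal sketch of a per-thread key table that runs its destructors. It is not the pthread-3ds implementation: KeyTable, Dtor, set and record are hypothetical names, values are modelled as plain usize instead of raw pointers, and the method only borrows the name run_local_destructors. The one behaviour taken from POSIX is that a key's value is cleared before its destructor is called with the old value.

```rust
use std::collections::BTreeMap;

/// Hypothetical destructor type: receives the old slot value and a log to write to.
type Dtor = fn(usize, &mut Vec<String>);

/// Hypothetical per-thread key table: key -> (value, optional destructor).
/// A value of 0 stands in for a null pointer.
#[derive(Default)]
struct KeyTable {
    slots: BTreeMap<u32, (usize, Option<Dtor>)>,
}

impl KeyTable {
    /// Store a value (and optionally a destructor) under `key`.
    fn set(&mut self, key: u32, value: usize, dtor: Option<Dtor>) {
        self.slots.insert(key, (value, dtor));
    }

    /// Called once at thread exit: for every key with a non-null value,
    /// clear the value first, then call the destructor with the old value.
    fn run_local_destructors(&mut self, log: &mut Vec<String>) {
        for (_key, (value, dtor)) in self.slots.iter_mut() {
            let old = std::mem::replace(value, 0);
            if old != 0 {
                if let Some(d) = *dtor {
                    d(old, log);
                }
            }
        }
    }
}

/// Example destructor that only records that it ran.
fn record(value: usize, log: &mut Vec<String>) {
    log.push(format!("dtor({value})"));
}
```

In this model, calling run_local_destructors a second time on the same table is harmless, because every value has already been cleared. A double run therefore has to come from somewhere other than the key table itself, which is consistent with the observation that only one destructor (the std list runner) is registered here.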

First iteration: image

The first thing this function does is run the list of destructors which pthread_3ds passed as input (by which I mean the only TLS var registered is that list). The function runs these destructors fine, in this case calling std::thread::CURRENT::__getit::destroy. That works the first time; what it does next is the problem.

Next, the function runs the destructors it has kept track of separately in the DTORS static variable. It just so happens that these are the exact same destructors we have just finished running in the first iteration, and thus everything falls apart once the loop starts again.

Second iteration: image

In this image you can see how the new set of dtors to run in the iteration starts again at std::thread::CURRENT::__getit::destroy.
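The two iterations described above can be modelled in a few lines. This is a simplified sketch of the behaviour as observed in gdb, not std's actual code: Slot, destroy, and the passed/tracked lists are hypothetical stand-ins, and the state values 1 and 2 are taken from the assertion message in the error log (2 being "dtor already run").

```rust
/// State values as they appear in the failing assertion (assumed meaning).
const REGISTERED: u8 = 1;
const ALREADY_RUN: u8 = 2;

/// Hypothetical stand-in for a thread-local with a destructor state.
struct Slot {
    state: u8,
}

/// Models the destructor: it expects to run exactly once.
/// Returns an error instead of panicking so the failure can be inspected.
fn destroy(slot: &mut Slot) -> Result<(), String> {
    if slot.state != REGISTERED {
        return Err(format!(
            "assertion failed: left: {}, right: {}",
            slot.state, REGISTERED
        ));
    }
    slot.state = ALREADY_RUN;
    Ok(())
}

/// Models the loop: first run the list passed in as the argument, then keep
/// running whatever the separately tracked list holds until it is empty.
/// Lists hold indices into `slots`.
fn run_dtors(
    passed: Vec<usize>,
    tracked: &mut Vec<usize>,
    slots: &mut [Slot],
) -> Result<(), String> {
    let mut list = passed;
    loop {
        for idx in list {
            destroy(&mut slots[idx])?;
        }
        if tracked.is_empty() {
            return Ok(());
        }
        // Take the tracked list and loop again.
        list = std::mem::take(tracked);
    }
}
```

If passed and tracked both contain the same index, the second pass hits a slot whose state is already 2 and the check fails, matching the left: 2, right: 1 message. If tracked is empty, the function returns cleanly after the first pass.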


How to fix

I don't know (?). What is least clear to me is why DTORS and the list kept in pthread_3ds are the same. I am unsure whether this is actually supposed to be the case or not. By just skipping the next line with gdb, the program ran fine and everything was properly dropped, so I'm pretty puzzled.

I'll be researching more at a later date, but I think I'm pretty close (if you have any ideas about the inner workings of the std TLS code, please tell me). I wish I could just avoid std::thread::CURRENT...

Meziu commented 4 months ago

The damned VSCode git extension pushed to master like it's nothing...