Clemapfel / jluna

Julia Wrapper for C++ with Focus on Safety, Elegance, and Ease of Use
https://clemens-cords.com/jluna
MIT License
239 stars 12 forks source link

Segfault when original Task object goes out of scope #28

Closed paulerikf closed 1 year ago

paulerikf commented 1 year ago

Hey Clem, sorry to bother you with another issue. Not sure if I'm missing something basic here, but struggling to get multithreading working.

Quick stats before getting into things:

Ok, so I've been working my way through the multithreading docs, and things were great until I started trying to manage Task lifetimes.

Tasks seem to terminate the moment the original task object goes out of scope, even when it's added to an std::vector which is still in scope. If I create a task with auto task = ThreadPool::create(f) then add task to a std::vector, the task terminates when task goes out of scope, despite the vector still being in scope. Even more immediate, when I directly add the task to the std::vector with tasks.push_back(ThreadPool::create(f)) the program segfaults the moment I attempt tasks.back().schedule().

Not sure if that explanation made any sense, here are a few examples to hopefully illustrate what I'm seeing.

Basic example

using namespace jluna;

int main() {
    initialize(8);

    std::function<void()> f = []() -> void {
        std::cout << "This is Thread: " << ThreadPool::thread_id << std::endl;
    };

    // naive spawn task (this works fine)
    auto task = ThreadPool::create(f);
    task.schedule();
    task.join();

    // vector for managing task lifetimes
    std::vector<Task<void>> tasks;

    // spawn task into vector
    tasks.push_back(ThreadPool::create(f));
    tasks.back().schedule();
    tasks.back().join();
}
[JULIA][LOG] initialization successful (8 thread(s)).
This is Thread: 1

signal (11): Segmentation fault
in expression starting at none:0
unknown function (ip: 0x7fd2e8479e40)
operator() at /home/frivold/Code/jluna/install/include/jluna/.src/multi_threading.inl:252
__invoke_impl<_jl_value_t*, jluna::ThreadPool::create<>(const std::function<void()>&)::<lambda()>&> at /usr/include/c++/11/bits/invoke.h:61
__invoke_r<_jl_value_t*, jluna::ThreadPool::create<>(const std::function<void()>&)::<lambda()>&> at /usr/include/c++/11/bits/invoke.h:114
_M_invoke at /usr/include/c++/11/bits/std_function.h:291
_ZNKSt8functionIFP11_jl_value_tvEEclEv at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
jluna_invoke_from_task at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
#3 at ./none:813
unknown function (ip: 0x7fd2eb6ef89f)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:877
Allocations: 1626159 (Pool: 1625207; Big: 952); GC: 1

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

Example with scope block

#include <jluna.hpp>

using namespace jluna;

int main() {
    initialize(8);
    auto println_jl = Main.safe_eval("return println");
    auto sleep_jl = Main.safe_eval("return sleep");

    std::function<void()> five_mississippi = [&]() -> void {
        for (int i=1; i<=5; i++) {
            println_jl(i, " Mississippi");
            sleep_jl(1);
        }
        println_jl("Done");
    };

    // vector for managing task lifetimes
    std::vector<Task<void>> tasks;

    // scope block
    {
        auto task = ThreadPool::create(five_mississippi);
        tasks.push_back(task);
        tasks.back().schedule();
        sleep_jl(2);
    }
    tasks.back().join();
}
[JULIA][LOG] initialization successful (8 thread(s)).
1 Mississippi
2 Mississippi

signal (11): Segmentation fault
in expression starting at none:0
jl_get_nth_field at /buildworker/worker/package_linux64/build/src/datatype.c:1392
_ZNK5jluna5Proxy10ProxyValue5valueEv at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
safe_call<int&, char const (&)[13]> at /home/frivold/Code/jluna/install/include/jluna/.src/proxy.inl:93
operator()<int&, char const (&)[13]> at /home/frivold/Code/jluna/install/include/jluna/.src/proxy.inl:115
operator() at /home/frivold/kef_env/kef_ws/src/jluna_wrapper/src/gcc_test.cpp:12
__invoke_impl<void, main()::<lambda()>&> at /usr/include/c++/11/bits/invoke.h:61
__invoke_r<void, main()::<lambda()>&> at /usr/include/c++/11/bits/invoke.h:111
_M_invoke at /usr/include/c++/11/bits/std_function.h:291
operator() at /usr/include/c++/11/bits/std_function.h:560
operator() at /home/frivold/Code/jluna/install/include/jluna/.src/multi_threading.inl:252
__invoke_impl<_jl_value_t*, jluna::ThreadPool::create<>(const std::function<void()>&)::<lambda()>&> at /usr/include/c++/11/bits/invoke.h:61
__invoke_r<_jl_value_t*, jluna::ThreadPool::create<>(const std::function<void()>&)::<lambda()>&> at /usr/include/c++/11/bits/invoke.h:114
_M_invoke at /usr/include/c++/11/bits/std_function.h:291
_ZNKSt8functionIFP11_jl_value_tvEEclEv at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
jluna_invoke_from_task at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
#3 at ./none:813
unknown function (ip: 0x7f9540090b9f)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:877
Allocations: 1625914 (Pool: 1624963; Big: 951); GC: 1

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

Example from multithreading docs

int main() {
    using namespace jluna;
    using namespace std::chrono_literals;

    /// in main.cpp
    jluna::initialize(8);

    // task storage
    std::vector<Task<void>> tasks;

    {
        // declare lambda
        std::function<void()> print_numbers = []() -> void
        {
            for (size_t i = 0; i < 10000; ++i)
                std::cout << i << std::endl;
        };

        // add task to storage
        tasks.push_back(ThreadPool::create(print_numbers));

        // start just pushed task
        tasks.back().schedule();

        // wait for 1ms
        std::this_thread::sleep_for(1ms);
    }

    // wait for another 10ms
    std::this_thread::sleep_for(10ms);
    return 0;
}
[JULIA][LOG] initialization successful (8 thread(s)).

signal (11): Segmentation fault
in expression starting at none:0
unknown function (ip: 0x55cf03eabce0)
jluna_invoke_from_task at /home/frivold/Code/jluna/install/libjluna.so.0.9.1 (unknown line)
#3 at ./none:813
unknown function (ip: 0x7f0ac586291f)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:877
Allocations: 1626138 (Pool: 1625189; Big: 949); GC: 1
Clemapfel commented 1 year ago

This is definitely a bug as the docs example crashes as well, this may be related to #23. I will investigate an report back, I can reproduce the crash.

Clemapfel commented 1 year ago

During copy/move construction (which is triggered when a task is inserted into a vector), ThreadPool::free, which stores the function pointer for the thread, is erroneously called here. Because both the original and the newly move-constructed tasks point to the same global storage in ThreadPool, when the old task goes out of scope, it also deallocates the storage of the new task that is now in the vector. When the task attempts to execute, it segfaults because the storage is no longer there.

Fixed by this, the change is simple but I want to take my time and update the testing suite, I apologize for the fact this wasn't caught, it really should have been since the docs example explicitly calls it like this.

Thank you for bringing this up, I will try open a PR tomorrow.

paulerikf commented 1 year ago

Thanks for the quick fix @Clemapfel!

Noticed one remaining issue for the Task\<void> case: == should be !=.

Also, unrelated, but your docs are really really great. Thanks for all the work you put into this!