etr / libhttpserver

C++ library for creating an embedded Rest HTTP server (and more)
GNU Lesser General Public License v2.1

[BUG] Data races #329

Open FlorianChevassu opened 1 year ago

FlorianChevassu commented 1 year ago

Prerequisites

Description

The documentation states that "All functions are guaranteed to be completely reentrant and thread-safe (unless differently specified)". However, the webserver implementation does not protect its member variables (registered_resources, registered_resources_str, bans and allowances) when reading or modifying them.

Steps to Reproduce

Here is a small test that shows the issue when compiled using clang-16 with thread sanitizing enabled:


LT_BEGIN_AUTO_TEST(basic_suite, thread_safety)
    simple_resource resource;

    // Flag used to stop both worker threads.
    std::atomic_bool done = false;
    // Writer: keeps registering new routes while the server handles requests.
    auto register_thread = std::thread([&]() {
        int i = 0;
        while (!done) {
            ws->register_resource(
                    std::string("/route") + std::to_string(++i), &resource);
        }
    });

    // Reader: keeps sending GET requests so that the server thread walks the
    // resource map while the writer above is inserting into it.
    auto get_thread = std::thread([&](){
        while (!done) {
            CURL *curl = curl_easy_init();
            if (curl == nullptr) continue;  // skip this round if curl setup failed
            std::string s;
            std::string url = "localhost:" PORT_STRING "/route" + std::to_string(
                                            (int)((rand() * 10000000.0) / RAND_MAX));
            curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
            curl_easy_setopt(curl, CURLOPT_HTTPGET, 1L);
            curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writefunc);
            curl_easy_setopt(curl, CURLOPT_WRITEDATA, &s);
            curl_easy_perform(curl);
            curl_easy_cleanup(curl);
        }
    });

    using namespace std::chrono_literals;
    std::this_thread::sleep_for(10s);
    done = true;
    if (register_thread.joinable()) {
        register_thread.join();
    }
    if (get_thread.joinable()) {
        get_thread.join();
    }
    LT_CHECK_EQ(1, 1);
LT_END_AUTO_TEST(thread_safety)

Expected behavior:

No data races are detected by clang.

Actual behavior: Clang reports the following data race (and others):

Running test (1): thread_safety
==================
WARNING: ThreadSanitizer: data race (pid=21795)
  Read of size 8 at 0x7b1400003550 by thread T1:
    #0 memcmp /clang-16.0.1/projects/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:939:3 (lt-basic+0x67439)
    #1 memcmp /clang-16.0.1/projects/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:935:1 (lt-basic+0x67439)
    #2 httpserver::webserver::finalize_answer(MHD_Connection*, httpserver::details::modded_request*, char const*) <null> (libhttpserver.so.0+0x227b2)
    #3 MHD_connection_handle_idle <null> (libmicrohttpd.so.12+0xd1ab) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #4 call_handlers daemon.c (libmicrohttpd.so.12+0x118af) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #5 MHD_epoll daemon.c (libmicrohttpd.so.12+0x194ad) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #6 MHD_polling_thread daemon.c (libmicrohttpd.so.12+0x1a02e) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #7 named_thread_starter mhd_threads.c (libmicrohttpd.so.12+0x24515) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)

  Previous write of size 8 at 0x7b1400003550 by thread T2:
    #0 operator new(unsigned long) /clang-16.0.1/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp:64:3 (lt-basic+0xea6e5)
    #1 std::pair<std::_Rb_tree_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>>, bool> std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>>>::_M_emplace_unique<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, httpserver::http_resource*>>(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, httpserver::http_resource*>&&) <null> (libhttpserver.so.0+0x2669a)

  Location is heap block of size 72 at 0x7b1400003520 allocated by thread T2:
    #0 operator new(unsigned long) /clang-16.0.1/projects/compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp:64:3 (lt-basic+0xea6e5)
    #1 std::pair<std::_Rb_tree_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>>, bool> std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>>, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, httpserver::http_resource*>>>::_M_emplace_unique<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, httpserver::http_resource*>>(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, httpserver::http_resource*>&&) <null> (libhttpserver.so.0+0x2669a)

  Thread T1 'MHD-single' (tid=21814, running) created by main thread at:
    #0 pthread_create /clang-16.0.1/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1048:3 (lt-basic+0x2db2f)
    #1 MHD_create_thread_ <null> (libmicrohttpd.so.12+0x2443d) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #2 MHD_create_named_thread_ <null> (libmicrohttpd.so.12+0x24602) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #3 MHD_start_daemon_va <null> (libmicrohttpd.so.12+0x1e4de) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #4 MHD_start_daemon <null> (libmicrohttpd.so.12+0x1a1d6) (BuildId: 72677d816e65dce550957833f9aea14ac2e0e4c8)
    #5 httpserver::webserver::start(bool) <null> (libhttpserver.so.0+0x20090)

  Thread T2 (tid=21815, running) created by main thread at:
    #0 pthread_create /clang-16.0.1/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1048:3 (lt-basic+0x2db2f)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State>>, void (*)()) <null> (libstdc++.so.6+0xed8f9) (BuildId: 06e9553aa6e15b32e77410de83bbdd7d208a620d)

SUMMARY: ThreadSanitizer: data race (/home/florian/work/libhttpserver/build/src/.libs/libhttpserver.so.0+0x227b2) in httpserver::webserver::finalize_answer(MHD_Connection*, httpserver::details::modded_request*, char const*)
==================
- Time spent during "thread_safety": 10702.4 ms
==================

Reproduces how often: 100%

Versions

etr commented 1 year ago

I am trying to understand the effect of a data race in this case before we add locks, which I am actively trying to keep out of the main response flow.

In the case of both registered resources and ban/unban, the only thing we care about is the presence or absence of an entry (not the order of insertion). Isn't the worst-case scenario, then, that someone registers a resource and for some time it is not seen as registered (or vice versa)? The same applies to ban/unban. I wonder if all we need to do is document eventual consistency.

I am much more worried about race conditions, which can happen if multiple threads are continually registering/unregistering (writing) the same resource (or banning/unbanning the same IP). Kinda insane to do, but I guess it is a legitimate edge case. In this case, I think we should lock on writes so that they don't clash with other writes. Alternatively, I wonder if we should just mark those two specific methods as not thread-safe in the documentation.
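One way to serialize writers while keeping the lookup itself free of long-held locks is a copy-on-write snapshot. The following is only a sketch under stated assumptions, not libhttpserver's implementation: `snapshot_registry` and the `resource` struct are hypothetical names standing in for the webserver's resource map and `http_resource`. Writers take a mutex, copy the map, modify the copy and publish it; readers copy a pointer under a very short lock and then search an immutable map.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for httpserver::http_resource.
struct resource {};

// Hypothetical copy-on-write registry; not part of libhttpserver.
class snapshot_registry {
 public:
    using table = std::map<std::string, resource*>;

    // Writers are serialized and publish a fresh copy of the map.
    bool register_resource(const std::string& path, resource* res) {
        assert(res != nullptr);
        std::lock_guard<std::mutex> guard(write_mutex_);
        auto next = std::make_shared<table>(*load());
        bool inserted = next->emplace(path, res).second;
        if (inserted) store(std::move(next));
        return inserted;
    }

    bool unregister_resource(const std::string& path) {
        std::lock_guard<std::mutex> guard(write_mutex_);
        auto next = std::make_shared<table>(*load());
        bool erased = next->erase(path) == 1;
        if (erased) store(std::move(next));
        return erased;
    }

    // Readers search a snapshot that no thread ever modifies.
    resource* find(const std::string& path) const {
        std::shared_ptr<const table> snap = load();
        auto it = snap->find(path);
        return it == snap->end() ? nullptr : it->second;
    }

 private:
    // The pointer mutex is held only while the shared_ptr is copied or swapped.
    std::shared_ptr<const table> load() const {
        std::lock_guard<std::mutex> guard(ptr_mutex_);
        return current_;
    }
    void store(std::shared_ptr<const table> next) {
        std::lock_guard<std::mutex> guard(ptr_mutex_);
        current_ = std::move(next);
    }

    std::mutex write_mutex_;
    mutable std::mutex ptr_mutex_;
    std::shared_ptr<const table> current_{std::make_shared<table>()};
};
```

The trade-off is that every write copies the whole map, which is probably acceptable only if registrations are rare compared to requests. A reader may briefly see the previous snapshot, which matches the "eventual consistency" behaviour discussed above without any thread reading a map that is being modified.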

FlorianChevassu commented 1 year ago

The underlying std::map implementation is usually based on red-black trees. This means that an insertion might trigger a re-balancing operation that will modify the internal structure of the tree. Iterating over the map while it is being re-balanced is UB, and might lead to weird issues, not just the "presence/absence" of an entry.

Updating the documentation to clearly state that these methods are not thread-safe would imply that the user needs a way to execute them on the same thread that executes the finalize_answer method. And if I understand correctly, depending on the thread model used, it may be executed on different threads...

sources:

etr commented 1 year ago

> The underlying std::map implementation is usually based on red-black trees. This means that an insertion might trigger a re-balancing operation that will modify the internal structure of the tree. Iterating over the map while it is being re-balanced is UB, and might lead to weird issues, not just the "presence/absence" of an entry.

That is technically incorrect.

The C++ standard guarantees that concurrent non-const accesses to a container are safe if the writes/reads are to different elements of the container. You can see this in 23.1.2/8 and 23.2.2: "The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements".

Specifically for map, the requirements on associative containers mean that an implementation must not invalidate existing elements as a side effect.

> Updating the documentation to clearly state that these methods are not thread-safe would imply that the user needs a way to execute them on the same thread that executes the finalize_answer method.

Barring the previous point, I think you are correct that there is still an issue with concurrent modification (insertion/removal) of the same element, in both cases (resources and bans).

Because of this, I think your PR is in the right direction if we want to keep the functionalities.

There are of course legitimate use cases for banning/allowing IPs during the execution of a webserver. I am generally less worried about locking there, as it is more isolated.

I want to check with folks using the library whether they need to register (and especially unregister) resources after a webserver has started. From my limited visibility, that is a feature we might as well lose, making the registered resources structure effectively const after the webserver starts.
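To illustrate what "effectively const after the webserver starts" could look like, here is a minimal sketch. `frozen_registry` and `resource` are hypothetical names, not libhttpserver types, and the sketch assumes `start()` is called once, before any request-handling thread exists.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for httpserver::http_resource.
struct resource {};

// Hypothetical registry that rejects changes once the server has started.
class frozen_registry {
 public:
    // Allowed only before start(); throws afterwards.
    void register_resource(const std::string& path, resource* res) {
        assert(res != nullptr);
        if (started_) {
            throw std::logic_error("registry is frozen after start()");
        }
        routes_.emplace(path, res);
    }

    // Assumption: called once, before request-handling threads are created,
    // so thread creation orders this write before any concurrent read.
    void start() { started_ = true; }

    // Const lookup: with no writer left, any number of threads may call it.
    resource* find(const std::string& path) const {
        auto it = routes_.find(path);
        return it == routes_.end() ? nullptr : it->second;
    }

 private:
    std::map<std::string, resource*> routes_;
    bool started_ = false;
};
```

With this design the request path needs no synchronization at all, at the cost of losing registration and unregistration at runtime.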

FlorianChevassu commented 1 year ago

> The C++ standard guarantees that concurrent non-const accesses to a container are safe if the writes/reads are to different elements of the container.

Yes, concurrently modifying different elements of the container is not an issue, because as you said, iterators are not invalidated. But the fact that iterators are not invalidated only means that you can safely use/modify the value an iterator points to, not that you can safely increment the iterator.

The standard provides an exhaustive list of the container non-const member functions that can be called concurrently without creating data races, and insert and erase are not part of this list.
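Since insertion and removal are not among those functions, the usual remedy is to guard the map so that lookups never overlap with a modification. Below is a minimal sketch using a reader/writer lock (C++17 `std::shared_mutex`). `locked_registry` and `resource` are hypothetical names used for illustration only, and this is not necessarily what the PR does.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical stand-in for httpserver::http_resource.
struct resource {};

// Hypothetical registry guarded by a reader/writer lock.
class locked_registry {
 public:
    // Writers take the lock exclusively: no reader or other writer runs.
    bool register_resource(const std::string& path, resource* res) {
        assert(res != nullptr);
        std::unique_lock<std::shared_mutex> lock(mutex_);
        return routes_.emplace(path, res).second;
    }

    bool unregister_resource(const std::string& path) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        return routes_.erase(path) == 1;
    }

    // Readers share the lock, so concurrent lookups do not block each other.
    resource* find(const std::string& path) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = routes_.find(path);
        return it == routes_.end() ? nullptr : it->second;
    }

 private:
    mutable std::shared_mutex mutex_;
    std::map<std::string, resource*> routes_;
};
```

Note that the lock protects only the map. The lifetime of the returned `resource*` after a concurrent `unregister_resource` is still the caller's responsibility in this sketch.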

etr commented 1 year ago

As said above, data races are not the issue; race conditions are. Data races can indeed happen (as you say), but they don't matter in the use case of this library. Race conditions would be a problem, and those are what I am saying the standard prevents.

In any case, this is beside the point, given that modification of the same element is still possible through the current interface, so it is not worth debating further.

From my chats with a few clients, there seem to be some legitimate use cases for keeping the ability to add/remove resources while the service executes. As such, your PR seems to be going in the right direction. I'll leave you specific comments there.