boost-ext / ut

C++20 μ(micro)/Unit Testing Framework
https://boost-ext.github.io/ut
Boost Software License 1.0
1.26k stars 120 forks source link

Version 2.0.1 segfaults when using std::format and std::cout on multiple threads #618

Open stephenberry opened 8 months ago

stephenberry commented 8 months ago

Expected Behavior

v2.0.1 causes segfaults in std::format and streaming operator (std::cout). I've also seen it segfault in the boost ut streaming.

Actual Behavior

Typically std::abort is called

Steps to Reproduce the Problem

This is occurring in a complex codebase. But, it is essentially just using std::cout << std::format from multiple threads. By simply using boost ut I get segfaults. Sometimes the segfaults are in boost ut streaming, and sometimes they are in std::format or in std::cout. This happens less often with v2.0.0, but it still happens.

Are you using any static variables in your code that should be thread_local?

Specifications

stephenberry commented 6 months ago

I am bumping this issue because it's a big deal that adding some boost ut tests can cause segfaults in perfectly good, completely unassociated code. I think boost ut isn't handling std::cout with proper thread safety. This behavior has caused my team members to question the stability of boost ut and argue that gtest would be safer. I love boost ut and would like to help solve this bug. I can work on making a simplified example of the bug, but I was curious if you had any insights into what might be the issue and what I should try to stress concerning threading and std::cout?

kris-jusiak commented 6 months ago

Hi @stephenberry.

Firstly, apologies for not talking care of this issue earlier. Trying to reproduce the issue but I'm struggling with that. Would it be possible to provide some repro case, please? or the stack trace. It's likely that streaming is not guarded properly, will keep looking. Thanks.

stephenberry commented 6 months ago

Thanks for your reply, I'll work creating a simplified example, just wanted to save some effort if you were aware of a potential issue. Will get back to you soon.

kris-jusiak commented 6 months ago

Looking at the only format usage in UT

#if defined(BOOST_UT_HAS_FORMAT)                                                
#if __cpp_lib_format >= 202207L                                                 
  template <class... Args>                                                      
  void operator()(std::format_string<Args...> fmt, Args&&... args) {                     
    on<std::string>(                                                            
        events::log{std::vformat(fmt.get(), std::make_format_args(args...))});  
  }                                                                             
#else                                                                           
  template <class... Args>                                                      
  void operator()(std::string_view fmt, Args&&... args) {                       
    on<std::string>(                                                            
        events::log{std::vformat(fmt, std::make_format_args(args...))});        
  }                                                                             
#endif                                                                          
#endif                                                                          

that's going direclty to

std::cout << msg; 

so I guess it depends on which format is used and how as well. This string_view is a bit concerning is that the path which the code is talking during the crash.

@RalphSteinhagen do you have any ideas, I think that was part of the junit reporting change?

stephenberry commented 6 months ago

We are using std::format

RalphSteinhagen commented 6 months ago

@krzysztof-jusiak and @stephenberry thanks for the ping. :+1:

For sure, <iostream>, <print>, probably also <format> and parts of UT are not inherently thread-safe and would need to be guarded and/or piped through a lock-free message queue which I haven't put in place. I just extended the bits trying to/in the hope of following the existing UT style.

In our team, we usually tell people not to use UT in a multi-threaded context since, among other things, the static instantiations of some of the features are inherently not thread-safe.

For comparison, most other frameworks are absolutely thread-unsafe and hard to debug in part due to their massive use of macros.

However, this does not mean UT cannot be made thread-safe(r).

I guess, the first steps could be:

stephenberry commented 6 months ago

Thanks for the additional details. It would be good to at least document in the README that ut is not thread safe.

I was able to build a very simple example of the problem. The code below works fine without ut, but by simply adding ut then we get segfaults. This means that pretty much no multi-threading with output can be used in a program to be tested by ut.

#include "boost/ut.hpp"
#include <iostream>
#include <format>
#include <vector>

#include <thread>

int main(int argc, const char** argv)
{
    using namespace boost::ut;
    using namespace boost::ut::literals;

    // Optional Unit test filtering
    // https://github.com/boost-ext/ut/blob/master/example/filter.cpp
    cfg<override> = { .filter = argc == 1 ? "" : argv[1] };

    std::vector<std::jthread> threads;

    for (auto t = 0; t < 10; ++t) {
        threads.emplace_back(std::jthread([] {
            for (size_t i = 0; i < 100; ++i) {
                std::cout << std::format("{}", i) << '\n';
            }
            }));
    }

    expect(true == true);

    for (auto& t : threads) {
        t.join();
    }
}
stephenberry commented 6 months ago

I love how simple ut is, but not being able to have standard output on any other threads makes it difficult to use in projects with more than one thread, because trying to track down and remove every possible cout in all dependencies is often not feasible.

kris-jusiak commented 6 months ago

Thanks for the example and info.

Regarding making UT safe thread. Indeed there are a few things to consider such as sync output and internal details. It also depends on whether tests are going to be run in parallel or the use case is just for client code to be multi-threaded with output. The latter should be relatively simple as we can just use std::osyncstream which C++20 synchronized output stream. We can just use that instead of clog and the above example should be fine IMHO. That's an easy change we can try if that's desired? Full thread safety mode for testing is more complex due to internals, we could implement a thread-safe runner though, which would isolate (aka copy) data to each runner (let's in a thread pool) and then we can just sum in the main thread but I'm not sure, that requires more considerations for sure. For now, I think we can experiment with sync stream as I believe that should solve the problem at hand?

stephenberry commented 6 months ago

One curious thing is that if you even delete expect(true == true); so that no tests are running, simply including the boost ut header causes segfaults. This seems like a core design issue because we aren't even colliding on output.

stephenberry commented 6 months ago

I do think std::osyncstream may solve most problems, but in my comment above, I think something worse is going on than ostream synchronization.

kris-jusiak commented 6 months ago

It's likely the change which grabs argv without passing, it was causing a lot of issues. BTW some issues can be solved by using v1.1.9 if you don't need fancy stuff with syncstream that may actually do it.

stephenberry commented 6 months ago

You're right, v1.1.9 works without error. Do you think the latest version can be fixed?

kris-jusiak commented 6 months ago

Yeah, but if not I'll revert that change as there are a lot of issues with it.

stephenberry commented 6 months ago

Would you recommend moving to ut2. It doesn't look like it would suffer from this issue. Do you intend to focus more on ut2 now?

kris-jusiak commented 6 months ago

Well, it depends, let me get back to that at then of this message. Firstly, I wanted to share some thoughts, ideas for your issues.

The example code is not thread-safe so it would require osyncstream or similar, which solve the issue in this particular example but not generically as UT is not thread safe by design. But if you are printing in your code a lot the streams would have to be always synchronized between ut end the system under test which is hard to solve.

threads.emplace_back(std::jthread([] {
  for (size_t i = 0; i < 100; ++i) {
        std::osyncstream scout(std::cout);
        scout << std::format("{}", i) << '\n';
  }
}));

Some ideas how to tackle that with the current UT.

The output can be disabled whilst running tests which helps with the stream sync issues but doesn't solve other sync issues.

    std::cout.setstate(std::ios_base::failbit);
    std::clog.setstate(std::ios_base::failbit);
    std::cerr.setstate(std::ios_base::failbit);

Use processes instead of threads? UT supports custom runners, so instead of threads (unless they are needed) tests could be run in different processes which would eliminate synchronization issues but has different drawbacks.

So long story short, making output thread safe would require thread safe across ut and clients and client itself. Full thread safe would require changing UT design which is a lot of work at this stage. Processes might be an alternative to consider?

Regarding ut2. ut2 takes the best, IMHO, feature from ut and applies all the learning too, so in that case is superior. However by design it will never be feature rich as ut which is more mainstream nowadays, with junit support etc. ut2 is also has totally different execution model - it's both compile-time and run-time and it compiles 10-100x faster so there are benefits but it's more limited. Making ut2 output is even optional so there are no issues with output there but it's not thread safe either, but it's very easy to customize to make it happen with a custom cfg. All in all depends, on what is required for testing? is junit output essential as well as fancy features then ut is more suitable, otherwise ut2 is improved product over ut but it doesn't have all the features.

stephenberry commented 6 months ago

Thanks so much for your detailed thoughts. The core of my current issue it turns out has nothing to do with thread collisions on cout, but rather grabbing argv without parsing. My code wouldn't crash if this bug were fixed because we actually never call cout when boost ut would also be calling cout.

As per threading, I would be happy if I could just direct ut messages to a buffer (e.g. a std::string) instead of cout, which I could then inspect at my leisure. I don't think any synchronization is necessary. But, I do think ut and ut2 should express in their READMEs that multi-threaded synchronization is not supported.

kris-jusiak commented 6 months ago

Could you please elaborate on grabbing argv without parsing problem. I know there is an issue there but was struggling to repro it therefore fixing it. Any chance for some repro example? Can release fixed version as soon as I can repro it, fixed should be easy but I just have to understand the problem. Thanks.

Indeed, will update README.

stephenberry commented 6 months ago

The program below causes std::abort to be called. Just including the boost ut header causes the program to crash.

#include "boost/ut.hpp"
#include <iostream>
#include <format>
#include <vector>

#include <thread>

int main(int argc, const char** argv)
{
    using namespace boost::ut;
    using namespace boost::ut::literals;

    cfg<override> = { .filter = argc == 1 ? "" : argv[1] };

    std::vector<std::jthread> threads;

    for (auto t = 0; t < 10; ++t) {
        threads.emplace_back(std::jthread([] {
            for (size_t i = 0; i < 100; ++i) {
                std::cout << std::format("{}", i) << '\n';
            }
            }));
    }

    for (auto& t : threads) {
        t.join();
    }
}
stephenberry commented 6 months ago

v1.1.9 works without error. You had mentioned above:

It's likely the change which grabs argv without passing, it was causing a lot of issues. BTW some issues can be solved by using v1.1.9 if you don't need fancy stuff with syncstream that may actually do it.

kris-jusiak commented 6 months ago

That's my suspicions, but I'm struggling to repro mentioned cases. Can you share your compiler, compiler flags, os?

For me (linux, clang-17/gcc13, -O3/-O0)

stephenberry commented 6 months ago

Strange, I don't get crashes with the included code, and I thought this was safe.

https://stackoverflow.com/questions/6374264/is-cout-synchronized-thread-safe

Concurrent access to a synchronized (§27.5.3.4) standard iostream object’s formatted and unformatted input (§27.7.2.1) and output (§27.7.3.1) functions or a standard C stream by multiple threads shall not result in a data race (§1.10). [ Note: Users must still synchronize concurrent use of these objects and streams by multiple threads if they wish to avoid interleaved characters. — end note ]

According to this source C++11 guarantees no data races.

This behavior is consistent for me with GCC 13.2, 13.3, and the latest MSVC.

RalphSteinhagen commented 4 months ago

Hi @krzysztof-jusiak,

I’ve been thinking about the thread safety issue. What are your thoughts on encapsulating and redirecting error message streams via std::basic_osyncstream? This should ensure thread-safe output with minimal performance impact for most tests.

Would this be something you’d consider merging if I created a PR?

N.B. the argv issue is a different story and more related to whether unit-tests should be executed/defined as static entities or always explicitly invoked via main(...).

kris-jusiak commented 4 months ago

Hi @RalphSteinhagen, Yes, Please! I would really appreciate it and agree that osyncstream is the way to go here. Thank you!