rollbear / trompeloeil

Header only C++14 mocking framework
Boost Software License 1.0
811 stars 86 forks source link

Segmentation fault / infinite loop in regular expression when handling fmt::is_formattable types #270

Closed reddwarf69 closed 2 years ago

reddwarf69 commented 2 years ago

So this is weird, but in

#include <fmt/format.h>
#include <trompeloeil.hpp>
#include <type_traits>

#if 1
namespace trompeloeil {

template<typename T>
struct printer<T, typename std::enable_if<fmt::is_formattable<T>::value>::type>
{
    static void print(std::ostream& os, const T& t) noexcept { os << fmt::format("{}", t); }
};

}
#endif

class mock
{
  public:
    MAKE_MOCK1(function_call, void(const std::string& text));
};

int main() {
    mock test;
    REQUIRE_CALL_V(test, function_call(trompeloeil::re("A")));
#if 0
    test.function_call("A");
#else
    test.function_call("B");
#endif
}

I get an infinite loop (optimized build) or segmentation fault with a backtrace with a loooooong recursion of trompeloeil::duck_typed_matcher<trompeloeil::lambdas::regex_check, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::operator char const*&&<char const*, void, bool>() const ( this=0x4632b8) at trompeloeil/include/trompeloeil.hpp:1151 (debug build) as long as those #if are as they are. If you remove the printer, or make the regular expression match, the problem goes away.

The problem happens with both gcc 11.2 and clang 13.

AndrewPaxie commented 2 years ago

Thanks for the report.

So far I am still setting up a working local test environment. Specifically, it appears that libfmt > 7.1.3 is required as is_formattable is not available in that version. (7.1.3 happened to be the first version I chose as it is available in my development distro, currently Ubuntu 21.10 (Impish Indri).) Will be create a local version of libfmt 8.1.1 rather than installing Ubuntu 22.04 (Jammy Jellyfish), currently in alpha.)

Is there any further variations you could provide e.g.

AndrewPaxie commented 2 years ago

I've been able to make some progress with libfmt 7.1.3 with the following stub code:

// Added to get something compiling with libfmt 7.1.3
namespace fmt {

template <typename>
struct is_formattable {
    static constexpr bool value = true;
};

} // namespace fmt

With this template class the following configurations segfault:

Also get a segfault when order of includes is reversed.

Both configurations complete with this output when the body of the print() static member function is commented out:

terminate called after throwing an instance of 'trompeloeil::expectation_violation'
  what():
No match for call of function_call with signature void(const std::string& text) with.
  param  _1 ==

Tried test.function_call(trompeloeil::re("A")) at test/issue_270.cpp:35
  Expected  _1
AndrewPaxie commented 2 years ago

G++ at head is more helpful at pointing out at compile-time what you already observed:

g++-latest / -std=c++11 / libstdc++-v3:

In file included from test/issue_270.cpp:2:
./include/trompeloeil.hpp: In member function ‘trompeloeil::duck_typed_matcher<Pred, T>::operator V&&() const [with V = const char*; <template-parameter-2-2> = void; <template-parameter-2-3> = bool; Pred = trompeloeil::lambdas::regex_check; T = {std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >}]’:
./include/trompeloeil.hpp:1151:5: warning: infinite recursion detected [-Winfinite-recursion]
 1151 |     operator V&&() const { return *this; }
      |     ^~~~~~~~
./include/trompeloeil.hpp:1151:36: note: recursive call
 1151 |     operator V&&() const { return *this; }
      |                                    ^~~~
reddwarf69 commented 2 years ago

About versions, it's easy to use the latest ones, specially if you go for header-only

$ git clone git@github.com:fmtlib/fmt.git
Cloning into 'fmt'...
remote: Enumerating objects: 28566, done.
remote: Counting objects: 100% (2368/2368), done.
remote: Compressing objects: 100% (242/242), done.
remote: Total 28566 (delta 2226), reused 2125 (delta 2103), pack-reused 26198
Receiving objects: 100% (28566/28566), 13.67 MiB | 5.15 MiB/s, done.
Resolving deltas: 100% (19317/19317), done.
$ git clone git@github.com:rollbear/trompeloeil.git
Cloning into 'trompeloeil'...
remote: Enumerating objects: 4838, done.
remote: Counting objects: 100% (606/606), done.
remote: Compressing objects: 100% (314/314), done.
remote: Total 4838 (delta 346), reused 430 (delta 202), pack-reused 4232
Receiving objects: 100% (4838/4838), 2.43 MiB | 811.00 KiB/s, done.
Resolving deltas: 100% (3280/3280), done.
$ g++ -o test test.cpp -Ifmt/include/ -Itrompeloeil/include -DFMT_HEADER_ONLY
$ ./test 
Segmentation fault (core dumped)
reddwarf69 commented 2 years ago
* Any change in behaviour if the body of `print(std::ostream&, const T&)` is commented out?

Yes, that makes it work. As does using fmt::print(os, "{}", t); (after including fmt/ostream.h) instead of os << fmt::format("{}", t);

AndrewPaxie commented 2 years ago

As an additional data point, I commented out the body of the duck_typed_matacher operator, like so:

   template <typename V,
              typename = detail::enable_if_t<!is_matcher<V>::value>,
              typename = invoke_result_type<Pred, V&&, T...>>
    operator V&&() const; // { return *this; }

This was to force a link error and so find which function(s) were referencing this operator.

Given the command

g++ -std=c++17 -o test issue_270.cpp -Ifmt/include/ -Itrompeloeil/include -DFMT_HEADER_ONLY

this is the output (reformatted):

/usr/bin/ld: /tmp/cc1U22RE.o: in function
`fmt::v8::format_arg_store<
  fmt::v8::basic_format_context<fmt::v8::appender, char>,
  std::remove_cv<
    std::remove_reference<
      trompeloeil::predicate_matcher<
        trompeloeil::lambdas::regex_check,
        trompeloeil::lambdas::regex_printer,
        trompeloeil::duck_typed_matcher<
          trompeloeil::lambdas::regex_check,
          std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >,
        std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&
    >::type
  >::type
>
fmt::v8::make_format_args<
  fmt::v8::basic_format_context<fmt::v8::appender, char>,
  trompeloeil::predicate_matcher<
    trompeloeil::lambdas::regex_check,
    trompeloeil::lambdas::regex_printer,
    trompeloeil::duck_typed_matcher<
      trompeloeil::lambdas::regex_check,
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >,
    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&>(
      trompeloeil::predicate_matcher<
        trompeloeil::lambdas::regex_check,
        trompeloeil::lambdas::regex_printer,
        trompeloeil::duck_typed_matcher<
          trompeloeil::lambdas::regex_check,
          std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >,
      std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)':
issue_270.cpp:(.text._ZN3fmt2v816make_format_argsINS0_20
basic_format_contextINS0_8appenderEcEEJRKN11trompeloeil17
predicate_matcherINS5_7lambdas11regex_checkENS7_13regex_printerENS5_18
duck_typed_matcherIS8_JNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEJSG_EEEEEENS0_16
format_arg_storeIT_JDpNSt9remove_cvINSt16remove_referenceIT0_E4typeEE4typeEEEEDpOSP_[_ZN3
fmt2v816make_format_argsINS0_20basic_format_contextINS0_8appenderEcEEJRKN11trompeloeil17
predicate_matcherINS5_7lambdas11regex_checkENS7_13regex_printerENS5_18duck_typed_matcher
IS8_JNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEJSG_EEEEEENS0_16
format_arg_storeIT_JDpNSt9remove_cvINSt16remove_referenceIT0_E4typeEE4typeEEEEDpOSP_]+0x34):
undefined reference to
`trompeloeil::duck_typed_matcher<
  trompeloeil::lambdas::regex_check,
  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::
  operator char const*&&<char const*, void, bool>() const'
collect2: error: ld returned 1 exit status

Now to find out why the operator was invoked.

AndrewPaxie commented 2 years ago

This reduced test case avoids the Trompeloeil mocking machinery and reproduces a segmentation fault, presumably the same fault as the submitted test case:

#include <fmt/format.h>
#include <trompeloeil.hpp>

#include <iostream>
#include <type_traits>

namespace trompeloeil {

template <typename T>
struct printer<T, typename std::enable_if<fmt::is_formattable<T>::value>::type>
{
    static void print(std::ostream& os, const T& t)
    {
        os << fmt::format("{}", t);
    }
};

} // namespace trompeloeil

int main()
{
  auto r = trompeloeil::re("A");
  trompeloeil::printer<decltype(r)>::print(std::cout, r);
  return 0;
}
AndrewPaxie commented 2 years ago

As a total aside to this investigation, SFINAE in partial specializations is not documented in the C++ language standard, according to this article on cppreference: https://en.cppreference.com/w/cpp/language/sfinae, which references CWG issue 2054.

No complaints from any of the compilers I have used so far to investigate this issue.

rollbear commented 2 years ago

I just started looking into this, and curiously the reduced example seems to work with fmt trunk, but crashes with 8.1.

https://godbolt.org/z/xc6hWoY5d

rollbear commented 2 years ago

I'm beginning to think fmt::is_formattable<> is broken in 8.1 and earlier. If I change the criteria to the result of trying to parse the formatting string for the type, then it works:

namespace trompeloeil {

template <typename T>
struct printer<T, std::void_t<decltype(std::declval<fmt::formatter<T>&>().parse(std::declval<fmt::format_parse_context&>()))>>
{
    static void print(std::ostream& os, const T& t)
    {
        os << fmt::format("{}", t);
    }
};

} // namespace trompeloeil

https://godbolt.org/z/4Y1PrT5q3

rollbear commented 2 years ago

I think fmt otherwise concludes that trompeloeil::re is printable as a C-string, since it wants to instantiate the conversion to const char* operator, which triggers the infinite recursion.

AndrewPaxie commented 2 years ago

This was the conclusion I got to as well, and two options within Trompeloeil suggest themselves.

The first is to include a static_assert in operator V&&() const, designed to fail if ever the operator is selected in an evaluated expression context. The failure message would have to be suitably helpful,

    template <typename V,
              typename = detail::enable_if_t<!is_matcher<V>{}>,
              typename = invoke_result_type<Pred, V&&, T...>>
    operator V&&() const { 
      // Always fail if called in an evaluated context.
      static_assert(is_matcher<V>{},
        "Only use operator V&&() const in an unevaluated context");
      return *this;
    }

The second option might be more risky, but it is to provide an operator const char* for the regex matcher., which would be selected ahead of the duck_type_matcher. My thinking didn't get as far as a suitable implementation.

rollbear commented 2 years ago

I don't think I like the operator const char* idea. Some library will do a similar thing but with std::string_view instead, or something similar.

I'm a bit curious about why the operator V&&() const is implemented. It doesn't have to be. The implementation was added with a commit comment saying "Resolve lingering warnings", about "-Wzero-as-null-pointer-constant" (1067123c91a8fdb365356d88f15d306255a8dc53). Is there another work around for that warning instead?

Not implementing it leads to a maybe somewhat confusing linking error, but I think that's less bad than a runtime crash. Maybe static_assert is better, since we get a chance to explain "why" to the poor user getting the error.

Unfortunately I believe the original problem is caused by a bug in fmt::is_formattable<>. It appears to be fixed in 9.0.0, (not available on compiler explorer, so only did a cursory check) and is definitely fixed on master.

AndrewPaxie commented 2 years ago

@rollbear: I don't think I like the operator const char* idea. Some library will do a similar thing but with std::string_view instead, or something similar.

I don't like the idea myself.

The actual "lingering warnings" being resolved in early versions of the Clang compilers were -Wundefined-func-template (clang++ 3.9) and -Wundefined-internal (clang++ 3.5, 3.6, 3.7, 3.8, 3.9). The -Wzero-as-null-pointer-constant warnings were generated in clang++ 10 and 11.

In resolving these warnings I ended up relying on this section regarding implicit instantiation in cppreference.com as being an accurate reflection of the standard (I haven't confirmed this though):

cppreference.com, "Implicit instantiation," in "Function template," last modified 26 July 2022. Available: https://en.cppreference.com/w/cpp/language/function_template Accessed: 21 August 2022

Implicit instantiation

When code refers to a function in context that requires the function definition to exist, or if the existence of the definition affects the semantics of the program, and this particular function has not been explicitly instantiated, implicit instantiation occurs. The list of template arguments does not have to be supplied if it can be deduced from context.

The existence of a definition of function is considered to affect the semantics of the program if the function is needed for constant evaluation by an expression, even if constant evaluation of the expression is not required or if constant expression evaluation does not use the definition.

(reading as C++11 or later)

Since a definition appears to be required by the language, I decided to give a minimal implementation, albeit one that has proved "worse" than not having one at all. I hope the static_assert addition adds sufficient additional context, while avoiding both a linking error and a runtime crash.

Thank you for discovering the changes made to fmt since 8.1. I am especially curious to see if they have "knocked out" an implicit conversion to const char* during the course of enabling a specific set of function overloads.

AndrewPaxie commented 2 years ago

I think this is the commit in fmtlib/fmt that fixed things for their library: https://github.com/fmtlib/fmt/commit/8a21e328b8dcb62a2901c499598366a0f5f3f4a5

rollbear commented 2 years ago

I will need to digest the standard. It feels odd to me that a function definition is required, but I can't argue against it. Does this mean we should implement all the other conversion operators too, that are only ever used in unevaluated context to match parameters in expectations? Be that as it may, though, a static_assert gives us an opportunity to provide a helpful message, so how do we best use this?

rollbear commented 2 years ago

So I went ahead to implement the static_assert "solution", and it turns out that clang-3.x really doesn't like it. I don't understand what's happening here. Does 3.x compile the function body anyway, and then discards the result? I guess one solution could be to just #ifdef out the assert, but that doesn't feel very good either.

https://github.com/rollbear/trompeloeil/actions/runs/3314469508/jobs/5473807228#step:7:287

https://godbolt.org/z/4x7oT1d9q

rollbear commented 2 years ago

I clamped my nose and added the #ifs. for clang-3.x. The compilation error message links to this issue which should at least help whomever stumbles upon the problem in the future. I consider this "fixed" now, to the extent it can be, and will close the issue when the next release is tagged.

rollbear commented 2 years ago

Just released v43 which includes the static_assert, so closing now.