pboettch / json-schema-validator

JSON schema validator for JSON for Modern C++
Other
466 stars 134 forks source link

segmentation fault in json_validator::validate if using _LIBCPP_DEBUG #177

Closed eike-fokken closed 2 years ago

eike-fokken commented 2 years ago

Hi!

If I compile with clang, libc++ (llvms implementation of the C++ standard library) and there with the debug version, there is a segfault in json_validator::validate.

The essence is this: If you compile with clang++ -stdlib=libc++ -D_LIBCPP_DEBUG=1, depending on the (valid) schema provided, there are segfaults in validate.

int main() {
json j = ...;
json schema = ...;
  // check that the schema itself is valid:
  json_validator validator;

  try {
    validator.set_root_schema(schema);
  } catch (const std::exception &e) {
    std::cout << "Couldn't validate the data, because the schema itself is faulty: "
      << e.what() << "\n";
  }
  // check that the schema itself is valid:
  json_validator validator;
  validator.set_root_schema(schema);

  // validate:
  validator.validate(j);
    }

Unfortunately I am not good with schemas, therefore my minimal example is not thaaat minimal and I have put it into a git repo:

https://github.com/eike-fokken/debug_segfault_libcpp_debug

you can trigger the segfault by:

git clone --recursive git@github.com:eike-fokken/debug_segfault_libcpp_debug.git
cd debug_segfault_libcpp_debug
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_COMPILER=clang++
cmake --build build && build/test 
eike-fokken commented 2 years ago

Note that I am compiling on Linux. On Macos this would not compile because Apple doesn't ship the debug version of libc++.

eike-fokken commented 2 years ago

I'm not sure whether it helps, but here are backtraces from gdb, one normal one and one full backtrace.

backtrace.txt

full_backtrace.txt

pboettch commented 2 years ago

I'm afraid it seems you're hitting a bug in libc++ or I'm having an undefined behavior in my code which I'm not aware (also likely):

I tackled the crash down to this:

std::cerr << "instance is below minimum of " << minimum_.second << "\n";
std::cerr << "instance is below minimum of " << std::to_string(minimum_.second) << "\n";
auto v = std::to_string(minimum_.second);
std::cerr << "instance is below minimum of " << v << "\n";
std::cerr << "instance is below minimum of " + v << "\n";
std::cerr << "instance is below minimum of " + std::to_string(minimum_.second) << "\n";

e.error(ptr, instance, "instance is below minimum of " + std::to_string(minimum_.second));

It was the e.error-line which was crashing first. Now everything is working up to the last std::cerr-line.

It seems that there is a problem with literal_string + std::to_string-result when inside a templated function? Not sure the templace plays a role. I'll try something.

pboettch commented 2 years ago

Instead of using maxmum==minimum, could you use const in your schema?

pboettch commented 2 years ago

And your schema seems not work. I'm getting the error (where it crashes) and if I read the schema correctly it should be fine (there is one value where x = 0 and one where x = 1).

pboettch commented 2 years ago

I created a snippet which crashes for me (no more dependency to json or schema):

#include <iostream>

int main()
{
    std::pair<bool, double> minimum_{true, 1};
    std::cerr << "instance is below minimum of " + std::to_string(minimum_.second) << "\n";

    return 0;
}

Does it crash for you?

eike-fokken commented 2 years ago

Instead of using maxmum==minimum, could you use const in your schema?

I guess? I have really no real idea about schemas. A colleague of mine who is no longer in my group wrote our schema-generation code. Right now I don't really have time to dive into it. Is it a bad idea to use minimum==maximum?

eike-fokken commented 2 years ago

Does it crash for you?

It does. That means, it's a bug with libcpp, right?

pboettch commented 2 years ago

I'd say so. Maybe it is forbidden by a C++ rule I'm not aware of.

eike-fokken commented 2 years ago

Thank you for your time! I'll file the bug at libcpp.

pboettch commented 2 years ago

Please keep me posted about that problem.

eike-fokken commented 2 years ago

will do

On Mon, 2021-10-18 at 06:10 -0700, Patrick Boettcher wrote:

Please keep me posted about that problem. — You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub, or unsubscribe.

eike-fokken commented 2 years ago

Here it is:

https://bugs.llvm.org/show_bug.cgi?id=52209

pboettch commented 2 years ago

You're saying that it is const char * + something. I think it is rather related to the std::pair and std::to_string.

eike-fokken commented 2 years ago

You're comment prompted me to dive a little deeper and it seems to have something to do with short string optimization (the problem vanishes, if the total size of the string is below or equal to 22). If you like, have a look in the llvm bugtracker, I've updated the issue there. In case you want to implement a workaround before this gets fixed in libcpp, I think you could just replace

std::cerr << "instance is below minimum of " + std::to_string(minimum_.second) << "\n";

with

std::cerr << "instance is below minimum of " << std::to_string(minimum_.second) << "\n";

But I can understand if you would rather not touch the code for a bug in another project.

pboettch commented 2 years ago

I can't use << because I pass it as an argument to a function which takes a string - so I have to concatenate it.

And yes, fixing something in my library which is clearly a serious bug (if it is) in a standard-lib.

The problem seems to disappear if you use a straight double instead of the pair's second.

eike-fokken commented 2 years ago

The problem seems to disappear if you use a straight double instead of the pair's second.

Maybe it's two bugs. I have some example programs in the bugzilla thread, where I use

std::cout << "instance is below minimum of " + std::to_string(1.0) << "\n";

and it still triggers the bug. Whatever the reason, I'll not think about it further for now.

pboettch commented 2 years ago

Ah OK, good to know