yhirose / cpp-httplib

A C++ header-only HTTP/HTTPS server and client library
MIT License
12.77k stars 2.26k forks source link

AddressSanitizer: stack-overflow in regex_executor.tcc #1264

Open pietroborrello opened 2 years ago

pietroborrello commented 2 years ago
Describe the bug

AddressSanitizer: stack-overflow in regex_executor.tcc

To Reproduce

Built cpp-httplib using clang-10 according to the oss-fuzz script with CXXFLAGS='-O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr'

commit: 696239d6e1fa93d5a5ea8b37fac1a55c2da162fc

testcase: httplib-server_fuzzer.zip

ASAN Output (trimmed since it was too long)
$ ./httplib-server_fuzzer id:000002,sig:11,src:000873,time:46734119,op:havoc,rep:4,trial:4
Reading 8611 bytes from .id:000002,sig:11,src:000873,time:46734119,op:havoc,rep:4,trial:4
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2678705==ERROR: AddressSanitizer: stack-overflow on address 0x7fffff7fedc8 (pc 0x00000049ac19 bp 0x7fffff7ff610 sp 0x7fffff7fedd0 T0)
    #0 0x49ac19 in __asan_memcpy (server_fuzzer+0x49ac19)
    #1 0x69c9e6 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_rep_once_more(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:181:18
    #2 0x698da7 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_handle_repeat(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:212:4
    #3 0x698a42 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_dfs(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:515:4
    #4 0x69a694 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_handle_match(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:329:8
    [...]
    #246 0x69cb7d in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_rep_once_more(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:184:4
    #247 0x698da7 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_handle_repeat(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:212:4
    #248 0x698a42 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_dfs(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:515:4

SUMMARY: AddressSanitizer: stack-overflow (server_fuzzer+0x49ac19) in __asan_memcpy
==2678705==ABORTING
yhirose commented 2 years ago

@pietroborrello, thanks for the report. Does this problem only happen with the commit 696239d?

pietroborrello commented 2 years ago

Thank you for your reply. That is the commit I tested the fuzzer on, not sure on master, but it is really a recent commit

yhirose commented 2 years ago

@pietroborrello, your log shows that the problem is happening at /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex_executor.tcc:181:18. Could you provide more information to prove that it's caused by cpp-httplib? Thanks!

pietroborrello commented 2 years ago

Hello, I confirm the issue seems to reproduce on the current master. The ASAN stack trace does not provide full information, sorry for the confusion. The full (huge) stack trace under gdb shows the issue is triggered by httplib::Server::dispatch_request calling std::regex_match on what seems a worst-case scenario that incurs a stack-overflow.

https://github.com/yhirose/cpp-httplib/blob/1be1b3a86ddcf429e72cb444dcb03068a73ad166/httplib.h#L5500

#33534 0x0000000000665f28 in std::__detail::__regex_algo_impl<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, char, std::__cxx11::regex_traits<char>, (std::__detail::_RegexExecutorPolicy)0, true> (__s=0x2f, __e=0x0, __m=..., __re=..., __flags=(unknown: 0)) at /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex.tcc:80
#33535 0x000000000066526e in std::regex_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, char, std::__cxx11::regex_traits<char> > (__s=0x2f, __e=0x0, __m=..., __re=..., __flags=(unknown: 0)) at /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex.h:2066
#33536 0x000000000066450c in std::regex_match<std::char_traits<char>, std::allocator<char>, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, char, std::__cxx11::regex_traits<char> > (__s=Python Exception <class 'gdb.error'> There is no member named _M_dataplus.: 
, __m=..., __re=..., __flags=(unknown: 0)) at /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/regex.h:2142
#33537 0x00000000006d9175 in httplib::Server::dispatch_request(httplib::Request&, httplib::Response&, std::vector<std::pair<std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >, std::function<void (httplib::Request const&, httplib::Response&)> >, std::allocator<std::pair<std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >, std::function<void (httplib::Request const&, httplib::Response&)> > > > const&) (this=0x12b1600 <g_server>, req=..., res=..., handlers=std::vector of length 1, capacity 1 = {...}) at ../../httplib.h:5500
#33538 0x00000000005ee6ed in httplib::Server::routing (this=0x12b1600 <g_server>, req=..., res=..., strm=...) at ../../httplib.h:5479
#33539 0x00000000005e2a1d in httplib::Server::process_request(httplib::Stream&, bool, bool&, std::function<void (httplib::Request&)> const&) (this=0x12b1600 <g_server>, strm=..., close_connection=0x0, connection_closed=@0x7fffffffd880: 0x1, setup_request=...) at ../../httplib.h:5728
#33540 0x00000000004d6af5 in FuzzableServer::ProcessFuzzedRequest (this=0x12b1600 <g_server>, stream=...) at server_fuzzer.cc:49
#33541 0x00000000004d6494 in LLVMFuzzerTestOneInput (data=0x625000005100 "POST /fform%u008ano", '\324' <repeats 8072 times>, "\343m%u0\034\070a\201g HTTP/1.0\r\nDon", size=0x21a3) at server_fuzzer.cc:86
#33542 0x00000000007494fa in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) ()
#33543 0x00000000007345ca in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) ()
#33544 0x00000000007394d3 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) ()
#33545 0x0000000000734353 in main ()
yhirose commented 2 years ago

@pietroborrello, I was trying to reproduce this problem on my machine, but I couldn't... What I did was to copy your test case to test/fuzzing/corpus as issue1264, and run make fuzz_test with your CXXFLAGS setting. However, it didn't report any error message...

Here is the result. Did I miss something?

[master] ~/cpp-httplib/test$ make fuzz_test
./server_fuzzer fuzzing/corpus/*
Reading 32 bytes from fuzzing/corpus/1
Execution successful
Reading 159 bytes from fuzzing/corpus/2
Execution successful
Reading 54 bytes from fuzzing/corpus/3
Execution successful
Reading 1041826 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5372331946541056
Execution successful
Reading 787294 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5386708825800704
Execution successful
Reading 317 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5667822731132928
Execution successful
Reading 395 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5942767436562432
Execution successful
Reading 878956 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-6508706672541696
Execution successful
Reading 8611 bytes from fuzzing/corpus/issue1264
Execution successful
Execution finished
pietroborrello commented 2 years ago

I am not sure why it does not work for you, to reproduce on my machine I do exactly this (commit fee8e97b4eeb34fe2e6e6294413d84e9e7a072a7) :

$ CXX=clang++ CXXFLAGS='-O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr' LIB_FUZZING_ENGINE='./libFuzzer.a' make -j8 server_fuzzer

$ ./server_fuzzer ./corpus/issue1264 
INFO: Seed: 234439723
./server_fuzzer: Running 1 inputs 1 time(s) each.
Running: ./corpus/issue1264
Segmentation fault

I am not sure what are you using as LIB_FUZZING_ENGINE and whether or not it affects it. My libFuzzer.a is obtained just by compiling it from llvm, in the ossfuzz/fuzzbench way but without docker. I attach both the script to generate it, and the library itself.

libfuzzer_helper.zip

pietroborrello commented 2 years ago

Uh I confirm that if I use the standalone_fuzz_target_runner.cpp provided as LIB_FUZZING_ENGINE, it does not crash. I'm not sure what the official llvm driver does to affect this issue to be exposed.

pietroborrello commented 2 years ago

I can also confirm it reproduces when compiling with AFLplusplus with default configurations:

$ CXX=./AFLplusplus/afl-clang-fast++ CXXFLAGS='-O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr' LIB_FUZZING_ENGINE='AFLplusplus/libAFLDriver.a' make -j8 server_fuzzer
pietroborrello commented 2 years ago

Ok the problem is that in standalone_fuzz_target_runner.cpp you do not call LLVMFuzzerInitialize which is actually used by the harness to setup different stuff. While both AFL++ and libfuzzer driver have that by default. If I add the call to LLVMFuzzerInitialize in standalone_fuzz_target_runner.cpp it reproduces also there.

https://github.com/yhirose/cpp-httplib/blob/fee8e97b4eeb34fe2e6e6294413d84e9e7a072a7/test/fuzzing/server_fuzzer.cc#L56

yhirose commented 2 years ago

@omjego, do you have any thought for the previous comment by @pietroborrello, since I am not familiar to the code very much. Thanks for taking your time.

@pietroborrello, could you tell me where LLVMFuzerInitialize should be called in standalone_fuzz_target_runner.cpp? I'll try the same. Thanks!

pietroborrello commented 2 years ago

I have added it as the first instruction inside the for loop, but I guess it could be also called just once at the beginning of the main function. It depends whether it must be run at the beginning of every testcase of just once, not sure about the semantics here.

// Copyright 2017 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");

// This runner does not do any fuzzing, but allows us to run the fuzz target
// on the test corpus or on a single file,
// e.g. the one that comes from a bug report.

#include <cassert>
#include <iostream>
#include <fstream>
#include <vector>

// Forward declare the "fuzz target" interface.
// We deliberately keep this inteface simple and header-free.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size);
extern "C" int LLVMFuzzerInitialize(int * /*argc*/, char *** /*argv*/);  // <-------------

// It reads all files passed as parameters and feeds their contents
// one by one into the fuzz target (LLVMFuzzerTestOneInput).
int main(int argc, char **argv) {
  for (int i = 1; i < argc; i++) {
    LLVMFuzzerInitialize(&argc, &argv);  // <-------------
    std::ifstream in(argv[i]);
    in.seekg(0, in.end);
    size_t length = static_cast<size_t>(in.tellg());
    in.seekg (0, in.beg);
    std::cout << "Reading " << length << " bytes from " << argv[i] << std::endl;
    // Allocate exactly length bytes so that we reliably catch buffer overflows.
    std::vector<char> bytes(length);
    in.read(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    LLVMFuzzerTestOneInput(reinterpret_cast<const uint8_t *>(bytes.data()),
                           bytes.size());
    std::cout << "Execution successful" << std::endl;
  }
  std::cout << "Execution finished" << std::endl;
  return 0;
}
yhirose commented 2 years ago

@pietroborrello, I added the call, but I still don't get the same result as yours... I probably should try on linux instead of macOS?

[master] ~/Projects/cpp-httplib/test$ make fuzz_test
clang++ -o standalone_fuzz_target_runner.o -I.. -O1 -std=c++11 -I. -Wall -Wextra -Wtype-limits -Wconversion -fsanitize=address '-fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr' -c fuzzing/standalone_fuzz_target_runner.cpp
clang++ -o server_fuzzer -I.. -O1 -std=c++11 -I. -Wall -Wextra -Wtype-limits -Wconversion -fsanitize=address '-fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr' fuzzing/server_fuzzer.cc -DCPPHTTPLIB_OPENSSL_SUPPORT -I/usr/local/opt/openssl@1.1/include -L/usr/local/opt/openssl@1.1/lib -lssl -lcrypto -DCPPHTTPLIB_ZLIB_SUPPORT -lz -DCPPHTTPLIB_BROTLI_SUPPORT -I/usr/local/opt/brotli/include -L/usr/local/opt/brotli/lib -lbrotlicommon -lbrotlienc -lbrotlidec standalone_fuzz_target_runner.o -pthread
ld: warning: dylib (/usr/local/opt/openssl@1.1/lib/libssl.dylib) was built for newer macOS version (12.0) than being linked (11.1)
ld: warning: dylib (/usr/local/opt/openssl@1.1/lib/libcrypto.dylib) was built for newer macOS version (12.0) than being linked (11.1)
./server_fuzzer fuzzing/corpus/*
server_fuzzer(50705,0x109a3f600) malloc: nano zone abandoned due to inability to preallocate reserved vm space.
Reading 32 bytes from fuzzing/corpus/1
Execution successful
Reading 159 bytes from fuzzing/corpus/2
Execution successful
Reading 54 bytes from fuzzing/corpus/3
Execution successful
Reading 1041826 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5372331946541056
Execution successful
Reading 787294 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5386708825800704
Execution successful
Reading 317 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5667822731132928
Execution successful
Reading 395 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5942767436562432
Execution successful
Reading 878956 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-6508706672541696
Execution successful
Reading 8611 bytes from fuzzing/corpus/issue1264
Execution successful
Execution finished
pietroborrello commented 2 years ago

Ah yes, I'm on ubuntu 18.04 I assumed the crash was OS insensitive but maybe is not the case

yhirose commented 2 years ago

@pietroborrello, since I don't have a linux desktop machine, I tested on ubuntu 20.04 on vagrant. But I am still not able to reproduce this problem. Can you reproduce the problem by running make fuzz_test command in test directory?

vagrant@ubuntu-focal:~/cpp-httplib/test$ make fuzz_test
clang++ -o standalone_fuzz_target_runner.o -I.. -O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr -c fuzzing/standalone_fuzz_target_runner.cpp
clang++ -o server_fuzzer -I.. -O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr fuzzing/server_fuzzer.cc -DCPPHTTPLIB_OPENSSL_SUPPORT -I/usr/local/opt/openssl@1.1/include -L/usr/local/opt/openssl@1.1/lib -lssl -lcrypto -DCPPHTTPLIB_ZLIB_SUPPORT -lz -DCPPHTTPLIB_BROTLI_SUPPORT -I/usr/local/opt/brotli/include -L/usr/local/opt/brotli/lib -lbrotlicommon -lbrotlienc -lbrotlidec standalone_fuzz_target_runner.o -pthread
./server_fuzzer fuzzing/corpus/*
Reading 32 bytes from fuzzing/corpus/1
Execution successful
Reading 159 bytes from fuzzing/corpus/2
Execution successful
Reading 54 bytes from fuzzing/corpus/3
Execution successful
Reading 1041826 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5372331946541056
Execution successful
Reading 787294 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5386708825800704
Execution successful
Reading 317 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5667822731132928
Execution successful
Reading 395 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-5942767436562432
Execution successful
Reading 878956 bytes from fuzzing/corpus/clusterfuzz-testcase-minimized-server_fuzzer-6508706672541696
Execution successful
Reading 8611 bytes from fuzzing/corpus/issue1264
Execution successful
Execution finished
pietroborrello commented 2 years ago

So the way I'm compiling is

$ CXX=clang++ CXXFLAGS='-O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr' LIB_FUZZING_ENGINE='./standalone_fuzz_target_runner.cpp' make -j8 server_fuzzer

Directly in the test/fuzzing folder. This produces the compilation command:

$ clang++ -O1 -fsanitize=address -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr -ggdb -O0 -std=c++11 -DGTEST_USE_OWN_TR1_TUPLE -I../.. -I. -Wall -Wextra -Wtype-limits -Wconversion -o server_fuzzer  server_fuzzer.cc  -DCPPHTTPLIB_ZLIB_SUPPORT -lz  ./standalone_fuzz_target_runner.cpp -pthread

Maybe the different variables that the Makefile sets in test prevent the bug to show up? I'll test it on Monday

pietroborrello commented 2 years ago

So the bug does not reproduce in the Makefile in test since the CXXFLAGS I sent contain -O1, which in the Makefile in test/fuzzing is overridden by the -O0. And the bug showing seems to be affected by the optimization level.

https://github.com/yhirose/cpp-httplib/blob/fee8e97b4eeb34fe2e6e6294413d84e9e7a072a7/test/fuzzing/Makefile#L4

While in the test Makefile this is not present. I am able to reproduce the bug using the Makefile in test, just by setting CXXFLAGS = -g -fsanitize=address -fsanitize=undefined -std=c++11 -I. -Wall -Wextra -Wtype-limits -Wconversion in the makefile here:

https://github.com/yhirose/cpp-httplib/blob/fee8e97b4eeb34fe2e6e6294413d84e9e7a072a7/test/Makefile#L2

pietroborrello commented 2 years ago

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164 seems to describe a similar issue, which basically means that in theory, it is unsafe to use std::regex with possibly unbounded inputs and wildcard matchings. The issue refers to sizes over 27Kb, but here it seems to be the case with a much smaller input. Maybe the fix would be to use a different regex match library or manually bounding the input to a safe size.

yhirose commented 2 years ago

@pietroborrello, thanks for the additional information. I am now able to reproduce it.

Maybe the fix would be to use a different regex match library or manually bounding the input to a safe size.

This library takes an advantage of the regular expression to parse and retrieve path components as below. So we cannot be away from it. Also using a different regex library isn't an option either, since it'll add an additional dependency. (I only accept such external library dependency for compression and SSL like zlib, brotli and openssl.

  svr.Get(R"(/numbers/(\d+))", [&](const Request& req, Response& res) {
    auto numbers = req.matches[1];
    res.set_content(numbers, "text/plain");
  });

It seems like this problem is caused by the following code in test/fuzzing/server_fuzzer.cc:

extern "C" int LLVMFuzzerInitialize(int * /*argc*/, char *** /*argv*/) {
  g_server.Get(R"(.*)",
               [&](const httplib::Request & /*req*/, httplib::Response &res) {
                 res.set_content("response content", "text/plain");
               });
  g_server.Post(R"(.*)",
                [&](const httplib::Request & /*req*/, httplib::Response &res) {
                  res.set_content("response content", "text/plain");
                });
  g_server.Put(R"(.*)",
               [&](const httplib::Request & /*req*/, httplib::Response &res) {
                 res.set_content("response content", "text/plain");
               });
  g_server.Patch(R"(.*)",
                 [&](const httplib::Request & /*req*/, httplib::Response &res) {
                   res.set_content("response content", "text/plain");
                 });
  g_server.Delete(
      R"(.*)", [&](const httplib::Request & /*req*/, httplib::Response &res) {
        res.set_content("response content", "text/plain");
      });
  g_server.Options(
      R"(.*)", [&](const httplib::Request & /*req*/, httplib::Response &res) {
        res.set_content("response content", "text/plain");
      });
  return 0;
}

The infinite repetition pattern .* looks causing the stack overflow. I'll take a more look at it to see how I can fix the problem reasonably. Or I may not be able to fix the problem, and leave users a responsibility to avoid such a problematic regex path string...

pietroborrello commented 2 years ago

Thank you for looking into that! Maybe detecting a possibly infinite pattern (and similar) in the regex and warning the user about that would be enough? This seems a configuration issue that may arise just due to handler settings. Not sure if the same pattern could be set in the code somewhere else?

yhirose commented 2 years ago

I am able to reproduce it on my machine with the following simple code:

// a.cpp
#include <regex>

int main(void) {
  auto s = std::string(1024 * 11, '?');
  auto re = std::regex(".*");
  std::smatch m;
  auto ret = std::regex_match(s, m, re);
}
clang++ a.cpp -fsanitize=address a.cpp && ./a.out
yhirose commented 2 years ago

@pietroborrello if you add-DCPPHTTPLIB_REQUEST_URI_MAX_LENGTH=4096, what will happen on your machine?

pietroborrello commented 2 years ago

I am able to reproduce it on my machine with the following simple code:

I confirm that I can reproduce it too with the minimized test

If you add-DCPPHTTPLIB_REQUEST_URI_MAX_LENGTH=4096, what will happen on your machine?

And I confirm that setting the limit to 4096 does not crash on my machine, but I guess this depends on the size of the stack of the system. But this seems pretty reliable, I had to go down to ulimit -s 64 (setting max stack size to 64Kb) to make it crash again with a stack overflow. So I guess if a system has at least 64Kb of stack size, it should be safe.