Closed DUOLabs333 closed 11 months ago
Even without using this library, I would not expect to be able to have a function that recurses 100,000 times, unless I have a massively huge stack. If you optimize this, you might get away with it due to tail call optimization converting it from recursive to iterative.
If I remove the json argument, then it works as expected --- I can finish the entire execution. In fact, as long as I don't use json in the function, then it works. This means that something is happening when json is being used that causes this (this happens even if I'm just extracting a value from the json, like std::string a=name["glossary"]["title"];
.
You pass by value - this means you create thousands of copies of the json value. Can you try to pass by reference?
I already tried that --- same problem, just happens later.
I see the same behavior with
#include <string>
void print_json(std::string name,int n){
if (n<=0){
return;
}
printf("%s\n",name.c_str());
print_json(name, n-1);
}
int main(int argc, char** argv){
std::string testing = R"({"glossary":{"title":"example glossary","GlossDiv":{"title":"S","GlossList":{"GlossEntry":{"ID":"SGML","SortAs":"SGML","GlossTerm":"Standard Generalized Markup Language","Acronym":"SGML","Abbrev":"ISO 8879:1986","GlossDef":{"para":"A meta-markup language, used to create markup languages such as DocBook.","GlossSeeAlso":["GML","XML"]},"GlossSee":"markup"}}}}})";
print_json(testing,100000);
return 0;
}
As @gregmarr mentioned, you are testing against the stack limits, not against the library.
That's weird --- If I do that (and use references), the program completes just fine, it's just that references with json is not enough to complete the program.
By adding a local variable with a destructor, you are defeating the tail call optimization.
void print_json(std::string const &name,int n){
if (n<=0){
return;
}
std::string copy = name; // If you comment this out, it will succeed
print_json(name,n-1);
}
int main(int argc, char** argv){
std::string testing = R"({"glossary":{"title":"example glossary","GlossDiv":{"title":"S","GlossList":{"GlossEntry":{"ID":"SGML","SortAs":"SGML","GlossTerm":"Standard Generalized Markup Language","Acronym":"SGML","Abbrev":"ISO 8879:1986","GlossDef":{"para":"A meta-markup language, used to create markup languages such as DocBook.","GlossSeeAlso":["GML","XML"]},"GlossSee":"markup"}}}}})";
print_json(testing,100000);
return 0;
}
I understand that, but replacing json
reference with a std::string
reference causes the program to terminate (in both cases, there are no local variables).
However, using -O3
causes this reproducer to succeed. Unfortunately, my actual function is not tail recursive, so the compiler can't optimize for it.
The local variable is here, it's a temporary, but it does appear to be affecting the optimization:
printf("%s\n",name.dump(-1).c_str());
^^^^^^^^^^^^^
Ah, that's why std::string::c_str
works --- it's a pointer, which doesn't need to be deallocated.
Since my function is not tail-recursive, it doesn't seem to apply. However, in my function, I'm not dealing with 10000 iterations, but at most 15, so there's something else going on (I also get a bus error). Given that references alleviate some of the problem tells me that something probably went awry. But I don't think this is due to the library at this point, so I'll close this issue.
A problem though: when I debug applications, I need to get the json string so I can externally parse it --- however, as you can see, if I'm in heavily nested frame, and I try to dump()
a large json object, I get a stack overflow. Is there a way to get the raw char* directly, to avoid making a local variable?
There is no string form of a large object until you generate it using dump()
, and that form is not stored, only returned to you.
When I call test["a"]["b"]["c"]
on a json& test, are new objects created/copied for each of the keys? If so, that could explain the overflow.
test["a"] returns a reference, which you then access by calling ["b"], which returns a reference, which you then access by calling ["c"], which returns a reference. This reference is either const or non-const depending on the const-ness of test. If test is non-const, then it can create new sub-objects, but that's all heap-based. The references may or may not be stack objects, or may just be stored in registers, depending on the optimization, but if they are stack objects, then they are pointer/reference sized.
Description
Exactly what it says in the title --- giving more stack space does not solve the problem. However, if I use references or pointers, the stack could grow more before crashing (but not by much --- roughly a 100 more iterations).
Reproduction steps
Reproducer:
This gives a segmentation fault, but the code I'm using (which is also recursive) gives me a bus error.
Expected vs. actual results
Expected: To be able to run all iterations without crashing.
Minimal code example
No response
Error messages
No response
Compiler and operating system
clang, MacOS 11
Library version
HEAD
Validation
develop
branch is used.