miloyip / nativejson-benchmark

C/C++ JSON parser/generator benchmark
MIT License

Boost.JSON output for numbers with scientific format is incorrect #156

Open sicheste opened 1 year ago

sicheste commented 1 year ago

When lines 1139 and 1140 in main.cpp are enabled, the benchmark prints the differences it finds in the parsing tests. For Boost.JSON it prints, for example:

Expect: [5e-324]
Actual: [5E-324]

The input from roundtrip24.json is simply [5e-324].

Somewhere the formatting is broken. By default Boost.JSON would output e instead of E:

#include <boost/json/src.hpp>
#include <iostream>
#include <string>

int main(int, char**) {
  std::string json("[5e-342]");
  std::string result = boost::json::serialize(json);
  std::cout << "input: " << json << "\nresult: " << result << std::endl;
  return 0;
}

Output:

input: [5e-342]
result: "[5e-342]"

The exponent e is lower case here. I did not have enough time to search for the formatting settings in the application.

niXman commented 1 year ago

@sicheste

Expect: [5e-324] Actual: [5E-324]

is there any reason except aesthetic?

and why you are posting it here instead of the boost.json developer project?

sicheste commented 1 year ago

is there any reason except aesthetic?

Yes, the test application treats this as an invalid result. It does not lowercase the strings, or normalize them in any similar way, before comparing the result to the expected value.

and why you are posting it here instead of the boost.json developer project?

Because this formatting is set somewhere within the test application. As you can see in my minimal example above, it is not caused by Boost.JSON. Maybe it is set by another JSON library. I do not know why; I have not dug deeper. But it is not the default behaviour of Boost.JSON.

niXman commented 1 year ago

When enabling in main.cpp

where is that file?

sicheste commented 1 year ago

When enabling in main.cpp

where is that file?

https://github.com/miloyip/nativejson-benchmark/blob/master/src/main.cpp#L1139

niXman commented 1 year ago

but according to the JSON grammar: https://www.json.org/json-en.html, the exponent char may be one of e or E. so from a formal point of view everything is correct here.
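To make the grammar point concrete, here is a minimal sketch of a checker for the JSON number production (optional minus, integer part, optional fraction, optional exponent). The function name isJsonNumber is hypothetical and is not part of the benchmark or of Boost.JSON; it only illustrates that the exponent marker may be either e or E.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true if `s` is exactly one JSON number
// according to the grammar on json.org (no surrounding whitespace).
bool isJsonNumber(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();

    // Consumes a run of digits; reports whether at least one was present.
    auto digits = [&]() {
        const std::size_t start = i;
        while (i < n && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
        return i > start;
    };

    if (i < n && s[i] == '-') ++i;              // optional minus sign
    if (i < n && s[i] == '0') {
        ++i;                                    // a leading zero must stand alone
    } else if (!digits()) {
        return false;                           // integer part is mandatory
    }
    if (i < n && s[i] == '.') {                 // optional fraction
        ++i;
        if (!digits()) return false;
    }
    if (i < n && (s[i] == 'e' || s[i] == 'E')) { // exponent: either case is valid
        ++i;
        if (i < n && (s[i] == '+' || s[i] == '-')) ++i;
        if (!digits()) return false;
    }
    return i == n;                              // nothing may follow the number
}
```

Both isJsonNumber("5e-324") and isJsonNumber("5E-324") return true with this sketch, while malformed inputs such as "5e" or "01" are rejected.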

sicheste commented 1 year ago

But then this means that the comparison within the benchmark application is incorrect. So Boost.JSON is wrongly blamed for not fully following the standard.

I am totally fine with rephrasing the title of this issue. But currently Boost.JSON gives a result that seems correct, and the test application does not accept it as correct. If, according to the JSON specification, both e and E are allowed, then both should be accepted.
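One way a comparison could accept both spellings is to normalize the exponent marker before comparing. The following is only a sketch under assumptions: normalizeExponentCase and sameJsonText are hypothetical names, this is not how the benchmark currently compares results, and the input is assumed to be valid JSON already. It does not treat other equivalent spellings (for example 1e5 and 1e+5) as equal.

```cpp
#include <string>

// Hypothetical helper: lowercase the exponent marker of every number in a
// JSON text while copying string contents verbatim. In valid JSON, an 'E'
// outside a string can only be part of a number's exponent.
std::string normalizeExponentCase(const std::string& json) {
    std::string out;
    out.reserve(json.size());
    bool inString = false;
    for (std::size_t i = 0; i < json.size(); ++i) {
        const char c = json[i];
        if (inString) {
            out += c;
            if (c == '\\' && i + 1 < json.size()) {
                out += json[++i];      // copy the escaped character as-is
            } else if (c == '"') {
                inString = false;      // closing quote
            }
        } else if (c == '"') {
            inString = true;           // opening quote
            out += c;
        } else if (c == 'E') {
            out += 'e';                // exponent marker of a number
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical comparison that ignores only the case of the exponent marker.
bool sameJsonText(const std::string& expected, const std::string& actual) {
    return normalizeExponentCase(expected) == normalizeExponentCase(actual);
}
```

With this, sameJsonText("[5e-324]", "[5E-324]") is true, while texts that differ in any other character still compare as different.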

niXman commented 1 year ago

first - according to the docs, your example uses boost::json::serialize(string), which calls serialize_impl(), then reset(string *), and finally write_string(), which does nothing with doubles; it only escapes the string.

second - try changing your code so that it parses the string first and then serializes it back to a string, and show the result.

niXman commented 1 year ago

the E comes from here: https://github.com/boostorg/json/blob/master/include/boost/json/detail/ryu/impl/d2s.ipp#L612