Open srlch opened 2 days ago
:white_check_mark: pass : 0 / 0 (0%)
:white_check_mark: pass : 0 / 0 (0%)
:white_check_mark: pass : 28 / 33 (84.85%)
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | src/exec/json_parser.h | 0 | 1 | 00.00% | [77] |
| :large_blue_circle: | src/exec/json_parser.cpp | 28 | 32 | 87.50% | [59, 67, 86, 91] |
Why I'm doing:
In the current implementation, `JsonDocumentStreamParser` uses `simdjson::ondemand::parser::iterate_many` to parse multiple JSON documents. This API requires the caller to pass the maximum size of a single JSON document in the given file (call it `max_json_lenght_in_file`) so that it can allocate a memory chunk for the parsing process. The problem is that the caller passes the file size instead of `max_json_lenght_in_file`, so a huge memory chunk is allocated (much of which may never be used), almost 5 to 6 times the file size. This is a huge memory amplification.
What I'm doing:
Introduce `json_parse_many_batch_size` to control the batch size passed into `simdjson::ondemand::parser::iterate_many`. If `json_parse_many_batch_size > 0`, use `json_parse_many_batch_size` as the batch size; otherwise use `simdjson::dom::DEFAULT_BATCH_SIZE`. For `JsonDocumentStreamParser::get_current`, parse the document using a relatively small buffer. If an exception is thrown because the buffer is too small, increase the buffer size and retry.
Fixes #issue https://github.com/StarRocks/StarRocksTest/issues/8636
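For reviewers, the batch-size selection and the grow-and-retry behavior described above can be sketched roughly as follows. This is a minimal, library-free illustration and not the code in `src/exec/json_parser.cpp`: `kDefaultBatchSize`, `CapacityError`, `choose_batch_size`, `parse_one`, and `parse_with_retry` are hypothetical names, `parse_one` merely stands in for a simdjson parse that fails when the buffer is too small, and the doubling growth policy and the upper bound are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Stand-in for simdjson::dom::DEFAULT_BATCH_SIZE (assumed here to be 1,000,000).
constexpr std::size_t kDefaultBatchSize = 1'000'000;

// Hypothetical error type modeling "the buffer is too small for this document".
struct CapacityError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Use the configured batch size when it is positive, otherwise the default.
std::size_t choose_batch_size(long long json_parse_many_batch_size) {
    return json_parse_many_batch_size > 0
               ? static_cast<std::size_t>(json_parse_many_batch_size)
               : kDefaultBatchSize;
}

// Hypothetical stand-in for parsing one document with `capacity` bytes of buffer.
std::string_view parse_one(std::string_view doc, std::size_t capacity) {
    if (doc.size() > capacity) {
        throw CapacityError("buffer too small");
    }
    return doc;
}

// Start with a small buffer; on a capacity failure, double it (up to
// max_capacity) and retry. Rethrows once max_capacity has also failed.
std::string_view parse_with_retry(std::string_view doc,
                                  std::size_t initial_capacity,
                                  std::size_t max_capacity,
                                  int* attempts = nullptr) {
    std::size_t capacity = std::max<std::size_t>(initial_capacity, 1);
    int tries = 0;
    for (;;) {
        ++tries;
        if (attempts != nullptr) {
            *attempts = tries;
        }
        try {
            return parse_one(doc, capacity);
        } catch (const CapacityError&) {
            if (capacity >= max_capacity) {
                throw;  // nothing larger left to try
            }
            // Double without overflowing past max_capacity.
            capacity = (capacity > max_capacity / 2) ? max_capacity : capacity * 2;
        }
    }
}
```

With this shape, memory use tracks the largest document actually seen rather than the file size, at the cost of a few re-parses when a document exceeds the initial buffer.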
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: