anilmaurya / fast_jsonparser

Fastest Json parser for Ruby, wrapper for simdjson
MIT License
307 stars 10 forks

Allow to configure load_many batch size #5

Closed casperisfine closed 4 years ago

casperisfine commented 4 years ago

When trying to parse a Ruby heap dump, my script crashes with:

libc++abi.dylib: terminating with uncaught exception of type simdjson::simdjson_error: This parser can't support a document that big
Abort trap: 6

load_many uses a buffer (load_many(const std::string &path, size_t batch_size = DEFAULT_BATCH_SIZE)) which defaults to 1 MB (10^6 bytes). Any document larger than that makes the parser crash.

So I changed load_many to accept a batch_size: parameter. I also had to make a number of other changes that I'll explain in comments.
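
For anyone wondering how to pick a value, here is a rough sketch of one way to size the batch from the input file itself. It assumes one JSON document per line (which is how Ruby heap dumps are laid out). The helpers `max_document_bytes` and `suggested_batch_size` and the `headroom:` factor are hypothetical names made up for this illustration, not part of fast_jsonparser. The `load_many` call is shown only as a comment, based on the batch_size: parameter described above.

```ruby
require "tempfile"

# Mirrors the 1 MB default mentioned above; defined locally for this sketch.
DEFAULT_BATCH_SIZE = 1_000_000

# Hypothetical helper (not part of fast_jsonparser): size in bytes of the
# longest line in the file, assuming one JSON document per line.
def max_document_bytes(path)
  max = 0
  File.foreach(path) { |line| max = line.bytesize if line.bytesize > max }
  max
end

# Hypothetical helper: the longest document times a headroom factor,
# never smaller than the default batch size.
def suggested_batch_size(path, headroom: 2)
  [max_document_bytes(path) * headroom, DEFAULT_BATCH_SIZE].max
end

# Usage against a small temporary file.
Tempfile.create("docs.ndjson") do |f|
  f.write(%({"a":1}\n{"b":"#{"x" * 50}"}\n))
  f.flush
  size = suggested_batch_size(f.path)
  # With the gem installed, the value would then be passed along the lines of:
  #   FastJsonparser.load_many(f.path, batch_size: size) { |doc| ... }
  puts size
end
```

Scanning the file first costs an extra pass over the data, but it avoids guessing a batch size that is either too small for the largest document or needlessly large.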

anilmaurya commented 4 years ago

Thank you for contributing. batch_size is a great addition 🎖

lemire commented 4 years ago

@casperisfine It crashes because the exception is not caught. Any chance the exception could be caught? (try/catch)

casperisfine commented 4 years ago

@lemire this PR catches it.

lemire commented 4 years ago

+1