Closed MichaelChirico closed 4 years ago
The potential kibosh for this would be encoding issues, though `rawToChar` would not work for that case either, so users with encoding issues can run `iconv` themselves, I guess?
Still working on adapting my code to use RcppSimdJson & benchmarking, so another musing for the day:
I think a major choke point of my current code is the regular `gc()`s that are happening, which skipping `rawToChar` could potentially avert. IINM I am running `rawToChar` on a huge string for every batch from `GET`, then parsing out my data & "discarding" the rather large strings (consisting of JSON objects with maybe 100s of rows and/or columns), which are now in the session's string cache (since `rawToChar` will do `mkChar`).
It's also something to keep in mind for benchmarking: unless this phenomenon is captured, the benefit of dropping `rawToChar` might be understated.
Done in #36
As identified in this Twitter thread:
https://twitter.com/michael_chirico/status/1280656819606548480
See this Gist:
https://gist.github.com/MichaelChirico/f5e09ab9f5f437bb0286e8a42941a3e1
The performance of `fparse` is already damn impressive, but let's see if we can't do a mite better 😎
JSON as raw can be retrieved like so:
IINM from C++ POV this `raw` vector should just be a subset of a `character` vector...
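To make the idea concrete, here is a rough, standard-library-only sketch of what "parse the bytes directly" could look like on the C++ side. It is not RcppSimdJson's or simdjson's actual code: `as_text`, `copy_to_string`, and `count_byte` are hypothetical names made up for illustration, and the sketch assumes the bytes are already valid UTF-8 (no encoding check, so the `iconv` caveat still applies).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: view a raw byte buffer as text WITHOUT copying it.
// Assumes the bytes are already valid UTF-8/ASCII; nothing is validated here.
// The view is only valid while `raw` is alive and unmodified.
inline std::string_view as_text(const std::vector<std::uint8_t>& raw) {
    return std::string_view(reinterpret_cast<const char*>(raw.data()), raw.size());
}

// Hypothetical helper: the copying route, loosely analogous to first turning
// the bytes into a string object before handing them to a parser.
inline std::string copy_to_string(const std::vector<std::uint8_t>& raw) {
    return std::string(raw.begin(), raw.end());
}

// Toy stand-in for "a parser that only needs to read bytes":
// counts how often `byte` occurs in the input.
inline std::size_t count_byte(std::string_view text, char byte) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == byte) ++n;
    }
    return n;
}
```

With a buffer holding `{"a":1}`, `as_text(raw).data()` points at the same memory as `raw.data()`, whereas `copy_to_string(raw)` allocates a second copy of the payload. That second copy is the cost that skipping `rawToChar` would hopefully avoid.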