Closed billdueber closed 2 months ago
Can you try putting `Truffle::Ropes.flatten_rope(data)` on your data after you get it from gzip and before you use it?
Adding `previously_gzipped_data = Truffle::Ropes.flatten_rope(previously_gzipped_data)` does the trick (although it doesn't help me for the moment since the calls are locked up in a gem). Hopefully it'll help point you in the right direction.
Thank you for the great report, it looks like pure-Ruby JSON parsing is slower with the String from `Zlib::GzipReader`.
`Truffle::Ropes.flatten_rope()` seems a good hint that this is an issue with Ropes.
Of course, that's just helpful to debug; it should work fine without any explicit call (`Truffle::Ropes.flatten_rope` is a private debug-like API, just like everything else under `Truffle::`).
Maybe it has many `ConcatRope`s and the read accesses in JSON don't force materialization of the `byte[]`?
Oj likely uses `RSTRING_PTR()`, and that forces flattening the Rope.
@norswap Could you investigate?
@billdueber To make sure the pure-Ruby version of `json`, from stdlib, is used, could you show `gem list | grep json` on TruffleRuby? (it should be `json (default: 2.3.0)`)
If the `json` gem is installed, then it will be used as a C extension and preferred to the pure-Ruby version, which is unfortunately rather confusing, but those are the semantics of default gems.
It's simpler than that, actually:
```ruby
p Truffle::CExt.native_string?(previously_gzipped_data) # => true

forced_copy = previously_gzipped_data + " "
p Truffle::CExt.native_string?(forced_copy) # => false

json = JSON.parse(previously_gzipped_data)
p Truffle::CExt.native_string?(previously_gzipped_data) # => true
```
So the issue when we use `previously_gzipped_data` is that we use a NativeRope instead of a ManagedRope.
And I would expect that a String migrates from NativeRope to ManagedRope only once, but that doesn't seem to be the case, since it's still native even after `JSON.parse(previously_gzipped_data)`.
Most likely involved is this `force_encoding` in JSON:
https://github.com/oracle/truffleruby/blob/e020c89a8368ef2e1b996904f7becf65b375a47e/lib/json/lib/json/pure/parser.rb#L140
`previously_gzipped_data.encoding` is UTF-8 (maybe a bit surprising? But MRI behaves the same).
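To illustrate that point on plain CRuby (a minimal sketch; `StringIO` stands in for a real file): data read back through `Zlib::GzipReader` comes out tagged with the external encoding, typically UTF-8, rather than binary.

```ruby
require "zlib"
require "stringio"

# Gzip a small JSON payload in memory.
buf = StringIO.new
gz = Zlib::GzipWriter.new(buf)
gz.write('{"a":1}')
gz.close

# Read it back; the result is tagged with the external encoding, not binary.
data = Zlib::GzipReader.new(StringIO.new(buf.string)).read
p data           # => "{\"a\":1}"
p data.encoding  # typically #<Encoding:UTF-8> (follows Encoding.default_external)
```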
And `force_encoding` actually returns a NativeRope when the source is a NativeRope.
The `.encode` above the `force_encoding` also returns a different String, so the original String is never modified and stays native.
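The distinction matters because the two methods behave differently in plain Ruby semantics, independent of TruffleRuby internals: `encode` returns a converted copy, while `force_encoding` retags the receiver in place and returns `self`. A minimal illustration:

```ruby
s = "[1, 2, 3]".b  # binary-tagged String, standing in for decompressed data

converted = s.encode(Encoding::UTF_8)         # new String; s is untouched
retagged  = s.force_encoding(Encoding::UTF_8) # mutates s and returns s itself

p converted.equal?(s)  # => false
p retagged.equal?(s)   # => true
p s.encoding           # => #<Encoding:UTF-8>
```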
Then this String is passed to StringScanner, and `StringScanner#scan` is used, which will run Regexps against it.
Regexps use a `RopeNodes.BytesNode`, which for a NativeRope will allocate a `byte[]` and copy from native memory every time, which seems very likely to be the performance issue.
Maybe `BytesNode` should take a String and not a Rope, so it could transition the String from native to managed only once, and store the ManagedRope back in the String.
Also, `BytesNode` seems a good indication of a "read-heavy workload coming", so it seems a good place to do that.
JSON gem (almost) just as you expected -- I apparently have 2.1.0
```
✦ ❯ gem list | grep json
json (default: 2.1.0)
```
Right, that's expected for TruffleRuby 20.3.0, thanks for confirming.
Just pinging on this to see if it's on anyone's radar. I've updated the benchmarking code to explicitly assert that calling `Truffle::Debug.flatten_string` (née `Truffle::Ropes.flatten_rope`) addresses the issue (it does, as expected).
Removed Oj because it made the "x slower" comparison so hard to read, but it's about 10x faster than JSON and doesn't suffer from the gzip slowdown.
tl;dr is that on my hardware gzdata is about 30x slower on the smaller (array-of-100-arrays) json string, and about 400x slower on the larger json string.
```
truffleruby 22.3.0, like ruby 3.0.3, GraalVM CE Native [x86_64-darwin]
Base unit is an array of 20 integers
JSON-decode a string encoding an array of 100/1000 of those base units.
Warming up --------------------------------------
JSON 100 plain      19.000 i/100ms
JSON 100 gzdata      1.000 i/100ms
JSON 1000 plain      2.000 i/100ms
JSON 1000 gzdata     1.000 i/100ms
Calculating -------------------------------------
JSON 100 plain     722.230 (±15.2%) i/s -  7.030k in 10.011312s
JSON 100 gzdata     26.395 (±34.1%) i/s - 202.000 in 10.023596s
JSON 1000 plain     73.287 (±13.6%) i/s - 716.000 in 10.003276s
JSON 1000 gzdata     0.166 (± 0.0%) i/s -   2.000 in 12.601048s
Comparison:
JSON 100 plain:    722.2 i/s
JSON 1000 plain:    73.3 i/s -   9.85x (± 0.00) slower
JSON 100 gzdata:    26.4 i/s -  27.36x (± 0.00) slower
JSON 1000 gzdata:    0.2 i/s - 4355.93x (± 0.00) slower
```
Thank you, we should indeed look at this again, especially after the migration to TruffleString, which should help with this. Could you share your new benchmark file? In the gist it seems to be the older benchmark (e.g. using `oj` and not using `benchmark/ips`), but the newer benchmark seems much simpler and uses `benchmark/ips`, so I'd gladly use that to investigate. The older benchmark file might still be needed to generate the files, though.
I threw up another gist with a less-organically-written benchmark script that strictly compares one set of plain data vs one set of gzipped data. https://gist.github.com/billdueber/87b4f100c4a2d5d1c756470e09615857
Sample output:
```
✦ ❯ ruby slow_gzip_bench.rb 1000 10 10
truffleruby 22.3.0, like ruby 3.0.3, GraalVM CE Native [x86_64-darwin]
Base unit is an array of 20 integers
Rows: 1000; Warmup seconds: 10; run test for 10 seconds
Warming up --------------------------------------
plain     7.000 i/100ms
gzdata    1.000 i/100ms
Calculating -------------------------------------
plain    92.525 (±10.8%) i/s - 924.000 in 10.134733s
gzdata    0.176 (± 0.0%) i/s -   2.000 in 11.384485s
Comparison:
plain:   92.5 i/s
gzdata:   0.2 i/s - 526.66x (± 0.00) slower
```
Thank you, I can reproduce it with `truffleruby-dev`.
And adding:

```ruby
if IS_TRUFFLE
  FILEDATA["gzdata"] = Truffle::Debug.flatten_string(FILEDATA["gzdata"])
end
```

makes them both the same speed.
Just checking in again every year-and-a-half or so to see if there's any progress or a "won't-fix" on this. Still using the simple benchmark from https://gist.github.com/billdueber/87b4f100c4a2d5d1c756470e09615857
```
truffleruby 24.0.1, like ruby 3.2.2, Oracle GraalVM JVM [aarch64-darwin]
Base unit is an array of 20 integers
Rows: 300; Warmup seconds: 5; run test for 5 seconds
truffleruby 24.0.1, like ruby 3.2.2, Oracle GraalVM JVM [aarch64-darwin]
Warming up --------------------------------------
plain    96.000 i/100ms
gzdata    1.000 i/100ms
Calculating -------------------------------------
plain     1.174k (± 4.5%) i/s -  5.856k in 4.999764s
gzdata   15.517 (±12.9%) i/s - 77.000 in 5.026993s
Comparison:
plain:   1174.2 i/s
gzdata:    15.5 i/s - 75.67x slower
```
Thanks!
One workaround/solution for this is to use the json gem's native extension, e.g. with `gem install json` or by including `json` in the Gemfile.
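For the Gemfile route, a minimal sketch (hypothetical Gemfile; declaring the gem explicitly makes Bundler prefer the installed C-extension version over the pure-Ruby default gem):

```ruby
# Gemfile
source "https://rubygems.org"

gem "json"  # use the installed json gem (C extension) instead of the default gem
```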
I'm using the latest JVM EA build with `ruby-build truffleruby+graalvm-dev`.
```
$ ruby slow_gzip_bench.rb
Base unit is an array of 20 integers
Rows: 300; Warmup seconds: 5; run test for 5 seconds
truffleruby 24.2.0-dev-7484fe29, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
plain    29.000 i/100ms
gzdata    1.000 i/100ms
Calculating -------------------------------------
plain   505.279 (± 6.5%) i/s   (1.98 ms/i) - 2.523k in 5.032136s
gzdata    8.888 (±22.5%) i/s (112.51 ms/i) - 42.000 in 5.024406s
Comparison:
plain:  505.3 i/s
gzdata:   8.9 i/s - 56.85x slower
```
So this performance issue still exists, although these days it seems to be caused by the SwitchEncodingNode in MatchInRegionTRegexNode (vs BytesNode previously).
After `gem install json`:
```
$ ruby slow_gzip_bench.rb
Base unit is an array of 20 integers
Rows: 300; Warmup seconds: 5; run test for 5 seconds
truffleruby 24.2.0-dev-7484fe29, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
plain    10.000 i/100ms
gzdata   10.000 i/100ms
Calculating -------------------------------------
plain   107.540 (± 0.9%) i/s (9.30 ms/i) - 540.000 in 5.021937s
gzdata  106.373 (± 2.8%) i/s (9.40 ms/i) - 540.000 in 5.080027s
Comparison:
plain:  107.5 i/s
gzdata: 106.4 i/s - same-ish: difference falls within error
```
Same speed for plain & gzdata, but much slower than the pure-Ruby JSON.
But we can also use the new `--cexts-panama` option to run C extensions faster (without installing the json gem beforehand):
```
$ ruby --experimental-options --cexts-panama slow_gzip_bench.rb
Base unit is an array of 20 integers
Rows: 300; Warmup seconds: 5; run test for 5 seconds
truffleruby 24.2.0-dev-7484fe29, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
plain    31.000 i/100ms
gzdata   31.000 i/100ms
Calculating -------------------------------------
plain   319.096 (± 1.6%) i/s (3.13 ms/i) - 1.612k in 5.052888s
gzdata  321.968 (± 0.9%) i/s (3.11 ms/i) - 1.612k in 5.007179s
Comparison:
gzdata: 322.0 i/s
plain:  319.1 i/s - same-ish: difference falls within error
```
Same speed for plain & gzdata, and close to the pure-Ruby JSON.
In general we are working on improving JSON speed as it's been identified as significantly slower than it could be (cc @andrykonchin).
Pure-Ruby JSON and with a fix to avoid repeated native->managed conversions:
```
$ jt -u jvm-ee ruby slow_gzip_bench.rb
Base unit is an array of 20 integers
Rows: 300; Warmup seconds: 5; run test for 5 seconds
truffleruby 24.2.0-dev-001cccc2*, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
plain    17.000 i/100ms
gzdata   49.000 i/100ms
Calculating -------------------------------------
plain   488.617 (± 9.6%) i/s (2.05 ms/i) - 2.414k in 5.027094s
gzdata  484.828 (±10.9%) i/s (2.06 ms/i) - 2.401k in 5.066445s
Comparison:
plain:  488.6 i/s
gzdata: 484.8 i/s - same-ish: difference falls within error
```
So that solves this issue, will put up a PR for it.
(first issue opened here; if I'm off the mark or in the wrong spot, my apologies and please let me know)
JSON-parsing data read from a gzipped file with Zlib::GzipReader is very slow. Parsing the same byte-for-byte data read from a non-gzipped file is not slow. Parsing a copy of the data read from a gzipped file is also not slow.
This gist gives a small program that creates a bunch of arrays-of-arrays and dumps json output into two files, one plain and one gzipped. It then reads those data back in via `File.read` or `Zlib::GzipReader`, additionally making a copy of the formerly-gzipped data. Note that I'm not parsing every time -- I'm just getting a copy of the data (as seen in this snippet from the gist):
Oj doesn't show this same issue. I know benchmarking this stuff is treacherous, but the pattern seems pretty consistent.
My original data was just a file on disk as gzipped with `gzip`, so I don't think this has anything to do with `Zlib::GzipWriter` -- that's just to make it a self-contained bug reproduction. I'm assuming that `GzipReader` is hanging on to the string in some way that makes things nasty?

-Bill-
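For reference, a condensed sketch of the reproduction described above (hypothetical structure using `Tempfile`; the full benchmark with timings is in the linked gist):

```ruby
require "json"
require "zlib"
require "tempfile"

# Build a JSON string of arrays-of-arrays, as in the gist.
data = JSON.generate(Array.new(100) { Array.new(20) { rand(1000) } })

# Write it out plain and gzipped.
plain_file = Tempfile.new("plain")
plain_file.write(data)
plain_file.close
gz_file = Tempfile.new("gz")
gz_file.close
Zlib::GzipWriter.open(gz_file.path) { |w| w.write(data) }

# Read both back. The bytes are identical, but on TruffleRuby parsing
# `gzdata` was far slower than parsing `plain` or a forced copy of it.
plain  = File.read(plain_file.path)
gzdata = Zlib::GzipReader.open(gz_file.path, &:read)
copy   = gzdata + ""  # forced copy; parses at full speed on TruffleRuby

[plain, gzdata, copy].each { |s| JSON.parse(s) }
```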