Closed tenderlove closed 6 months ago
This does look promising. I'm curious to see how we perform and take a look at the stats!
IIRC we tried pr-zlib in TruffleRuby a long time ago, and it was a lot slower than the C extension.
https://github.com/djberg96/pr-zlib/blob/main/lib/pr/rbzlib.rb looks like a fairly direct translation from C to Ruby, so I guess it's obvious but this is not typical Ruby code. It's probably not particularly optimized either. I think that code is not really representative of Ruby code in general and optimizing it probably has little effect on production/real-world Ruby code (because it's unlikely to be used instead of the C extension). It's just my opinion, please do as you wish.
It might be quite interesting to run it, compare and get some stats though.
Fair enough. The fact that the gem has so few downloads also makes it hard to justify specifically optimizing this code.
@eregon if you can think of other Pure-Ruby gems that would make nice benchmarks, we're open to suggestions.
I've also been meaning to ask you: in terms of binary file I/O, reading/writing different integer types, what is your preferred method to do that in Ruby, and is this something you've optimized in TruffleRuby?
I think Array#pack & String#unpack are the typical way to deal with binary data in Ruby. These are optimized as their own mini-language in TruffleRuby: specifically, we create small ASTs for the format strings and partially evaluate them. Chris talked about this in this video. BTW TruffleRuby does the same for Kernel#sprintf, which is in that regard very similar to Array#pack (but producing a textual instead of a binary representation).
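For readers less familiar with these methods, here is a minimal sketch of reading and writing several integer widths with Array#pack and String#unpack. The helper names (encode_ints, decode_ints, read_dimensions) and the INT_FORMAT constant are made up for illustration; only the pack directives themselves come from Ruby.

```ruby
require "stringio"

# Directives used below (all standard Array#pack / String#unpack codes):
#   C      = 8-bit unsigned
#   n / v  = 16-bit unsigned, big / little endian
#   N / V  = 32-bit unsigned, big / little endian
#   Q>     = 64-bit unsigned, big endian
INT_FORMAT = "CnvNVQ>"

# Hypothetical helper: serialize six integers into a binary String.
def encode_ints(values)
  values.pack(INT_FORMAT)
end

# Hypothetical helper: parse the same layout back into an Array of Integers.
def decode_ints(bytes)
  bytes.unpack(INT_FORMAT)
end

# Hypothetical helper: read two big-endian 32-bit integers from any IO.
def read_dimensions(io)
  io.read(8).unpack("NN")
end

values = [0x7f, 0x1234, 0x1234, 0xdeadbeef, 0xdeadbeef, 0x0102030405060708]
bytes = encode_ints(values)
p bytes.bytesize                 # 1 + 2 + 2 + 4 + 4 + 8 bytes
p decode_ints(bytes) == values   # the round trip preserves every value

width, height = read_dimensions(StringIO.new([640, 480].pack("NN")))
puts sprintf("%d x %d", width, height)
```

The format string plays the same role as the one passed to sprintf: it is a small program that describes the layout, which is why both lend themselves to the same kind of optimization.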
Looking at https://github.com/oracle/truffleruby/tree/master/bench:
- chunky_png uses quite a few pack&unpack. There is already https://github.com/Shopify/yjit-bench/blob/main/benchmarks/chunky-png/benchmark.rb but I'm not sure if that uses pack much or not, would be interesting to check. This example using chunky_png is using setbyte heavily from this profile. It'd be good to see if these setbyte are also part of the yjit-bench chunky_png benchmark.
- image-demo is a benchmark ported from a PyPy benchmark, it uses some .pack('C*'). It's quite interesting how much a JIT can optimize it, but probably not very representative of real Ruby code.
- optcarrot uses a few pack & unpack as well, but only if using actual audio or video. The SDL2 video driver uses FFI::Pointer#write_array_of_uint32 which seems also related to binary data (but that's probably implemented using a native extension for CRuby).
- psd.rb does some pack & unpack, not sure how perf-sensitive it is but I'd think psd.rb is very related to binary data handling and would be a good benchmark addition.
- rack seems to do some unpack("m*") and unpack("C*")
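To make the setbyte point concrete, here is a rough sketch of the two styles of filling a pixel buffer: byte by byte with String#setbyte, and in one call with Array#pack. This is not taken from chunky_png; the helper names are hypothetical and the layout (big-endian 32-bit pixels) is just an assumption for the example.

```ruby
# Hypothetical helper: build a buffer of `count` big-endian 32-bit pixels
# by writing each byte in place with String#setbyte.
def fill_pixels_setbyte(pixel, count)
  buffer = ("\0" * (count * 4)).b   # binary String, pre-sized with NUL bytes
  count.times do |i|
    offset = i * 4
    buffer.setbyte(offset,     (pixel >> 24) & 0xff)
    buffer.setbyte(offset + 1, (pixel >> 16) & 0xff)
    buffer.setbyte(offset + 2, (pixel >> 8) & 0xff)
    buffer.setbyte(offset + 3, pixel & 0xff)
  end
  buffer
end

# Hypothetical helper: produce the same bytes in one Array#pack call.
def fill_pixels_pack(pixel, count)
  Array.new(count, pixel).pack("N*")
end

p fill_pixels_setbyte(0x11223344, 2) == fill_pixels_pack(0x11223344, 2)
```

The setbyte version is a tight loop of small method calls, while the pack version is a single call driven by a format string, so the two probably stress a JIT quite differently.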
$ git grep -P '\b(un)?pack\b' bench/
bench/chunky_png/chunky-decode-png-image-pass.rb:56:pixel = [PIXEL].pack("N")
bench/chunky_png/chunky-decode-png-image-pass.rb:57:scan_line = [ChunkyPNG::FILTER_NONE].pack("c") + (pixel * WIDTH)
bench/chunky_png/chunky_png/lib/chunky_png/canvas/data_url_exporting.rb:11: ['data:image/png;base64,', to_blob].pack('A*m').gsub(/\n/, '')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/data_url_importing.rb:14: from_blob($1.unpack('m').first)
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_decoding.rb:255: stream.unpack("@#{pos + 1}N#{width}")
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_decoding.rb:263: stream.unpack("@#{pos + 1}n#{width * 4}").each_slice(4) do |r, g, b, a|
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_decoding.rb:274: stream.unpack("@#{pos + 1}" << ('NX' * width)).map { |c| c | 0x000000ff }
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_decoding.rb:282: stream.unpack("@#{pos + 1}n#{width * 3}").each_slice(3) do |r, g, b|
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_decoding.rb:300: stream.unpack("@#{pos + 1}n#{width * 2}").each_slice(2) do |g, a|
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_decoding.rb:347: values = stream.unpack("@#{pos + 1}n#{width}")
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:237: pixels.pack('x' + ('NX' * width))
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:244: pixels.pack("xN#{width}")
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:262: chars.pack('xC*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:276: chars.pack('xC*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:287: chars.pack('xC*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:294: pixels.map { |p| encoding_palette.index(p) }.pack("xC#{width}")
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:312: chars.pack('xC*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:326: chars.pack('xC*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:337: chars.pack('xC*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:344: pixels.map { |p| p >> 8 }.pack("xC#{width}")
bench/chunky_png/chunky_png/lib/chunky_png/canvas/png_encoding.rb:351: pixels.pack("xn#{width}")
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_exporting.rb:15: pixels.pack('N*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_exporting.rb:26: pixels.pack('NX' * pixels.length)
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_exporting.rb:33: pixels.pack('C*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_exporting.rb:43: pixels.pack('nX' * pixels.length)
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_exporting.rb:54: pixels.pack('V*')
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_importing.rb:22: pixels = string.unpack(unpacker).map { |color| color | 0x000000ff }
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_importing.rb:39: self.new(width, height, string.unpack("N*"))
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_importing.rb:56: pixels = string.unpack("@1" << ('XV' * (width * height))).map { |color| color | 0x000000ff }
bench/chunky_png/chunky_png/lib/chunky_png/canvas/stream_importing.rb:73: self.new(width, height, string.unpack("V*"))
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:22: length, type = io.read(8).unpack('Na4')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:24: crc = io.read(4).unpack('N').first
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:69: io << [content.length].pack('N') << type << content
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:70: io << [Zlib.crc32(content, Zlib.crc32(type))].pack('N')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:134: fields = content.unpack('NNC5')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:143: [width, height, depth, color, compression, filtering, interlace].pack('NNC5')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:198: content.unpack('C*')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:207: values = content.unpack('nnn').map { |c| ChunkyPNG::Canvas.send(:"decode_png_resample_#{bit_depth}bit_value", c) }
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:217: value = ChunkyPNG::Canvas.send(:"decode_png_resample_#{bit_depth}bit_value", content.unpack('n')[0])
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:258: keyword, value = content.unpack('Z*a*')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:267: [keyword, value].pack('Z*a*')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:287: keyword, compression, value = content.unpack('Z*Ca*')
bench/chunky_png/chunky_png/lib/chunky_png/chunk.rb:297: [keyword, ChunkyPNG::COMPRESSION_DEFAULT, Zlib::Deflate.deflate(value)].pack('Z*Ca*')
bench/chunky_png/chunky_png/lib/chunky_png/color.rb:141: rgb(*stream.unpack("@#{pos}C3"))
bench/chunky_png/chunky_png/lib/chunky_png/color.rb:151: rgba(*stream.unpack("@#{pos}C4"))
bench/chunky_png/chunky_png/lib/chunky_png/datastream.rb:14: SIGNATURE = ChunkyPNG.force_binary([137, 80, 78, 71, 13, 10, 26, 10].pack('C8'))
bench/chunky_png/chunky_png/lib/chunky_png/palette.rb:42: palatte_bytes = palette_chunk.content.unpack('C*')
bench/chunky_png/chunky_png/lib/chunky_png/palette.rb:44: alpha_channel = transparency_chunk.content.unpack('C*')
bench/chunky_png/chunky_png/lib/chunky_png/palette.rb:151: ChunkyPNG::Chunk::Transparency.new('tRNS', map { |c| ChunkyPNG::Color.a(c) }.pack('C*'))
bench/chunky_png/chunky_png/lib/chunky_png/palette.rb:172: ChunkyPNG::Chunk::Palette.new('PLTE', colors.pack('C*'))
bench/chunky_png/chunky_png/lib/chunky_png/rmagick.rb:39: image.import_pixels(0,0, canvas.width, canvas.height, 'RGBA', canvas.pixels.pack('N*'))
bench/image-demo/lib/io.rb:26: #@color_data = Array.new(img.width * img.height / 2, 127).pack("C*")
bench/image-demo/lib/noborder.rb:66: f.write @data.pack('C*')
bench/image-demo/lib/noborder.rb:97: f.write @data[(@width+1)...(-@width-1)].pack('C*')
bench/micro/array/pack.rb:11:benchmark 'core-array-pack-little-C' do
bench/micro/array/pack.rb:12: little_array_of_bytes.pack('C*')
bench/micro/array/pack.rb:17:benchmark 'core-array-pack-big-C' do
bench/micro/array/pack.rb:18: big_array_of_bytes.pack('C*')
bench/micro/string/equal.rb:13: Array.new(n, 'a'.ord).pack('C*')
bench/micro/string/index.rb:13: Array.new(n, 'a'.ord).pack('C*')
bench/optcarrot/lib/optcarrot/driver/ao_audio.rb:58: buff = output.pack(@pack_format)
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:15: @f << header.pack("A*vvC*")
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:18: @f << [0x21, 0xff, 0x0b, "NETSCAPE", "2.0", 0x03, 0x01, 0x00, 0x00].pack("C3A8A3CCvC")
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:21: @header = [0x21, 0xf9, 0x04, 0x00, 1, 255, 0x00].pack("C4vCC")
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:22: @header << [0x2c, 0, 0, WIDTH, HEIGHT, 0, 8].pack("Cv4C*")
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:27: @f << [0x3b].pack("C")
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:64: buff = [buff].pack("b*")
bench/optcarrot/lib/optcarrot/driver/gif_video.rb:66: buff = buff.gsub(/.{1,255}/m) { [$&.size].pack("C") + $& } + [0].pack("C")
bench/optcarrot/lib/optcarrot/driver/misc.rb:108: return width, height, pixels.write_bytes(dat.pack("V*"))
bench/optcarrot/lib/optcarrot/driver/png_video.rb:39: chunk("IHDR", [@width, @height, 8, 2, 0, 0, 0].pack("NNCCCCC")),
bench/optcarrot/lib/optcarrot/driver/png_video.rb:46: [data.bytesize, type, data, crc32(type + data)].pack("NA4A*N")
bench/optcarrot/lib/optcarrot/driver/png_video.rb:54: code = [0x78, 0x9c].pack("C2") # Zlib header (RFC 1950)
bench/optcarrot/lib/optcarrot/driver/png_video.rb:58: code << [data.empty? ? 1 : 0, s.size, ~s.size, *s].pack("CvvC*")
bench/optcarrot/lib/optcarrot/driver/png_video.rb:60: code << [b % ADLER_MOD, a % ADLER_MOD].pack("nn") # Adler-32 (RFC 1950)
bench/optcarrot/lib/optcarrot/driver/sdl2.rb:42: case pixels.read_bytes(4).unpack("C*")
bench/optcarrot/lib/optcarrot/driver/sdl2.rb:122: case pixels.read_bytes(2).unpack("C*")
bench/optcarrot/lib/optcarrot/driver/sdl2_audio.rb:38: buff = output.pack(@pack_format)
bench/optcarrot/lib/optcarrot/driver/sfml_audio.rb:24: @buff << output.pack("v*".freeze)
bench/optcarrot/lib/optcarrot/driver/wav_audio.rb:9: buff = @buff.pack(@pack_format)
bench/optcarrot/lib/optcarrot/driver/wav_audio.rb:13: ].pack("A4VA4A4VvvVVvvA4VA*")
bench/optcarrot/lib/optcarrot/nes.rb:61: puts "checksum: #{ @ppu.output_pixels.pack("C*").sum }" if @conf.print_video_checksum && @video.class == Video
bench/optcarrot/lib/optcarrot/rom.rb:16: sig, _, flags, comp, _, _, _, data_len, _, fn_len, ext_len = bin.slice!(0, 30).unpack("a4v5V3v2")
bench/optcarrot/lib/optcarrot/rom.rb:140: File.binwrite(sav, @wrk.pack("C*"))
bench/optcarrot/tools/shim.rb:77:unless [].respond_to?(:pack) && [33, 33].pack("C*") == "!!"
bench/optcarrot/tools/shim.rb:78: $stderr.puts "[shim] Array#pack"
bench/optcarrot/tools/shim.rb:80: alias pack_orig pack if [].respond_to?(:pack)
bench/optcarrot/tools/shim.rb:81: def pack(fmt)
bench/optcarrot/tools/shim.rb:146: if "".respond_to?(:unpack)
bench/optcarrot/tools/shim.rb:147: $stderr.puts "[shim] String#bytes (by using unpack)"
bench/optcarrot/tools/shim.rb:151: unpack("C*")
bench/optcarrot/tools/shim.rb:190: def unpack(fmt)
bench/psd.rb/psd.rb/lib/psd/file.rb:5: # All of the formats and their pack codes that we want to be able to convert into
bench/psd.rb/psd.rb/lib/psd/file.rb:44: read(info[:length]).unpack(info[:code])[0]
bench/psd.rb/psd.rb/lib/psd/file.rb:48: write [val].pack(info[:code])
bench/psd.rb/psd.rb/lib/psd/file.rb:55: read(1).unpack('c*')[0].to_f +
bench/psd.rb/psd.rb/lib/psd/file.rb:56: (read(3).unpack('B*')[0].to_i(2).to_f / (2 ** 24)).to_f # pre-decimal point
bench/psd.rb/psd.rb/lib/psd/file.rb:60: write [num.to_i].pack('C')
bench/psd.rb/psd.rb/lib/psd/file.rb:67: write [binary_numerator >> 16].pack('C')
bench/psd.rb/psd.rb/lib/psd/file.rb:68: write [binary_numerator >> 8].pack('C')
bench/psd.rb/psd.rb/lib/psd/file.rb:69: write [binary_numerator >> 0].pack('C')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:46: outfile.write @blending_ranges[:grey][:source][:black].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:47: outfile.write @blending_ranges[:grey][:source][:white].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:48: outfile.write @blending_ranges[:grey][:dest][:black].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:49: outfile.write @blending_ranges[:grey][:dest][:white].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:52: outfile.write @blending_ranges[:channels][i][:source][:black].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:53: outfile.write @blending_ranges[:channels][i][:source][:white].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:54: outfile.write @blending_ranges[:channels][i][:dest][:black].pack('CC')
bench/psd.rb/psd.rb/lib/psd/layer/blending_ranges.rb:55: outfile.write @blending_ranges[:channels][i][:dest][:white].pack('CC')
bench/sinatra/bundle/gems/rack-2.2.3/lib/rack/lobster.rb:11: I8jyiTlhTcYXkekJAzTyYN6E08A+dk8voBkAVTJQ==".delete("\n ").unpack("m*")[0])
bench/sinatra/bundle/gems/rack-2.2.3/lib/rack/utils.rb:380: l = a.unpack("C*")
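A few of the format strings in that output are less obvious than plain 'C*' or 'N*'. The sketch below isolates three of them ('NX', '@offset' and a leading 'x'); the helper names are hypothetical and the code is a simplified illustration, not a copy of what chunky_png does.

```ruby
# 'NX' packs a 32-bit big-endian integer, then backs up one byte, so the
# low byte of each value (alpha in an RGBA pixel) is dropped.
def rgba_to_rgb_bytes(pixels)
  pixels.pack("NX" * pixels.length)
end

# '@offset' jumps to an absolute byte offset before decoding. Here it skips
# a one-byte filter marker at `pos` and reads `width` big-endian 32-bit pixels.
def read_scanline(stream, pos, width)
  stream.unpack("@#{pos + 1}N#{width}")
end

# A leading 'x' emits a NUL byte (standing in for the filter byte),
# followed by the values as unsigned 8-bit integers.
def encode_scanline(bytes)
  bytes.pack("xC*")
end

p rgba_to_rgb_bytes([0x11223344, 0x55667788]).bytes

scan_line = [0].pack("c") + [0xaabbccdd, 0x01020304].pack("N*")
p read_scanline(scan_line, 0, 2)

p encode_scanline([1, 2, 3]).bytes
```

Note that several of these format strings are built at run time from the image width, so a benchmark exercising them would also cover the case where the format is not a constant.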
hexapdf might also do quite a bit of binary data handling.
https://github.com/Shopify/yjit-bench/blob/main/benchmarks/hexapdf/benchmark.rb seems to write a PDF but not read one, it might be interesting to read and/or transform a PDF too for more binary data handling.
One more thought is I think it'd probably make sense to add most/all of the classic benchmarks at https://github.com/oracle/truffleruby/tree/master/bench/classic. Many of them are already in yjit-bench. They are not really representative of typical Ruby code but I think they stress pretty fundamental things (e.g. polymorphic calls, recursion) so optimizing them is likely to affect real workloads too.
aobench is a small raytracer and it even renders it to a .ppm file, using sprintf("%c", byte) which is rather original but that's what it does (the original benchmark does printf and just outputs the image on stdout).
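For comparison, here is a small sketch of writing a binary PPM with Array#pack next to the per-byte sprintf("%c", byte) style. The helpers write_ppm and bytes_via_sprintf are hypothetical and not taken from aobench. Note that "%c" produces a character rather than a raw byte, so for values above 127 the result may depend on the string's encoding; the sprintf helper is only exercised with ASCII-range values here.

```ruby
require "stringio"

# Hypothetical helper: write a binary PPM (P6) image to any IO.
# `pixels` is a flat Array of 0..255 values in R, G, B order.
def write_ppm(io, width, height, pixels)
  io.write("P6\n#{width} #{height}\n255\n")
  io.write(pixels.pack("C*"))
end

# Hypothetical helper: the per-byte style, one sprintf call per value.
def bytes_via_sprintf(values)
  values.map { |byte| sprintf("%c", byte) }.join
end

io = StringIO.new(String.new)              # String.new is a binary buffer
write_ppm(io, 2, 1, [255, 0, 0, 0, 0, 255])
p io.string.bytesize                       # header bytes + 6 pixel bytes

p bytes_via_sprintf([65, 66, 67]) == [65, 66, 67].pack("C*")
```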
The others are fairly well-known "classic" benchmarks like richards, deltablue and the shootout benchmarks.
There are also the AWFY benchmarks https://github.com/smarr/are-we-fast-yet and notably this branch which uses Ruby Array & Hash instead of custom data structures (and so is closer to typical Ruby code). The paper has a pretty in-depth analysis of what each benchmark does (notably Figure 3).
Here's a link to the project. I think this might make a good target for binary file manipulation benchmarks.
I'll work on writing a benchmark and send a PR.