oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
3.02k stars 185 forks source link

String#unpack with `Z` and `x` results in corrupted data #2659

Closed postmodern closed 2 years ago

postmodern commented 2 years ago

One of my specs which tests unpacking a repeating series of binary fields fails on truffleruby. I managed to narrow the bug down to using Z and x in a pack-string along with s and L after it. Changing the Z to an A or removing the x and the \x00 byte seems to work, implying there's some kind of interaction between the handling of Z and x.

Steps To Reproduce

"A\x00\xFF\xFF\xFF\xFF\xFF\xFF".unpack('Zxsl')

Expected Result

["A", -1, -1]

Actual Result

["A", -1, nil]

Versions

Additional Insights

If we add another x and another \x00 along with a L and \xFF\xFF\xFF\xFF we get this strange result:

"A\x00\xFF\xFF\xFF\xFF\xFF\xFF\x00\xFF\xFF\xFF\xFF".unpack('ZxslxL')
# => ["A", -1, 16777215, nil]

If we pack 16777215 as an unsigned 32bit integer (L) we get these bytes:

[16777215].pack('L')
# => "\xFF\xFF\xFF\x00"

This seems to imply there's an off-by-one error in the String#unpack logic caused by x.

postmodern commented 2 years ago

Just tested on truffleruby 22.1.0 and seeing the same behavior.

eregon commented 2 years ago

Thank you for the report.

@aardvark179 Can you take a look?

aardvark179 commented 2 years ago

I'll take a look.

eregon commented 2 years ago

It's quite weird .unpack('Z') doesn't consume the \0 on CRuby, even though it obviously reads it. Sounds very error-prone. It's easy to be compatible with CRuby for that, though.

aardvark179 commented 2 years ago

A fix for this has now been merged into master (b1dbba7fff2208a3eecd16eb653c086af8a19558).