wren-lang / wren

The Wren Programming Language. Wren is a small, fast, class-based concurrent scripting language.
http://wren.io
MIT License
6.86k stars 550 forks source link

Avoiding UB in bitwise shift operators #892

Open ChayimFriedman2 opened 3 years ago

ChayimFriedman2 commented 3 years ago

Currently, the code for the bitwise shift operators does not check the right value: https://github.com/wren-lang/wren/blob/81aff844159d7af4c49848184617e13f2ecb9d7b/src/vm/wren_core.c#L653-L667 Unfortunately, the C specification says that shifting by more than the number of the bits in the type is an Undefined Behavior. As a safe language, Wren cannot contain UB, so we must check that.

The problem is, what to do if we run out of bits count?

Both are valid, but I don't know which one to choose. This is the reason I'm opening this issue instead of just sending a PR.

Any opinion is appreciated.

benpop commented 3 years ago

Option 3: The Lua Model: any bit shift equal to or greater than the number of bits results in 0 - all of the bits getting shifted out. Ref: Lua manual 5.2, 5.3.

After reading the Lua docs, I have an additional question we should answer: What should we do if the operand is negative? Lua's behavior is to treat it as the opposite shift (1 >> -5 => 1 << 5, 6 << -2 => 6 >> 2).

PureFox48 commented 3 years ago

You may have noticed that in my recent attempt to document the bit-wise shift operators (#887) I dodged this question by stating what happened internally and not how this might manifest itself if the right operand were > 31 or negative.

I was originally thinking of saying something like this for the left shift operator (and mutatis mutandis for the right shift operator) but then thought better of it:

"If both numbers are initially positive integers and the magnitude of the second number is less than 32, this results in a 32-bit unsigned number where the bits of this are shifted other places to the left with the vacated places being filled with zeros. However, caution is needed in other cases as you may not then get the result you expect."

I agree with you that this is an unsatisfactory situation and that we ought to do something about it even if it will break code which relied on the particular behavior for a given platform/C compiler.

Apart from the options you've mentioned, a further one would be to just return NaN if other is not in the range [0, 31]. This is what other methods in the Num class tend to do when faced with an impossible or ambiguous situation.

ChayimFriedman2 commented 3 years ago

Apart from the options you've mentioned, a further one would be to just return NaN if other is not in the range [0, 31]. This is what other methods in the Num class tend to do when faced with an impossible or ambiguous situation.

Which methods? sqrt and such? Those returns NaN because this is what the mathematics says. Methods tend to return null if an operation cannot be performed (Num.fromString()).

PureFox48 commented 3 years ago

I was thinking in particular of the pow method which returns NaN if this is negative even though it's sometimes possible for such operations to be well-defined.

I'm not so keen on your option 1 as aborting the fiber is usually reserved for cases where other is not even a number.

Although superficially attractive, option 2 has the problem that it might silently give different answers for previously written code where the C compiler adopted a different approach to this particular kind of UB.

On the other hand, the use of NaN generally ensures (as a result of NaN propagation) that it's obvious if something is wrong or different.

mhermier commented 3 years ago

I have a troll question: You said "As a safe language, Wren cannot contain UB", where is it stated in the doc that it should be safe? Anyway, I think the sane way to handle this, is to go back to the basic and use division/multiplication by power of 2 and ensure that it is integer castable without loss, si overflow to right is 0 and to left is infinity.

ChayimFriedman2 commented 3 years ago

Anyway, I think the sane way to handle this, is to go back to the basic and use division/multiplication by power of 2 and ensure that it is integer castable without loss, si overflow to right is 0 and to left is infinity.

Overflow to right is 0 is understandable, but bitwise operators deal with integers and integers don't have infinity.

mhermier commented 3 years ago

Integers don't but there is no such thing as integers in wren anyway. So shifting doesn't have real sense in our case anyway, they are just provided as a convenience. Anyway this is just a case of how to deal with error, using a special value, or throwing.

ChayimFriedman2 commented 3 years ago

Personally, I like returning 0.

Let's collect all of the options listed above:

  1. Abort the fiber.
  2. Shift by modulus.
  3. Return 0, possibly reverse the shift when shifting by a negative number.
  4. Return null.
  5. Return Num.nan.
  6. Return Num.infinity.

@ruby0x1?

mhermier commented 3 years ago

Making it infinity, has the nice feature of propagating while not enforcing an error at another level in the computation, but the disaventage that it will might requires some masking prior shifting.

Making it 0 hides the problem, but is coherent with what CPU do.

Other solutions will highly probably result in throw due to computation usually involving a mask after shifting.

Allowing negative shift is another issue/feature.

ChayimFriedman2 commented 3 years ago

Allowing negative shift is another issue/feature.

I don't think so: the title is "Avoiding UB in bitwise shift operators" 😃.

Making it 0 hides the problem, but is coherent with what CPU do.

So what's the problem?

Aside from that, CPUs differ in what they will do in this situation.

ChayimFriedman2 commented 3 years ago

@mhermier

I have a troll question: You said "As a safe language, Wren cannot contain UB", where is it stated in the doc that it should be safe?

I found an answer! Not in the docs, but in the words of @munificent (https://github.com/wren-lang/wren/issues/533#issuecomment-384846004):

I don't think it's a good idea for Wren to be in the business of providing security guarantees beyond the basic "be safe from buffer overflows, etc.".

mhermier commented 3 years ago

There is no need to fix that UB then because it can't be the source of a VM crash.

ChayimFriedman2 commented 3 years ago

There is no need to fix that UB then because it can't be the source of a VM crash.

  1. It can. "Nasal demons".
  2. I don't remember that @munificent mentioned crashes.
mhermier commented 3 years ago

I miss typed what I meant. I understand the statement as "as long as the host does not die it is ok". So fact that the VM throws exceptions or does not compute correctly is ok has long has the host is not affected. This UB has few chances to makes the host die, otherwise it means that some parameters are not properly checked against overflow... So I think it is acceptable according to my understanding of @munificient words.

ChayimFriedman2 commented 3 years ago

I miss typed what I meant. I understand the statement as "as long as the host does not die it is ok". So fact that the VM throws exceptions or does not compute correctly is ok has long has the host is not affected. This UB has few chances to makes the host die, otherwise it means that some parameters are not properly checked against overflow... So I think it is acceptable according to my understanding of @munificient words.

"Security" is not just to not die. In fact, dying instead of UB is security.

The host is not mentioned at all (it is mentioned before an after, as the one that it its responsibility to protect against anything but "basic security guarantees".

And UB, just like UB, you can't predict. Will it affect the host? Maybe not. Even, probably not. But still, maybe yes. Either because of hardware or because of compiler or anything else. Relying on UB to not harm is the same as forbidding Wren in some computer/compiler (possibly unknown).

mhermier commented 3 years ago

Try to come with something, but I doubt the complexity it might add, justify what you try to fight against. I found odd that libc don't seems to provide an helper function to fight this, if this was so critical.

ChayimFriedman2 commented 3 years ago

I found odd that libc don't seems to provide an helper function to fight this, if this was so critical.

Usually you shift by a constant number, and so UB won't appear.

mhermier commented 3 years ago

Why would not it be the same case for wren... The probability the shift would come from a literal value is as high in wren as in other languages...

ChayimFriedman2 commented 3 years ago

DON'T TRUST USER INPUT.

mhermier commented 3 years ago

If we really want to go that route, maybe we should wrap that behavior in a function in wren_math.h to something like:

int wren_shift(int lhs, ssize_t rhs)
{
  if (rhs > 0)
  {
    return rhs < 32 ? lhs >> rhs : 0;
  }
  else
  {
    return rhs > -32 ? lhs << -rhs : 0;
  }
}

But dealing with that is a huge can of worm, and every behavior can be justified to some extend, and dealing with every type in existence, in particular signed types, is a new story.

PureFox48 commented 3 years ago

A question we don't seem to have asked so far in this issue is what happens at present when UB is encountered?

I think it's important to have some context here as it would be better not to break too much existing code if we can avoid it.

The answer for Ubuntu 20.04, gcc 9.3.0 (currently my only functioning system) is as follows:

var n = 1
var list = [n << 32, n << 33, n << -1, n << -2, n >> 32, n >> 33, n >> -1, n >> -2]
for (e in list) System.print(e) 

/*
n << 32 = 1
n << 33 = 2
n << -1 = 2147483648
n << -2 = 1073741824

n >> 32 = 1
n >> 33 = 0
n >> -1 = 0
n >> -2 = 0
*/

So it appears that left shift operations are wrapping modulo 32.

It would be interesting to see results for other platforms/compilers.

avivbeeri commented 3 years ago

A question we don't seem to have asked so far in this issue is what happens at present when UB is encountered?

The reason we aren't asking that question is that UB is by definition undefined, and so it can't be consistently relied upon for any application on any platform.

mhermier commented 3 years ago

Read some websites and it is CPU/ISA dependent. x86 follow that way, but it not true on other architecture, depends completely on the implementation of the assembly shift opcodes or what the compiler decided to use/emulate.

Some language let the issue unresolved, some requires to follow the predominant architecture x86, or any other strategy.

PureFox48 commented 3 years ago

Although I agree with everything that has just been said, I still think it would be useful to have some idea of current behavior as I suspect that there will be people who have relied on it without realizing that it was 'undefined' in the C spec.

mhermier commented 3 years ago

I don't think that in practice it does really matter. If you are using them you are probably doing some optimizations or use them in a controlled environment, like static variable definition. The chance you rely on the UB is low, and even if you do, it should be in very few localized place easy to fix else your code need refactoring because shifting to some magical numbers all around the code is really a code smell.

ChayimFriedman2 commented 3 years ago

I would like to extend: UB is not just unspecified result, it's also undefined behavior.

Although less likely in this case, compilers can optimize upon UB which makes the entire program wrong, or even doing something bad. A good series of articles in this context is https://blog.regehr.org/archives/213.

UB can also propagate up (not just down). A nice example of this is with Clang or GCC (not testes other compilers), which panic at the beginning of the program if it can prove that it would definitely execute some UB: https://godbolt.org/z/9eKozs.