Open laithsakka opened 1 year ago
This is because Java interprets the number of bits to shift by as num % size_of(type). i.e. because the input type is bigint, num >> 64 == num >> 0, num >> 65 == num >> 1, etc.
I agree it's a bug, thanks for raising. For anyone who wants to fix this, you do it in BitwiseFunctions.java
.
Hmm ok the documentation says:
For all three functions, the amount to shift is given by the bottom bits of the shift parameter, and higher bits of the shift parameter are ignored.
Not sure if we can get away with this behavior
Hmm ok the documentation says:
For all three functions, the amount to shift is given by the bottom bits of the shift parameter, and higher bits of the shift parameter are ignored.
Not sure if we can get away with this behavior
FYI, this documentation is for bitwise_right_shift_arithmetic and not for bitwise_arithmetic_shift_right those are different functions @kaikalur
i did not check bitwise_right_shift_arithmetic, it could have same issue as bitwise_arithmetic_shift_right function.
bitwise_right_shift_arithmetic
was added for compatibility with Trino. I think those can remain as is, as their intention was precise compatibility with Trino. We could consider moving such functions to an optional plugin instead of considering them to be built-in functions.
bitwise_arithmetic_shift_right
predates those functions and probably should be made to be consistent with bitwise_left_shift
and bitwise_logical_right_shift
which were added at the same time--the function argument is validated to be [0,64]. We should definitely update the documentation to make this much less confusing.
I see so to summarize: since i am trying to mimic this in Velox. two updates needed for bitwise_right_shift_arithmetic 1) if the input is not between 0, 64 throw error. 2) if the input is 64 then return 0 or -1 based on the input left most bit, instead of number >> shift.
Sorry, I made a mistake and looked at the bits
validation. The other functions I mentioned only validate that the shift
is non-negative.
We don't know who is relying on the present behavior of >64 bit shifts, but we do know it's likely wrong, and that pipeline disruptions are almost as bad as wrong results. So I propose:
1) Keep present behavior of validating that the shift
is non-negative.
2) Ensure that shift of >63 bits is clamped such that it's equivalent to 63 bits for arithmetic shift right, and 64 bits for logical shift right and shift left. We do not throw.
(More explicitly: arithmetic shift right returns -1 if shift is >=63 bits and the number is negative, 0 if shift is >=63 bits and the number is positive; logical shift right returns 0 if shift is >= 64 bits; shift left returns 0 if shift is >= 64 bits)
I believe this behavior is more intuitive than the current behavior, doesn't cause any new unexpected query failures, and fixes potential unexpected correctness issues.
Thoughts?
Sorry, I made a mistake and looked at the
bits
validation. The other functions I mentioned only validate that theshift
is non-negative.We don't know who is relying on the present behavior of >64 bit shifts, but we do know it's likely wrong, and that pipeline disruptions are almost as bad as wrong results. So I propose:
- Keep present behavior of validating that the
shift
is non-negative.- Ensure that shift of >63 bits is clamped such that it's equivalent to 63 bits for arithmetic shift right, and 64 bits for logical shift right and shift left. We do not throw. (More explicitly: arithmetic shift right returns -1 if shift is >=63 bits and the number is negative, 0 if shift is >=63 bits and the number is positive; logical shift right returns 0 if shift is >= 64 bits; shift left returns 0 if shift is >= 64 bits)
I believe this behavior is more intuitive than the current behavior, doesn't cause any new unexpected query failures, and fixes potential unexpected correctness issues.
Thoughts?
I think that make sense,
Sorry, I made a mistake and looked at the
bits
validation. The other functions I mentioned only validate that theshift
is non-negative.We don't know who is relying on the present behavior of >64 bit shifts, but we do know it's likely wrong, and that pipeline disruptions are almost as bad as wrong results. So I propose:
- Keep present behavior of validating that the
shift
is non-negative.- Ensure that shift of >63 bits is clamped such that it's equivalent to 63 bits for arithmetic shift right, and 64 bits for logical shift right and shift left. We do not throw. (More explicitly: arithmetic shift right returns -1 if shift is >=63 bits and the number is negative, 0 if shift is >=63 bits and the number is positive; logical shift right returns 0 if shift is >= 64 bits; shift left returns 0 if shift is >= 64 bits)
I believe this behavior is more intuitive than the current behavior, doesn't cause any new unexpected query failures, and fixes potential unexpected correctness issues.
Thoughts?
for (2) it could cause disruptions? are we ok with those?
I think this could be mitigated by a release note. CC: @kaikalur, what do you think?
I found a bug in presto bitwise_arithmetic_shift_right function when shifting negative big int to the right beyond 63 digits. -- -- Result -2, -1, -9223372036854775808 -- select bitwise_arithmetic_shift_right(cast ('-9223372036854775808' as bigint), cast (62 as bigint)) , -- bitwise_arithmetic_shift_right(cast ('-9223372036854775808' as bigint), cast (63 as bigint)),
-- bitwise_arithmetic_shift_right(cast ('-9223372036854775808' as bigint), cast (64 as bigint)) -- this should give -1 it give -9223372036854775808 should be -2, -1, -1 i tested it on int, tiny int it works fine the problem is just with bigint input https://fburl.com/daiquery/dn4id08d
similarly bitwise_arithmetic_shift_right(cast ('9223372036854775807' as bigint), cast (64 as bigint)) -- this should give 0 not 9223372036854775807