nim-lang / bigints

BigInts for Nim
MIT License
124 stars 32 forks source link

Bitwise operators for negative numbers #51

Closed konsumlamm closed 2 years ago

konsumlamm commented 2 years ago

Currently, the bitwise operators (and, or, xor) only work for non-negative numbers. I suggest also supporting negative numbers, treating them as if they were represented using 2's complement (the standard representation for signed integers). This would also allow implementing not x, by simply doing -x - 1.

dlesnoff commented 2 years ago

This would also allow implementing not x, by simply doing -x - 1.

This sentence is confusing, we do not need and, or nor xor between negative numbers to implement not x.

I suggest also supporting negative numbers, treating them as if they were represented using 2's complement

I am not sure how to interpret the BigInt 0xFFFF. Is it -1 or is it 2^32 ? 2's complement representation depends on a limit for the type to be represented, how do we fix the limit ? By the length of the smallest of the bitwise operands, or by the length of the greatest of the operands ? How do we take the sign of the BigInt into account ? The issue does not explains this.

I suggest to use the length of the greatest of the operands : Let us say, we have one BigInt a of length 5 and one BigInt b of length 3. Then I could try to find the 2's complement representation using 32*5 bits for each with one sign bit (even if the BigInt b is of length 3, I can always add 2 limbs). There is a problem if the MSB of a is 1, we need one more bit to have a 2's complement representation.

I'll try to give another example with bits, that will be clearer. Let us say, BigInts are represented this time with limbs of 8 bits plus a sign flag. If I want to compute +0x9 and -0x3, I would need for each of them only one bit to represent them as BigInt, but if I check the most significant bit of the first one 0x9, since it is set to 1, I can not find a 2's complement representation for him with only 8 bits. So I will convert +0x9 to 0x09 (using two limbs) and -0x3 to 0xFC (using two limbs). The and is now computable and gives 0x8. So actually there are 4 cases :

We need to limit the cost of the algorithm, and not make a copy (for padding) of a small BigInt (-1 for example) into a big BigInt when the other input is a BigInt with 250 limbs or more.

konsumlamm commented 2 years ago

This would also allow implementing not x, by simply doing -x - 1.

This sentence is confusing, we do not need and, or nor xor between negative numbers to implement not x.

"This" was referring to "treating negative numbers as if they were represented using 2's complement". I admit that my wording can be confusing, and, or and xor are of course not needed to implement not

I am not sure how to interpret the BigInt 0xFFFF. Is it -1 or is it 2^32 ?

It would be 2^32-1.

2's complement representation depends on a limit for the type to be represented, how do we fix the limit ? By the length of the smallest of the bitwise operands, or by the length of the greatest of the operands ? How do we take the sign of the BigInt into account ? The issue does not explains this.

We don't fix the limit. My suggestion is not that we represent the BigInts in 2's complement, we just act as if they were. They would still be represented as a sign and the absolute value, as is the case today (this is also the representation that most other implementations choose afaik). A negative number would conceptually have an infinite number of leading 1s.

Let us say, BigInts are represented this time with limbs of 8 bits plus a sign flag. If I want to compute +0x9 and -0x3, I would need for each of them only one bit to represent them as BigInt, but if I check the most significant bit of the first one 0x9, since it is set to 1, I can not find a 2's complement representation for him with only 8 bits.

9 is 00001001 in binary using 8 bits, so the MSB would be 0 and not 1. I assume you meant 4 bits? Either way, why would you need to find its 2's complement?

So I will convert +0x9 to 0x09 (using two limbs) and -0x3 to 0xFC (using two limbs). The and is now computable and gives 0x8. So actually there are 4 cases :

My suggestion doesn't require computing any 2's complements. On first thought, it could be implemented by subtracting 1 (to get not x) for negative numbers (that would at most require one additional limb, although that case could possibly be optimized) and treating the limbs differently. Using your example (assuming 4 bit limbs), this would work as follows:

So the result is 9, which is also what you get with normal ints.

We need to limit the cost of the algorithm, and not make a copy (for padding) of a small BigInt (-1 for example) into a big BigInt when the other input is a BigInt with 250 limbs or more.

That wouldn't be necessary. For additional limbs, negative numbers would be treated as having all 1s.

demotomohiro commented 2 years ago

Why is bitwise operators for negative numbers needed? Is there any use case?

And why treating them as if they were represented using 2's complement? Is it just because the standard representation for signed integers in fixed size int? Just do bitwise operation as is doesn't work?

konsumlamm commented 2 years ago

Why is bitwise operators for negative numbers needed? Is there any use case?

Any use case of normal bitwise operators that involves not, really, e.g. you can do x and (not 0b01010101'bi) to only keep the 0th, 2nd, 4th and 6th bit of x.

And why treating them as if they were represented using 2's complement? Is it just because the standard representation for signed integers in fixed size int?

Because it's how fixed size integers are represented and it's how at least I would expect it to work. This is also what all the other bigint libraries (that I'm aware of) do.

Just do bitwise operation as is doesn't work?

Sorry, I don't know what you mean by this.

demotomohiro commented 2 years ago

Why is bitwise operators for negative numbers needed? Is there any use case?

Any use case of normal bitwise operators that involves not, really, e.g. you can do x and (not 0b01010101'bi) to only keep the 0th, 2nd, 4th and 6th bit of x.

I read https://github.com/nim-lang/bigints/pull/111, and it seems treating negative numbers as if they were represented using 2's complement requires allocating temporary BigInt and copying arguments. This andNot function can do x and (not 0b01010101'bi) without allocating temporary BitInt.

func bitwiseAndNot(a: var BigInt, b, c: BigInt) =
  a.limbs.setLen(b.limbs.len)
  let m = min(b.limbs.len, c.limbs.len)
  for i in 0 ..< m:
    a.limbs[i] = b.limbs[i] and (not c.limbs[i])

  if b.limbs.len > c.limbs.len:
    for i in m ..< b.limbs.len:
      a.limbs[i] = b.limbs[i]

func andNot*(a, b: BigInt): BigInt =
  ## Bitwise `a and (not b)`
  assert (not a.isNegative) and (not b.isNegative)
  bitwiseAndNot(result, a, b)
  normalize(result)

Just do bitwise operation as is doesn't work?

Sorry, I don't know what you mean by this.

I thought about doing bitwise operation to negative numbers in the same way as positive numbers. But it doesn't work for your case.

dlesnoff commented 2 years ago

we are not going to write every pair of operators just to avoid temporary allocation : andNot, notAnd, andXor, etc ... This shows that it is possible to avoid allocation for this special case, but can you do it generically ?

demotomohiro commented 2 years ago

I think using ones' complement instead of two's complement can avoid allocation. Ones' complement just inverts bits when it is negative. When negative number is not directly used for bit operation, result of ones' complement and two's complement are same. But when negative number is used, result of them is not same.

For example:

import bigints

let a = 0b001111'bi
echo a and not (0b010101'bi)
echo a and not (0b101010'bi)
echo a and -1'bi
echo a and -2'bi
echo a and -3'bi

Output when two's complement is used:

10
5
15
14
13

Output when ones' complement is used:

10
5
14
13
12

But GMP uses two's complement for bit operations. Ones' complement might be counterintuitive for users as almost all modern CPU uses two's complement.

konsumlamm commented 2 years ago

I think using ones' complement instead of two's complement can avoid allocation.

Sure, you could, but that would be inconsistent with how normal integers work. Another problem would be that "all 1s" is then "-0", which is currently handled as being equal to 0 (which is "all 0s").

But GMP uses two's complement for bit operations. Ones' complement might be counterintuitive for users as almost all modern CPU uses two's complement.

Not only GMP, but also the standard bigint implementations for Rust, Java, Python, D, Go, etc.

dlesnoff commented 2 years ago

Please make a distinction between the result we want and the implementation we use. According to the gmp manual : « functions behave as if twos complement arithmetic were used (although sign-magnitude is the actual implementation). » See https://gmplib.org/manual/Integer-Logic-and-Bit-Fiddling https://gmplib.org/repo/gmp-6.2/file/tip/mpz/xor.c https://gmplib.org/repo/gmp-6.2/file/tip/mpz/ior.c https://gmplib.org/repo/gmp-6.2/file/tip/mpz/neg.c You can especially look at the comments of the and function : https://gmplib.org/repo/gmp-6.2/file/tip/mpz/and.c

If we can use another representation to avoid allocation while outputting the correct result, at least for certain cases, why not ? If not, let us just validate this PR.