rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0
5.89k stars 1.56k forks source link

Arrays should impl the bitwise operations #2773

Open canndrew opened 4 years ago

canndrew commented 4 years ago

I have a couple [u8; 8]s that I'd like to xor with each other. Unfortunately this doesn't work:

let x = [0u8; 8];
let y = [1u8; 8];
let z = x ^ y;

I think it should, since this has pretty clear semantics. More generally I think it would make sense to have an impl:

impl<const N: usize, A, B> BitXor<[B; N]> for [A; N]
where
    A: BitXor<B>,
{
    type Output = [<A as BitXor<B>>::Output; N];

    fn bitxor(self, rhs: [B; N]) -> Self::Output {
        ...
    }
}

And similar for BitAnd and BitOr.

ExpHP commented 4 years ago

Eh. One should then ask, "why bitwise ops but not + and *?" And I really don't want to see * implemented on arrays, as arrays are a common choice for mathematical vectors (where elementwise multiplication is a rare and unusual operation).

Arrays seldom show up in public APIs in Rust. I'd imagine that, most likely, if you are using arrays somewhere, you can easily substitute them out with wrapper types that implement the desired operations.

burdges commented 4 years ago

It's maybe reasonable because bitwise operations have clear semantics, while even primitive types make + confusing, and * gets even worse.

hellow554 commented 4 years ago

I think this is something a crate should offer, not the standard library, because as already said, mathematical operations are easy to misunderstand that they would really do.

canndrew commented 4 years ago

Yeah, I wouldn't advocate for + or * to be implemented on arrays - would it work element-wise? Would the results overflow to the next element (treating the whole array as one giant int)? Would multiplication be a dot product? Would it return a matrix of results? There's no single, obvious semantics here.

For bitwise operations though there is a single obvious semantics. Bitwise operations already work element-wise (which is why they're implemented for types like HashSet and BTreeSet). I can't imagine anyone getting confused by the semantics of this or expecting a different semantics.

canndrew commented 4 years ago

Or to put it another way: the bitwise operations aren't really mathematical operations. They're operations on collections of bools of the same shape.

burdges commented 4 years ago

Are there any bugs likely to arise from this when refactoring from a primitive to an array of primitives? I do not see any right now, but maybe someone else does?

scottmcm commented 4 years ago

If we were to have map and zip and such over arrays in the standard library, would that be sufficient for this? Then you could let z = x.zip_with(y, BitXor::bitxor);, and similar.

burdges commented 4 years ago

I'd think so..

impl<X, const N: usize> [X; N] {
    pub fn map<Y,F>(self, f: F) -> [Y; N]
    where F: FnMut(X) -> Y;

    pub fn zip<Y,F>(self, ys: [Y;N]) -> [(X,Y); N];

    pub fn zip_with<Y,Z,F>(self, f: F) -> [Z; N]
    where F: FnMut(X,Y) -> Z;
}

We'd maybe only want new methods of the form -> [??, N] right? I'm wondering if for works better than any in place methods, ala

impl<X, const N: usize> [X; N] {
    pub fn assign<F>(&mut self, ys: &[Y; N], f: F)
    where F: FnMut(&mut X, &Y);

    pub fn assign_mut<F>(&mut self, ys: &mut [Y; N], f: F)
    where F: FnMut(&mut X, &mut Y);
}
ExpHP commented 4 years ago

There should also be try_map.

(yes, this can be implemented for arrays, unlike for iterators, because the array operation is strict)

sivizius commented 4 years ago

On modern systems, e.g. amd64 with SSE

let x = [0u8; 16];
let y = [1u8; 16];
let z = x ^ y;

could be optimised as a single instruction, while implementing it by hand produces some overhead. IMHO a default-implementation of operands like + - & | ^ is a good idea, even if a generic implementation is not that sophisticated. Allowing some syntactic sugar for saturating arithmetic might be good too.

On the other hand multiplications on vectors/matrices in mathematics is different to just multiplying each element of both input-slices together which might result in confusion, but I guess this is the wrong RFC for suggesting new mathematical operands like cross and dot products. (^.^ )

But there is a generic but unstable SIMD-module, perhaps this might be useful someday:

let x = u8x16::splat(23u8);
let y = u8x16::splat(42u8);
let z = [0u8; 16];
(x ^ y).store_aligned(&mut z);
// or, if you want to do more of this stuff with z anyway
let z = x ^ y;

Using this SIMD-instructions a dedicated lib could provide this feature in an elegant way.

tesuji commented 4 years ago

Perhaps implement std::ops::BitXor for array of integers with xor methods explicitly.

ickk commented 3 years ago

Sorry to revive a seemingly dead rfc, but I hit this and it bothers me.

Frankly bit-wise operations should only ever operate bit wise on each bit corresponding to its respective bit; You don't need to worry about confusion regarding element-wise/cross-product/tensor-product, it's already in the name. bit-wise.

I'm not sure what relevance all the discussion about mathematics has anyway; note that when it comes to the bit-wise operations, |, ^, !, with Rust none of them do what common mathematics notation would dictate,

In fact even in the domain of boolean algebra a ^ looks far more like the symbol ∧ (AND) than ⊻ or ⩒ or ⊕ (XOR).

konsumlamm commented 2 years ago

@canndrew

For bitwise operations though there is a single obvious semantics. Bitwise operations already work element-wise (which is why they're implemented for types like HashSet and BTreeSet). I can't imagine anyone getting confused by the semantics of this or expecting a different semantics.

The bitwise operations for HashSet and BTreeSet do not apply the operations element-wise, they're used for intersection, union and symmetric difference. So, ironically, you got confused by the semantics of this yourself...

canndrew commented 2 years ago

@konsumlamm: Intersection, union and symmetric difference are &, | and ^ respectively on the bools representing whether an element of the type is included in the set.

A more formal way to think about this is that a HashSet<T> is a total map of the form T -> bool. By operating "element-wise" the bitwise operations operate on those bools. By analogy, a [T; N] is a total map of the form Fin(N) -> T (where Fin(N) is the type of numbers less than N). So by operating "element-wise" the bitwise operations would operate on those Ts.