ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

proposal: do not pack SIMD vectors #18652

Open Snektron opened 7 months ago

Snektron commented 7 months ago

Currently, the elements of @Vector(N, T) are packed together. For example:

export fn square() u32 {
    return @bitSizeOf(@Vector(11, u3));
}

returns 33. This seems counterintuitive to me for several reasons:

In my opinion, vectors should essentially be a bag of scalars that you want to perform the same operation on, and not provide any layout guarantees at all. This would enable compilers to lower @Vector(11, u3) to @Vector(11, u8), and omit these expensive shifts (and a whole lot of headaches).

Important edge cases here are @Vector(N, bool) and @Vector(N, u1). I suspect the reason the above are packed at all is to provide the guarantee that those are backed by an integer (and that all operations on them are bitwise). This makes sense to me, and I don't think that we should remove it. I see three main paths forward:

In all cases, I think we only need to remove the capability to bitcast between vectors and integers.

andrewrk commented 7 months ago

@bitSizeOf reports how many bits of logical information there are. For example, @bitSizeOf(i9) returns 9, while an i9 actually takes up 2 bytes (16 bits) of memory in its representation. I can still make sense of your initial point, however, if we look at @sizeOf instead and make the same observation.
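The distinction can be checked directly; a minimal sketch (the expected values follow from i9's 9 bits of information being stored in a 2-byte ABI size):

```zig
const std = @import("std");

test "logical bits vs memory size" {
    // @bitSizeOf reports bits of logical information...
    try std.testing.expectEqual(9, @bitSizeOf(i9));
    // ...while @sizeOf reports the in-memory (ABI) size in bytes.
    try std.testing.expectEqual(2, @sizeOf(i9));
}
```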

In my opinion, vectors should essentially be a bag of scalars that you want to perform the same operation on, and not provide any layout guarantees at all. This would enable compilers to lower @Vector(11, u3) to @Vector(11, u8), and omit these expensive shifts (and a whole lot of headaches).

This is status quo.

Add no such guarantee at all (leave it up to the backend), and provide utilities (functions or built-ins) to convert between uN and vectors.

Again this is status quo.

In all cases, I think we only need to remove the capability to bitcast between vectors and integers.

But that's the utility to convert between uN and vectors that you mentioned above.
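Concretely, @bitCast is that utility today; a minimal sketch of the round trip for a bool vector (lane order aside, an all-true vector packs to all-ones bits):

```zig
const std = @import("std");

test "convert between uN and vectors via @bitCast" {
    // An 8-lane bool vector has a bit size of 8, so it can be
    // bitcast to and from a u8.
    const v: @Vector(8, bool) = @splat(true);
    const bits: u8 = @bitCast(v);
    try std.testing.expectEqual(0xff, bits);
}
```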

Snektron commented 7 months ago

Alright, I guess I misunderstood the meaning of bitcast in this context. This doesn't change the fact that the LLVM backend still reasons about vectors in a packed way, which I still believe should be corrected.

andrewrk commented 7 months ago

I agree, that should be corrected. Happy to make any clarifications to the lang ref or type up some spec text if you need it to get unblocked. Let me know how I can help. But it sounds like everything you want to do in the backend is already legal.

mlugg commented 7 months ago

Status quo vectors are a little bit of a mess - I discussed this with @andrewrk and @jacobly0 in a compiler meeting a while back. Some parts of the language (bitcasting) assume they are packed, and some (vector index in bit-pointers) do not.

If I remember correctly, Jacob proposed behavior something like this:

The only compiler change that would be needed here to match the language spec is to change Type.hasWellDefinedLayout, which currently just returns true for vectors.
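A hypothetical sketch of the shape of that change (not the actual compiler source, whose Type representation is far richer): the predicate would stop reporting a well-defined layout for vectors.

```zig
// Hypothetical, simplified stand-in for the compiler's Type.
const Type = union(enum) {
    int,
    vector,

    fn hasWellDefinedLayout(ty: Type) bool {
        return switch (ty) {
            .int => true,
            // Currently this returns true for vectors; under the
            // proposed rules, vector layout is backend-defined.
            .vector => false,
        };
    }
};
```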

Another problematic thing with vectors is the "vector index" field on bit-pointers. I don't love it in general - I think it would make more sense to return bit-pointers, even if per the above rules this makes the exact type returned backend-dependent (related: #16271) - but the bigger problem is that the vector index is permitted to be runtime-known (represented by writing it as a ?), which completely breaks how bit-pointers are supposed to work, and can cause completely incorrect behavior even today if you do &unaligned_vector[runtime_index]. (Fixing this basically would be #16271.)
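The problematic pattern can be written down directly; an illustrative sketch (the exact pointer type produced is an internal detail, but the index i is runtime-known, which is the case mlugg describes):

```zig
var v: @Vector(4, u8) = .{ 1, 2, 3, 4 };

fn elemAt(i: usize) u8 {
    // Taking the address of a vector element with a runtime-known
    // index is currently accepted; the resulting pointer carries a
    // runtime vector index, which is what undermines bit-pointer
    // semantics for under-aligned vectors.
    const p = &v[i];
    return p.*;
}
```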

p7r0x7 commented 7 months ago

I've been thinking about this a lot, as my upcoming VP9 encoder project relies heavily on SIMD and @Vector(). I would love to use arbitrary-bit-width integers efficiently, relying on the compiler to "reduce"/"promote" them to power-of-two bit sizes.
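For scalars, Zig already performs this kind of promotion; a small check (the values follow from odd-width integers being padded up to whole bytes in memory):

```zig
const std = @import("std");

test "scalar widths are promoted to whole bytes" {
    // A lone u3 occupies a full byte; a u12 occupies two.
    try std.testing.expectEqual(1, @sizeOf(u3));
    try std.testing.expectEqual(2, @sizeOf(u12));
}
```

The proposal would extend the same freedom to vector elements.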