
more math friendly definition of shift operators #7605

Open andrewrk opened 3 years ago

andrewrk commented 3 years ago

Currently, if you shift with a comptime-known RHS, everything is easy peasy lemonboy squeezy:

const std = @import("std");

test "comptime known rhs shift" {
    var x: u32 = 0b11110000;
    x >>= 2;
    std.testing.expect(x == 0b00111100);
}

However if the RHS is runtime known, it requires an often-awkward cast:

const std = @import("std");

test "runtime known rhs shift" {
    var x: u32 = 0b11110000;
    var y: u32 = 2;
    x >>= y;
    std.testing.expect(x == 0b00111100);
}
./test.zig:6:11: error: expected type 'u5', found 'u32'
    x >>= y;
          ^

There are two ways to resolve the situation:

const std = @import("std");

test "do more math to support any value" {
    var x: u32 = 0b11110000;
    var y: u32 = 2;
    x = if (std.math.cast(u5, y)) |rhs| (x >> rhs) else |_| 0;
    std.testing.expect(x == 0b00111100);
}

test "same thing but use std.math.shr" {
    var x: u32 = 0b11110000;
    var y: u32 = 2;
    x = std.math.shr(u32, x, y);
    std.testing.expect(x == 0b00111100);
}

test "assert the int is in range" {
    var x: u32 = 0b11110000;
    var y: u32 = 2;
    x >>= @intCast(u5, y);
    std.testing.expect(x == 0b00111100);
}

This proposal is to modify the >> and << operators to match what we have in std.math.shr and std.math.shl. This allows the RHS to be any integer type, even signed, and there is no possibility of illegal behavior; the entire range of all integers is well defined.
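For illustration, a minimal sketch of the semantics this proposal would give the operators, expressed with the existing std.math helpers (written in the same test style as the examples above):

const std = @import("std");

test "semantics the proposal would give << and >>" {
    const x: u32 = 0b11110000;
    // An over-wide shift amount sweeps every bit out: the result is 0, not illegal behavior.
    std.testing.expect(std.math.shl(u32, x, 40) == 0);
    // A negative RHS shifts in the opposite direction.
    std.testing.expect(std.math.shr(u32, x, -2) == 0b1111000000);
}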

The reason for status quo is performance: to force people to think about the tradeoff when choosing between these two strategies. This proposal argues that the mathematical definition of bit shifting is more sensible. For code that wants to avoid the extra instructions arising from making this definition of shifting work on each platform, there are some pretty reasonable ways to accomplish this, so much so that they will often happen by accident (e.g. when the RHS is comptime-known, or when its type is already no wider than the shift range).

The interesting thing here is that both definitions of bit shifting operations can be implemented in terms of the other in the standard library, so the question is, which one should we be pushing on users with the convenient shifting syntax? This proposal answers firmly with "the mathematically clean one". I do believe this matches other precedents, such as how @intCast works by preserving the mathematical integer value.


LemonBoy commented 3 years ago

This proposal is to modify the >> and << operators to match what we have in std.math.shr and std.math.shl

This proposal gives up a lot of safety for absolutely no gain. The explicit @intCast caught quite a few bugs in my code (e.g. if you're trying to shift a u64 left by 65535 you have a big problem somewhere); under this proposal, if the shift amount is greater than or equal to the type width, the result is silently zero. Good luck debugging that.

Making the shift direction change depending on the operand's sign is another crazy idea; don't you think having >> shift bits to the left is a bit unexpected?

And performance-wise the extra checks aren't exactly free.

AssortedFantasy commented 3 years ago

Making the @intCast() implicit would be nice. It's not that common, but if you are writing code that has to work for a number of integer widths, having to compute the ceiling of log base 2 of the bit width is kind of a PITA.

Besides, @intCast to the log-base-2-of-bit-width type literally covers the full range of allowed values (plus a few extra if the LHS width isn't a power of two). So assuming you are doing a << something(b), and something(b) is sane and produces a value that is allowed in the shift, then @intCast(xx, something(b)) should be a no-op. And if it's not, then the shift was going to error out anyway.
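For reference, a minimal sketch of the log2-of-bit-width shift type, computed generically with std.math.Log2Int (same test style as above):

const std = @import("std");

test "shift-amount type for a given integer width" {
    // Log2Int(u32) is u5: just enough bits to address each of the 32 bit positions.
    std.testing.expect(std.math.Log2Int(u32) == u5);
    // For a non-power-of-two width it rounds up, covering a few extra values.
    std.testing.expect(std.math.Log2Int(u10) == u4);
}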

Otherwise I think the generic shift operations should be fairly "stupid" and fast. I don't think they should be clever and try to fix things.

Also @LemonBoy, I don't believe @intCast() truncates. It's runtime-checked to see if it would underflow or overflow. Edit: I misread some stuff.

If you are compiling in a mode with runtime safety, then the out-of-range cast is caught and panics.

The check of course disappears if you compile in a mode with no safety checking. But if you are doing that then you are throwing away your math safety checking anyway, so whatever.

The only downside I see of removing the explicit @intCast is that the cast forces you to physically write out in code that a narrowing has to happen. Without it, if for some reason you don't test your code properly and immediately jump to running stuff in release-fast, you might miss that an error could have happened. But of course you would run into the same issue with any other typical math operation.

RogierBrussee commented 3 years ago

The unease about the definition of << (and >>) stems from the schizophrenic interpretation of status-quo integers as simultaneously arithmetic integers, bit strings, and modular integers, which give different answers as to what the "right" definition is.

TL;DR There is no mathematically "better" definition of the shift operators. Different interpretations of what a bit string means give different answers. Restricting the shifts to operations on bit strings only allows for a definition without undefined behaviour that is easy on modern hardware. For the common use of shifts as multiplication by a power of 2 one should introduce a power operation and prefer an explicit multiplication by a power of 2 over a shift (using ^ for "to the power").

  1. The hardware: On x86, AArch64, MIPS, PPC and RISC-V (and probably others, but not ARM32, which does something weird: https://david.wragg.org/blog/2012/11/shift-instructions.html) the shifts are defined so that, e.g. for a: u32, n: u64,

    a << n == @as(u32, @as(u64, a) << (n & 31))

i.e. for RISC-V the instruction

SLLW rd, rs1, rs2 // rd = @as(u32, rs1 << (rs2 & 31))

is defined such that the result depends on the shift amount n (in register rs2) only through n mod 32, so there is no problem with n (rs2) being too large, and it will happily ignore bits 32 to 63 (which in RV are sign-extended from bit 31). (A status-quo Zig sketch of this masking is given after this list.)

  1. The "right" definition of shift. What is the "right" interpretation of the shift depends on how one wants to interpret the bits in an integer, and the problem is, I think, the schizophrenic possibility to interpret the bits in an u32 or i32 as a non negative integer , a signed integer (which Zig distinguishes) but also as a bit string and as a integer mod 2^32 aka doing 2-complement "wrapping" integer operations which Zig does not distinguish. In #7512 I proposed to rename the current "promiscuous" integers as c_u32, c_i32 or (c_uint32, c_int32) and use u32 only as non negative integers (without +%, -%, % and without bit operations such as &, << and >>) and introduce modular integers m32 (with arithmetic ops +%, -%, % but without bit operations including << and >>) and b32 (with only bit operations). For the sake of argument assume that we have done these redefinitions but the point I want to make applies equally well to status quo but with different usage of the integer type. Also I suggested having ^ as an exponentiation operator as in #8196 and by extension ^% for modular exponentiation with an operator precedence above (avoiding the bug that a 2^3 + b is inadvertedly written a << 3 + b which means a << ( 3 + b)). We then have

Edit: there is a sense in which a "clean sweep" shift (i.e. math.shl) is better mathematically even for bit vectors: if x << n is the same as the "clean sweep" math.shl(x, n), then it has the useful algebraic property that

(x << n) << m == x << (n + m)
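A minimal sketch of that property, using u8 so the numbers stay small (same test style as above):

const std = @import("std");

test "clean-sweep shifts compose" {
    const x: u8 = 1;
    // Clean sweep: shifting by 5 twice equals shifting by 10; both sides are 0.
    std.testing.expect(std.math.shl(u8, std.math.shl(u8, x, 5), 5) == std.math.shl(u8, x, 10));
    // A hardware-style masked shift would compute x << (10 & 7) == 4 instead,
    // breaking the (x << n) << m == x << (n + m) property.
}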

matu3ba commented 3 years ago

@RogierBrussee The number representation, shift operations, and modulus map quite directly to hardware instructions. If you write a power of 2 instead, it is no longer directly and explicitly readable.

For example, the base of the exponentiation could 1. be a power of two or 2. not be a power of two. So this would require an extra lookup of the potential base, which is against the Zig Zen:

  1. Communicate intent precisely.
  2. Favor reading code over writing code

So one does want the common, efficient hardware instructions to be explicit from the code, because Zig is a systems programming language (doing stuff efficiently).

RogierBrussee commented 3 years ago

@matu3ba I don't really understand what you mean.

If your intent is to use a as a number and express multiplication by a power of 2, then the expression a * 2^n is visibly multiplication by a power of 2 (not a power of 3 nor any other number). IMHO it communicates intent rather better than a << n, even though the programmer is expected (but not required) to know that the compiler will optimise it to a shift. Thus the expression a * 2^n is in no way slower than doing the compiler's trick of using the shift instruction yourself, at least in optimised builds; you don't risk falling into the trap of the low operator precedence of <<, and the vagaries of undefined behaviour are for the compiler to sort out.

If on the other hand the intent is to manipulate a as a string of bits which should be shifted n places, then there is a: b32 and the expression a << n is entirely appropriate, well defined, and without undefined behaviour.

Finally, if you really, really feel there is value in mingling arithmetic and bit operations on the same datatype (e.g for porting code from C) I proposed c_uint32 for exactly that.

matu3ba commented 3 years ago

@RogierBrussee

  1. Integer number representation in ALUs is binary, and the shift operation on an ALU takes exactly 1 CPU cycle (the most efficient kind of operation). In contrast, a base other than 2 is much less efficient, so one wants a visual distinction for that.

  2. The UB handling is a good point, though that is what library functions are for.

  3. The number representation is identical to u32, so the point remaining is convenience (to differentiate between "I don't want this type to use shifts (<<)" and "go fast"). The same effect could however be reached if one makes RHS expressions be adjusted by =SAFETYOPTION, as I suggested here.

  4. At some point you want to mix both convenience types and you get the same problem again with yet more complexity.

zenMaya commented 2 years ago

I would like to add my 3 cents to this discussion. I agree with @RogierBrussee: numbers and bits are different things.

bnprks commented 2 years ago

I just came across this as a new user trying out zig for Advent of Code puzzles. I found it extremely un-ergonomic to have to cast to a non-standard integer bitwidth in order to perform a shift. My "not overthinking it" proposal would be to allow shifting a u32 by another u32 and have it be undefined behavior (i.e. runtime panic by default) to shift by a value greater than the bitwidth.

I'd note that this is already effectively the behavior for non-power-of-two bitwidths on the left hand side:

pub fn main() anyerror!void {
    var a: u10 = 12;
    var b: u4 = 11; // u4 is the required shift type for u10, yet 11 is out of range
    var c = a << b; // safety-checked: panics, the shift amount exceeds the type's bit width
}

This code results in a runtime panic:

thread 1120555 panic: shift amount is greater than the type size

I expect that most people will perform an unsafe @intCast prior to doing a shift -- I'm not sure why it's better to have to add an @intCast to generate the panic rather than just having the shift operator generate the panic.

matu3ba commented 2 years ago

I'm not sure why it's better to have to add an @intCast to generate the panic rather than just having the shift operator generate the panic.

If you are unsure about stuff, ask in one of the community channels.

matu3ba commented 2 years ago

For a programmer, it doesn't matter how they are represented in memory

If you want to do stuff optimally on the hardware, you need to understand how things are represented and operated on in memory. The programmer always knows more about how memory should be used than the optimizing compiler. Typically the programmer can't reason about the program as a whole (which also goes for math), so the compiler helps with that to optimize stuff like math.

but always keep the operation safe

This is not possible. You can't do formula replacement to optimize code if overflowing intermediate results are well defined. Read Ralf's take on why compilers need UB.

it would do even better, if it didn't mix bit fields and numbers

Can you please elaborate how this justifies the added complexity in relation to my arguments above?

The number representation is identical to u32, so the point remaining is convenience (to differentiate between "I don't want this type to use shifts (<<)" and "go fast"). The same effect could however be reached if one makes RHS expressions be adjusted by =SAFETYOPTION, as I suggested here.

At some point you want to mix both convenience types and you get the same problem again with yet more complexity.
bnprks commented 2 years ago

@matu3ba

I'm not sure why it's better to have to add an @intCast to generate the panic rather than just having the shift operator generate the panic.

A hypothetical to explain my argument: with the logic of the current shift operator, we don't allow u32 << u32 since the RHS can hold values that would result in undefined behavior. But by very similar logic, I could argue u32 + u32 should return a u33, or u32 * u32 should return u64 (since otherwise the operation might overflow, which is an edge case we need users to explicitly consider). In this hypothetical, anyone who wants to get back a u32 could add an explicit @intCast or @truncate on the result to communicate their intent in an explicit way.
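For concreteness, the widening alternative can already be written out by hand today; a minimal sketch (mulWide is a hypothetical helper, not part of std):

// Widening multiply: cast first, so the u64 product can never overflow.
fn mulWide(a: u32, b: u32) u64 {
    return @as(u64, a) * @as(u64, b);
}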

My point here is that for + and * Zig has decided that the operator itself can cause a panic rather than forcing explicit casting at every usage, presumably because it would greatly impair the readability of compact operator syntax if we forced casting everywhere. I'd argue that << should be in the same boat for much the same reasons, and could easily slot into the documentation's list of operators that can cause integer overflow (link).

I'll note that << already chooses the overflow + panic option for non-power-of-two bitwidths as in my u10 << u4 example above where the log2 bitwidth requirement seems like it just serves to create a false sense of safety.

I wouldn't mind having other shift operator variants to help communicate other intents, similar to having +% for wrapping arithmetic, but I think it's reasonable for Zig to have a convenient shift operator that matches the C/C++ semantics & performance without introducing a bunch of casting. Banning signed integers on the RHS while removing the log2 bitwidth requirement could still make sense, as it would greatly reduce the amount of casting required.
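As a reminder of that precedent, a minimal sketch of the wrapping variant in action (same test style as above):

const std = @import("std");

test "wrapping addition" {
    var a: u8 = 255;
    // Plain + would be safety-checked illegal behavior here; +% wraps explicitly.
    std.testing.expect(a +% 1 == 0);
}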

RogierBrussee commented 2 years ago

For a programmer, it doesn't matter how they are represented in memory

If you want to do stuff optimal on the hardware, you need to understand how things are represented and operated on in memory.

Agreed.

The programmer knows always more about how memory should be used than the optimizing compiler.

Always is inevitably a vast oversimplification. In particular see your own words below.

Typically the programmer cant reason about the program as a whole (which also goes for math), so the compiler helps with that to optimize stuff like math.

but always keep the operation safe

This is not possible. You cant do formula replacement to optimally code, if intermediate results overflowing is well defined. Read ralphs take on why compilers need UB.

First I quote from the blog you posted:

"I fully agree with Yodaiken that C has a problem, and that reliably writing C has become incredibly hard since undefined behavior is so difficult to avoid. It is certainly worth reducing the amount of things that can cause UB in C, and developing practical tools to detect more advanced kinds of UB such as strict aliasing violations."

The argument that undefined behaviour may be sort of unavoidable in some situations does not make it good.

Now more to the point of the question. Whether shifts should have undefined behaviour depends on the interpretation of your 32 bits. That is the point of my remarks of 13 March. Having distinct datatypes for distinct interpretations (as signed or non-negative integers, modular (aka wrapping) integers (like C's unsigned int), or a bitfield) gives different interpretations of what a shift over more than (bitlength - 1) bits should be.

In particular I agree with @bnprks that if u32 is interpreted as a poor man's approximation of a non-negative integer, you may as well make it undefined behaviour to shift over 32 or more bits, and so it is OK (up to undefined behaviour!!!!!) to say

(a << b) == (a << (b&31))

The right-hand side is what essentially all hardware supports natively (ARM32 would prefer (a << b) == (a << (b & 255))). I would just rather write that operation (compiling to the same assembly, hence equally efficient) as

a * 2^b.

(here 2^b is 2 to the power b NOT 2 xor b) with operator precedence rules and notation that one can expect from high school and visible overflow that reflects the intended use.

If you interpret u32 as a modular (aka wrapping) integer there is only one sane way to interpret a << b without any undefined behaviour: multiplication by 2^b mod 2^bitlength aka clean sweep shift. e.g for a :u32 and b :u32 this works out as

(a << b) == (b < 32) ? (a << (b & 31)) : 0    ( == (-(b < 32)) & (a << (b & 31)) )
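That formula transliterates to status-quo Zig roughly as follows; a minimal sketch (both helper names are hypothetical, and the branchless variant mirrors the mask trick in the formula above):

// Clean-sweep left shift for u32: every shift amount is defined; oversized amounts give 0.
fn shlClean(a: u32, b: u32) u32 {
    return if (b < 32) a << @truncate(u5, b) else 0;
}

// Branchless variant: -(b < 32) is all ones when b < 32 and all zeros otherwise.
fn shlCleanBranchless(a: u32, b: u32) u32 {
    const mask = @as(u32, 0) -% @boolToInt(b < 32);
    return mask & (a << @truncate(u5, b));
}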

I, and it seems @bnprks, think one should use a different notation for this operation, e.g. (a <<% b). However, I would prefer even more to have a separate type m32 for a modular (aka wrapping) integer mod 2^32 and write a *% (2^%b) (which again is neither less nor more efficient than the above).

If you interpret u32 as a bitfield, the bitfield equivalent of a <<% b (i.e. clean sweep) and a << (b & 31) are both sane and without undefined behaviour, but the former is slightly saner and the latter slightly more efficient. I would just prefer to have a separate type bitfield32 or b32 or whatever.

I think talk about complexity is misguided here. Things are equally efficient but conceptually simpler, especially since it clears up undefined behaviour.

it would do even better, if it didn't mix bit fields and numbers

Can you please elaborate how this justifies the added complexity in relation to my arguments above?

The number representation is identical to u32, so the point remaining is convenience (to differentiate between "I don't want this type to use shifts (<<)" and "go fast"). The same effect could however be reached if one makes RHS expressions be adjusted by =SAFETYOPTION, as I suggested here.

At some point you want to mix both convenience types and you get the same problem again with yet more complexity.
matu3ba commented 2 years ago

@bnprks Your wish for consistency is planned. cc @RogierBrussee

@SpexGuy : "It is planned to change the rules for a << b to "b must be unsigned and have <= the number of bits in a. So u32 << u32 will be allowed but u32 << u64 and u32 << i32 will not."

my additional question: "a << b, with b <= log2(a) forces one to think about the possible value range. This can be helpful, but it does not need to be."

SpexGuy: "Right, the reason in theory was that it separates two separable operations -- the cast can be either truncate or intCast, and then it shifts. However, this doesn't hold with non-power-of-two integers, and it's pretty annoying, so we've decided to do intCast implicitly. We can't do overshifting or signed shifting by default because it would be very difficult to optimize. Many processors mask the shift operand to 5 bits for 32 bit shifts, so extra instructions are needed to do overshifting correctly. Signed shifting is even harder, because you need to determine whether to do a left shift or right shift instruction."

matu3ba commented 2 years ago

a * 2^b.

I don't understand the problem. You can use a <<| b, and there was also wrap-around shifting <<% planned (I can't find the issue now). @travisstaloch has written an issue for it.
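(For reference, a minimal sketch of what a <<| b, the saturating left shift, does, using a comptime shift amount and the same test style as above:)

const std = @import("std");

test "saturating left shift" {
    const a: u8 = 0b1100_0000;
    // Bits would be shifted out, so the result saturates at the maximum u8 value.
    std.testing.expect((a <<| 4) == 0xff);
}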

daurnimator commented 2 years ago

by very similar logic, I could argue u32 + u32 should return a u33, or u32 * u32 should return u64 (since otherwise the operation might overflow, which is an edge case we need users to explicitly consider). In this hypothetical, anyone who wants to get back a u32 could add an explicit @intCast or @truncate on the result to communicate their intent in an explicit way.

I think this is the way zig should go. I see https://github.com/ziglang/zig/issues/3806 as a prerequisite though.

I think we should lock this issue until https://github.com/ziglang/zig/issues/3806 is either accepted+implemented or rejected.

RogierBrussee commented 2 years ago

a * 2^b.

I don't understand the problem. You can use a <<| b, and there was also wrap-around shifting <<% planned (I can't find the issue now). @travisstaloch has written an issue for it.

I have no clue what 'a <<| b' is supposed to mean.

If "wrap around shifting" is another name for rotating, it is a very useful operation (on bitfields, not on numbers, not on modular numbers!!) available natively on near all CPU's, but using the notation <<% is inconsistent with +%, -%, and *%

if for x: u32, n: u32 the expression (x << n) can be interpreted as "the nonnegative integer x times 2 to the power n if it fits in an u32 and undefined behaviour otherwise"

then

(x <<% n)

should mean clean sweep shift, because if % means "use the corresponding operation but interpret it as being done using modular (aka wrapping) arithmetic (like your CPU does)" the sensible meaning is

(x <<% n) == "(x times 2 to the power n) modulo 2 to the power 32" == "(x modulo 2 to the power 32) times (2 modulo 2 to the power 32) to the power n == 0 if b >=32 ( because an integer with 32 or more binary zeros is divisible by 2 to the power 32), x << (b &31) if b <=31 ( because bits "shifted out" do NOT overflow but simply correspond to ignoring numbers divisible by 2 to the power 32 as they are equivalent to 0 mod 2 to the power 32)

NOTE1: Defined in this way, there is no undefined behaviour in <<%, and it has a clear mathematical meaning. NOTE2: Rotate right (left) can just be written @rotr32(u32, u32) (@rotl32(u32, u32)), or better still @rotr32(u32, m5) (@rotl32(u32, m5)), where m5 is a 5-bit modular integer.
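(Status quo already offers rotations as library functions rather than builtins; a minimal sketch with std.math.rotl/rotr, same test style as above:)

const std = @import("std");

test "rotations via std.math" {
    const x: u32 = 0x80000001;
    // The low bit rotates around to the top; the high bit rotates down to the bottom.
    std.testing.expect(std.math.rotr(u32, x, 1) == 0xC0000000);
    std.testing.expect(std.math.rotl(u32, x, 1) == 0x00000003);
}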

Now why would I prefer to write (x << n) for numbers in u32 as x * 2^n (x times (2 to the power n)), and (x <<% n) as x *% 2^%n ((x mod 2 to the power 32) times (2 mod 2 to the power 32) to the power n)? Because

  1. it has the right operator precedence for the arithmetic operator it represents (unlike <<), and
  2. the fact that we are having this conversation shows it is confusing in its details to use the bit shift operator as an arithmetic operator, requiring detailed understanding of how 2's-complement representations work (including for negative numbers if you use x: i32) and of how it stops working if you overflow.

Quick quiz 1: what should be the undefined behaviour, and what the proper semantics, of (y << n) and (y <<% n) for y: i32, n: u32?

Quick quiz 2: what, if any, is the difference between (x >> n) and x / (2^n) for x: u32 and for x: i32?

ityonemo commented 2 years ago

> using the notation <<% is inconsistent with +%, -%, and *%

100 times this. Please don't use <<% for "wrap around shifting". If my understanding of how the Zig tokenizer works is correct, it would be possible to pick "basically anything else".

tsmanner commented 2 years ago

I think that there are some cases where bitwise operations can be mixed with arithmetic operations and make a lot of sense. The use case I'm thinking of is related to hardware modeling and simulation. I would really like to be able to use zig for that task as an alternative to C, and zig's integral types have a lot going for them that makes them WAY more ergonomic than C's.

Example: It's pretty common to have to read a handful of bits out of several different signal facilities, where each one contains some different subset of a memory address. Those values have to be shifted and then bitwise-OR'd together to form the address. That address is very likely something that math is done with, to determine things like its offset into a cache line or page.

My point here is not that this would become impossible with separate bitwise types, but rather to provide a counter-example to the assertion that bitwise and arithmetic operations categorically shouldn't mix.

matu3ba commented 2 years ago

@tsmanner Do you have a blog entry or an elaboration on how you would envision such things? Hardware modeling is very broad in scope and has several dedicated DSLs.

Operator overloading will likely not get accepted, nor will operations on types that are not aligned at a byte boundary (at least at one end; see the exotic integers PR #10858), but I may be wrong on this. Note to myself: docs on exotic integers are missing the alignment behavior of @truncate. However, bit hacking with the corresponding offset shifts, comptime-constructed and comptime-checked, sounds in scope to me.

I suspect that you will likely end up with a DSL (type representation and operations), if you want very ergonomic behavior for the use case. However, I think any (comptime) interface description (or even comptime gen) to such a DSL from Zig would be highly appreciated.

counter-example to the assertion that bitwise and arithmetic operations categorically shouldn't mix

As a more C-like low-level language, Zig intends to provide a fast "overview of the underlying memory representation" and an ergonomic, type-checked, "close" representation of the underlying hardware operations. How would this align with your vision? This sounds much like operator overloading to me, or are there (common) hardware instructions that perform both at once with a performance gain?

tsmanner commented 2 years ago

@matu3ba I don't have a blog post or anything, just a lot of practical experience doing hardware simulations directly in C and C++, no DSLs. There are a few cases where operator overloading in C++ can be helpful, but we use it very sparingly.

The ergonomics I'm talking about have to do with signal width. It's very rare, in my experience, that a signal is 8, 16, 32, or 64 bits wide. Getting data from the hardware model, and putting data into it, requires some extra runtime safety checking on it to make sure that the high-order bits in the uintN_t are zeroed. At the moment, we implement these by hand and hope that no one misses any. Signals wider than 32 bits are returned as a 0-terminated C array of unsigned int.

My interest in zig for this is mostly centered around [nearly] arbitrary unsigned integer widths, especially when extracting fields from a very wide hardware signal (e.g. a latch bank with 118 bits in it) that cross a word boundary.

Example: consider a processor with 64-bit addresses and an L1 cache like this

To check cache reads and writes, we can set an expectation for the address and cache location that should be accessed based on an instruction's operand address. Looking at the actual hardware model, we have to reconstruct the address from 3 pieces of information in there:

  1. the cache read/write index
  2. the TLB's logical address field (which typically doesn't contain the index bits, as they can be inferred from it's location)
  3. the byte offset of the load/store instruction into the cache line
uint8_t setid = fCacheReadWriteSetId.get();  // get the Set ID the hardware model is accessing
uint8_t index = fCacheReadWriteIndex.get();  // get the line index the hardware model is accessing
// This is two values, because the TLB entry is larger than 64 bits.  It must contain both
// the logical address bits down to the cache index, and the full translated page address
// for the cache line.
uint64_t tlbEntry[2] = { fTlb[setid][index][0].get(), fTlb[setid][index][1].get() };  // get the TLB entry
// Some messy bitwise AND and shift operations to isolate the logical address bits from
// the larger-than-64-bit TLB entry.
uint64_t addr = (extractLogicalAddressBits(tlbEntry) << (6 + 7))  // shift the most significant bits of the address from the TLB into place
              | (static_cast<uint64_t>(index) << 6)  // cast and shift and OR the index bits into place
              | static_cast<uint64_t>(fOperandByteOffset.get()); // cast and OR the byte offset into place
expectEqual(expected_address, addr);  // Check it

The same check in Zig:

const setid: u3 = fCacheReadWriteSetId.get();
const index: u7 = fCacheReadWriteIndex.get();
// 103 bits because 64 - 6 - 7 for the logical address is 51 bits in the TLB, and the absolute
// is probably translated on a 4k page boundary, which means the 12 least significant bits
// are always the same as the logical ones, leaving 52 translated bits.
const tlbEntry: u103 = (@intCast(u103, fTlb[setid][index][0].get()) << 64) | @intCast(u103, fTlb[setid][index][1].get());
const addr: u64 = @intCast(u64, (tlbEntry & cTlbLogAddrMask) >> (52 - 6 - 7))
                | (@intCast(u64, index) << 6)
                | @intCast(u64, fOperandByteOffset.get());

In the C++ example, the extractLogicalAddressBits function is ugly and error prone to have a human being write, in my experience. If we're lucky with the field we want, it's one bitwise AND and a shift right. If we're unlucky, then it's two bitwise ANDs, a shift right for the less significant bits, and a shift left or right for the more significant bits, depending on where they are in the first array element and where they belong in the result, and finally an OR to combine them into the address. The zig implementation of the same is much easier to read, seems easier to optimize, and more directly maps to my understanding of the hardware I'm modeling which makes it less likely to contain a bug and easier to debug if one does exist.

I can then do arithmetic (addition and subtraction at least) and ordering compares with the reconstructed address, to check things like distances between operands and whether or not the address lies within a special range (e.g. microcode-reserved address ranges).

I'm not aware of any common hardware instructions that do both bitwise and arithmetic operations at once. Having separate bitwise types and arithmetic types would add steps between those operations though, and my point was that the same value can reasonably be used both ways. In the example above, given the addr, the cache index should be extracted with a bitwise AND and a right shift, while it's distance to another address is calculated using arithmetic subtraction.
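A minimal sketch of that last point (field positions taken from the example above; the helper names are hypothetical):

// Bitwise: isolate the 7 cache-index bits sitting above the 6 offset bits.
fn cacheIndex(addr: u64) u7 {
    return @truncate(u7, addr >> 6);
}

// Arithmetic: distance between two operand addresses.
fn operandDistance(a: u64, b: u64) u64 {
    return if (a > b) a - b else b - a;
}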

I hope that answers your question, or at least provides some insight into my motivations.

matu3ba commented 2 years ago

@tsmanner As I understand it, it is a common use case to use bit operations to splice/combine integers, and then to do arithmetic on the combined result, not just bitops.

For the following questions I do assume that you are not blocked by for example #10684 and #10920. If you are, please write into the comments (or contribute on fixing it).

  1. Did you try to implement comptime generation and validation of the necessary masks (if the bit offset is not byte-aligned) for a simple use case? I am a bit surprised that there is no tooling to verify/validate/(auto-)generate the masks from the hardware description, as this sounds like a very common task.
  2. Did you think about making a Zig library, i.e. some examples for a few architectures? Usually, it is simpler to generalize the problem(s), given the existence/annoyance of workarounds, in a more complete (even if toy) application.
  3. Did you play with comptime-generating the necessary stuff for extractLogicalAddressBits and writing it out as a C/C++ file at runtime?

Testing implementations against one another sounds very useful for checking the correctness of exotic integers + comptime, and also useful for checking your hand-crafted stuff.

tsmanner commented 2 years ago

@matu3ba you are correct, I am not blocked by anything. I was only trying to provide a concrete example of a user wanting to do both bitops and arithmetic with the same value.

I like your ideas about how to go about implementing them, my own are similar. The code base I work with is pretty old, and most of the people working on it have expertise in hardware design not software design. The hand rolled stuff is, unfortunately, very common in there still.

RogierBrussee commented 2 years ago

Of course people splice bits together into integers and use them as indexes into arrays, but your example does not actually use arithmetic (+, -, *) and bitops (&, |, ^, <<, rotl, ...) together.

I think your example would be written something like this (assuming bxxx is an xxx-bit bitfield, and @zeroext zero-extends a bitfield or the bits of an integer with a given number of bits):

const setid: u3 = fCacheReadWriteSetId.get();
const index: u7 = fCacheReadWriteIndex.get();
// 103 bits because 64 - 6 - 7 for the logical address is 51 bits in the TLB, and the absolute
// is probably translated on a 4k page boundary, which means the 12 least significant bits
// are always the same as the logical ones, leaving 52 translated bits.
const tlbEntry: b103 = @zeroext(b103, fTlb[setid][index][0].get()) << 64
                     | @zeroext(b103, fTlb[setid][index][1].get());
const addr: u64 = @as(u64,
    @intCast(b64, (tlbEntry & cTlbLogAddrMask) >> 52 - 6 - 7) // no need for parenthesised (52 - 6 - 7)
    | (@zeroext(b64, index) << 6)
    | @zeroext(b64, fOperandByteOffset.get())
);

It would also make sense to write @intCast(b64, (tlbEntry & cTlbLogAddrMask) >> 52 - 6 - 7 ) as

@bitfield(b103, tlbEntry & cTlbLogAddrMask, 52 - 6 - 7, 103).

In my opinion explicitly marking where the bit representation is used makes things clearer, YMMV.


tsmanner commented 2 years ago

That is correct, which is why my post was about ergonomics and not the ability to accomplish the task. Requiring casting between bitwise and arithmetic types just for the shift+OR construction of values in my example doesn't lend any expressiveness to the program that isn't already present in the selection of type (e.g. u103) or in the operations being done (bitwise AND, shift, bitwise OR). A bitstring type does sound really handy, but requiring that bitwise operations only occur on such a type sounds like a job for a linter, not a compiler, to me.

ogier commented 11 months ago

I think another reason why it's important to keep the cast explicit is that Zig doesn't implicitly integer-promote the LHS of a shift operator, but C/C++ do. In C/C++ it's always legal to left-shift an int by values < 32, but some of these shifts are nonsense in Zig, and I think it's useful to have a compiler error to say so.

For example these are legal shift operations in C:

uint32_t shift1() {
    uint8_t lhs = 0xab;
    uint8_t rhs = 24;
    return lhs << rhs; // 0xab000000u
}

uint32_t shift2() {
    uint8_t lhs = 0xab;
    return lhs << 24; // 0xab000000u
}

In Zig you don't get automatic integer promotion like this which means I think both of these programs should remain compilation errors lest we give C/C++ programmers more ways to shoot themselves in the foot:

export fn shift1() u32 {
    var lhs: u8 = 0xab;
    var rhs: u8 = 24;
    // error: expected type 'u3', found 'u8'
    // note: unsigned 3-bit int cannot represent all possible unsigned 8-bit values
    return lhs << rhs;
}

export fn shift2() u32 {
    const lhs: u8 = 0xab;
    // error: type 'u3' cannot represent integer value '24'
    return lhs << 24;
}