I've been playing around with using Zig to write things for the Fomu, and I ended up with a sort of annoying pattern for the MMIO:
/// https://rm.fomu.im/touch.html
pub const TOUCH = struct {
const base = 0xe0005800;
/// Output values for pads 1-4
pub const TOUCH_O = @intToPtr(*volatile u4, base + 0x0);
/// Output enable control for pads 1-4
pub const TOUCH_OE = @intToPtr(*volatile u4, base + 0x4);
/// Input value for pads 1-4
pub const TOUCH_1 = @intToPtr(*volatile u4, base + 0x8);
};
I attempted to make it just one extern struct and use align(4) for layout, but that didn't seem to work. Volatile was on the struct pointer itself and not at the field level. Perhaps volatile should not be a pointer attribute, but a field/variable attribute?
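For reference, the attempted single-struct layout would look roughly like this (a sketch only; the register fields are widened to u32 here for illustration, since the actual registers above are u4):
pub const TouchRegs = extern struct {
    TOUCH_O: u32, // base + 0x0
    TOUCH_OE: u32, // base + 0x4
    TOUCH_I: u32, // base + 0x8
};
pub const TOUCH_REGS = @intToPtr(*volatile TouchRegs, 0xe0005800); // volatile on the pointer, not the fields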
I really like the idea, but I shiver at the same time because of the overhead that happens when I would like to write embedded code.
A short example (ignoring headers/imports) for AVR that lets an LED blink:
const TIMER1_FREQ = 2; // Hz
const TIMER1_PRESC = 1024;
const TIMER1_LIMIT = (((F_CPU) / (TIMER1_PRESC)) / (TIMER1_FREQ) - 1);
pub fn main() void {
@volatileStore(DDRB, (1 << PIN5));
@volatileStore(TCCR1A, 0);
@volatileStore(TCCR1B, (1 << CS12) | (1 << CS10) | (1 << WGM12));
@volatileStore(TCCR1C, 0);
@volatileStore(OCR1AH, @truncate(u8, (TIMER1_LIMIT >> 8) & 0xFF));
@volatileStore(OCR1AL, @truncate(u8, (TIMER1_LIMIT & 0xFF)));
@volatileStore(TIMSK1, (1 << OCIE1A));
sei();
while (true) {}
}
pub fn TIMER1_COMPA_vect() callconv(.Naked) void { // however this will look in actual Zig...
@volatileStore(PORTB, @volatileLoad(PORTB) ^ (1 << PIN5)); // this has to be optimized to a singular instruction by the optimizer later…
}
The variant with @volatileStore and @volatileLoad is 40% larger (648 bytes) compared to the variant with DDRB = (1 << PIN5) (453 bytes).
For most applications I really like the idea of having volatile made explicit, but it also makes Zig less attractive to embedded developers, as they have to write a lot more code (another example here: https://gist.github.com/MasterQ32/d17debefd159821b1e9debe0e869eef0).
After writing the second example, I really think we both win and lose "Communicate intent precisely.", as I find the @volatileLoad version much harder to read and grasp what the code actually does (which is: setting/clearing some bits in a memory location).
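For comparison, a rough sketch of the smaller volatile-pointer variant of the interrupt handler (register address and pin number are assumptions for illustration, not taken from a real header):
const PIN5 = 5; // assumed pin index
const PORTB = @intToPtr(*volatile u8, 0x25); // assumed AVR I/O address
pub fn TIMER1_COMPA_vect() callconv(.Naked) void {
    PORTB.* ^= (1 << PIN5);
}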
Can you add a link to the example that is a smaller binary?
Somewhat on a tangent...
Is volatile necessary? If you have multiple threads of control, you probably need atomic accesses. If you do not, you would need something similar to volatile. Zig already behaves differently when you indicate to the compiler that you are going to be single threaded. Perhaps volatile is just what results when you use atomic accesses in a single threaded program?
This definitely has some overhead, but simplifies the mental model.
As to the problem of using @volatileStore and @volatileLoad, perhaps special forms of = would help? There are already special forms of + such as +% etc. for different uses. What about @= or something (please do not actually use that one!) to indicate to the compiler, "really do that load/store no matter what optimizations seem to apply."
One thing I like very much with @volatileStore and @volatileLoad is that intent is communicated very precisely, exactly where volatile behavior is needed. Having volatile as an attribute on a variable or type means that you do not actually know what happens when you see an assignment somewhere else in the code. Spooky action at a distance.
The amount of typing seems like a fairly weak argument. You read code and need to understand it a lot more than you type it. This is a problem in C, where you can hide the volatile in a header or typedef etc. and have to look up the actual definition site of each variable to see what happens with a simple assignment.
This really comes down to what you are trying to show: Is a variable volatile, or are assignments to/from that variable volatile? C does this on the variable. The new @volatileStore and @volatileLoad do this at the site of the assignment.
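To make the contrast concrete, a rough sketch of both styles (the address is made up):
// volatile as a property of the pointer (status quo):
const reg = @intToPtr(*volatile u32, 0xE0005800);
reg.* = 1;
// volatile as a property of the access (this proposal):
const reg2 = @intToPtr(*u32, 0xE0005800);
@volatileStore(reg2, 1);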
Now back to lurking...
Is volatile necessary?
Yes, absolutely. It's not meant for thread synchronization / atomics, but it's for MMIO which enforces reads/writes to happen without reordering.
Is a variable volatile or are assignments to/from that variable volatile?
That's a good question. I'd answer this with "volatile is the property of a memory location".
It would always be an error to access an MMIO location without volatile, and to access a non-MMIO location with volatile (instead of an atomic). So @volatileLoad and @volatileStore would open up the possibility of accessing volatile storage without volatile (by just not using it), which would result in a programming error.
Is volatile necessary?
Yes, absolutely. It's not meant for thread synchronization / atomics, but it's for MMIO which enforces reads/writes to happen without reordering.
I think you misunderstood the half-baked idea. The core of the idea was that atomics are volatile accesses plus memory barriers (sort of). So in the case of a single-threaded program, you do not need the memory barriers, so atomic accesses degenerate into volatile accesses. Given that, do you actually need a separate volatile access method, or can you use atomic instead and take the hit on performance (which you might anyway) if you do have a multi-threaded program and actually only needed volatile semantics? I was aiming for a simpler mental model.
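A minimal sketch of that model, expressed with the existing atomic builtins on an MMIO address (address, register width, and .SeqCst ordering are all assumptions for illustration):
const REG = @intToPtr(*u32, 0xE0005800); // hypothetical MMIO register
fn readReg() u32 {
    return @atomicLoad(u32, REG, .SeqCst);
}
fn writeReg(val: u32) void {
    @atomicStore(u32, REG, val, .SeqCst);
}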
Is a variable volatile or are assignments to/from that variable volatile?
That's a good question. I'd answer this with "volatile is the property of a memory location".
This does seem to be the core question. That needs to be answered and accepted before either of the current alternatives can be usefully discussed. Otherwise this is really two solutions for two different problems :-(
So @volatileLoad and @volatileStore would open up the possibility of accessing volatile storage without volatile (by just not using it), which would result in a programming error.
That is a good point. Whatever the solution, I would argue that just having volatile as an attribute of a variable or type creates code behavior changes elsewhere and thus violates the Be Explicit commandment. As you note, using @volatileLoad/Store does not strongly encourage safety. Requiring both the volatile attribute and @volatileLoad/Store would be safer, but does not have great UX. Hmm...
At work, we have recently started a new project using a cortex-a5 and FreeRTOS.
The IP stack that we are evaluating (pretty expensive) was causing data aborts when the optimizations were turned up. The issue is that the DMA buffers used to pull data from the Ethernet MAC have to be placed in strongly-ordered or device memory sections which are setup when programming the stage 1 translation tables of the MMU. The thing about strongly ordered and device memory regions is that unaligned accesses cause an alignment fault, regardless of whether unaligned access is enabled for normal memory. We are using the latest Linaro arm-none-eabi-gcc toolchain for cortex-a processors from the arm website and there is not a very good way to prevent the compiler from emitting instructions that will cause an alignment fault, even when packing the bytes in and out manually for data types that are wider than 1 byte.
EDIT: my link insertion was broken. See this godbolt link for a (stripped) example: https://godbolt.org/z/GFKo5w
Note the problem exists even for the architecture's native data type.
Note also that -ffreestanding is needed to signify that builtins are not to be used, so compile-time-known memcpy, memset, etc. don't get inlined and insert unaligned accesses where you don't expect them. They are forced to go through function calls which actually handle alignment issues. (These examples don't use memcpy, but they illustrate the problem further.)
There are two compiler windows open so you can see that using the "-mno-unaligned-access" option during compilation would prevent the compiler from smushing, for lack of a better term, the reads together into a single unaligned load because the alignment of the pointee type is 1 and so it can't tell if the accesses are truly aligned or not.
The option can only be given on the command line, however, and applies to the entire compilation unit, so it must be known, outside of reading the code, what the compilation environment is expected to be. You could of course have a file containing just these functions for which unaligned loads are verboten and compile them separately from the rest of your application code, but that also requires special knowledge that is not apparent in the code unless it is in a comment somewhere. Hope you see it before you ship. Having the function accept only volatile pointers keeps all the information right there in the source code, which I consider a win.
I hereby assert, with this real-world use case in hand, that the volatile keyword does have a valid meaning and use case outside of the colloquial definition of MMIO, usually meant to be peripheral registers. The volatile keyword forces the compiler to actually schedule a load or a store for each memory access through a volatile pointer, because it can't assume there are no side effects for each memory access. As said by others before, this has nothing to do with atomics or the order in which the writes arrive at the destination, but the use case still exists and is valid.
You could argue that strongly ordered memory is a sort of mmio. Accessing it through the builtins only would make programming errors more likely.
In Zig, if slices could be declared as volatile, I would make the DMA buffers be volatile and thus only the correct functions that appropriately handle alignment issues would even be allowed to be called.
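A sketch of what that might look like if volatile element types were allowed on slices (hypothetical; []volatile u8 would mean every element access is an individual volatile byte load):
fn drainDmaBuffer(dst: []u8, src: []volatile u8) void {
    for (src) |byte, i| {
        dst[i] = byte; // each element read is a separate, byte-sized volatile load
    }
}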
After using Zig for months now, I still have no idea what "has no side effects" from LLVM's docs is supposed to mean. This description from @vegecode of what volatile currently does makes more sense:
The volatile keyword forces the compiler to actually schedule a load or a store for each memory access through a volatile pointer
The CPU can modify certain registers/memory locations whenever it wants; if the compiler doesn't know this, it will try to be smart and skip or reorder steps, which is sometimes unwanted.
Since people seem to always confuse the C volatile with Java's volatile (myself included initially, and from sitting on IRC), maybe it'd be better to just change the keyword to something more clear that means "don't optimize this out".
Also, I ported a small portion of the STM32H743 HAL to pure Zig (UART, timers, GPIO, interrupts, and a TFT driver). It's roughly 6k LOC.
Pretty much every register read or write is done using a small utils.zig with:
// ----------------------------------------------------------------------------
// Register utilities
// ----------------------------------------------------------------------------
pub inline fn setBit(reg: *volatile u32, bit: u32) void {
reg.* |= bit;
}
pub inline fn toggleBit(reg: *volatile u32, bit: u32) void {
reg.* ^= bit;
}
pub inline fn clearBit(reg: *volatile u32, bit: u32) void {
reg.* &= ~bit;
}
pub inline fn readBit(reg: *volatile u32, bit: u32) u32 {
return readReg(reg) & bit;
}
pub inline fn isBitSet(reg: *volatile u32, bit: u32) bool {
return (readReg(reg) & bit) == bit;
}
pub inline fn clearReg(reg: *volatile u32) void {
reg.* = 0x0;
}
pub inline fn writeReg(reg: *volatile u32, val: u32) void {
reg.* = val;
}
pub inline fn readReg(reg: *volatile u32) u32 {
return reg.*;
}
pub inline fn modifyReg(reg: *volatile u32, clear_mask: u32, set_mask: u32) void {
return writeReg(reg, (readReg(reg) & ~clear_mask) | set_mask);
}
pub inline fn writeType(comptime T: type, reg: *volatile T, val: T) void {
reg.* = val;
}
pub inline fn readType(comptime T: type, reg: *volatile T) T {
return reg.*;
}
These work on pointers translated from the chip's headers, e.g.
pub const GPIO_TypeDef = extern struct {
MODER: u32,
OTYPER: u32,
OSPEEDR: u32,
PUPDR: u32,
IDR: u32,
ODR: u32,
BSRR: u32,
LCKR: u32,
AFR: [2]u32,
};
pub const GPIOA_BASE = D3_AHB1PERIPH_BASE + @as(u32, 0x0);
pub const GPIOA = @intToPtr(*volatile GPIO_TypeDef, GPIOA_BASE);
// Used like
pub inline fn set(self: *Pin) void {
util.writeReg(&self.instance.BSRR, @enumToInt(self.pin));
}
For toggleBit, setBit and clearBit the compiler already spits out a single optimized instruction for these in release mode; I'm not sure how those would map to a @volatileStore without first using a @volatileLoad.
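Presumably, under this proposal, those helpers would have to be rewritten as an explicit load/store pair, something like (a sketch using the proposed builtins, with the signatures used earlier in this thread):
pub inline fn toggleBit(reg: *u32, bit: u32) void {
    @volatileStore(reg, @volatileLoad(reg) ^ bit);
}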
Also, currently the compiler uses volatile to enforce that what I'm passing in came from a struct pointer annotated with volatile, which has helped fix a few errors during development. Would these new builtins also be able to do that?
Well after watching the CppCon video I now feel ignorant :smile:. There's much more to volatile than what I know about.
It'd be good to separate intent when and where possible. So I'm :+1: on this change (assuming it can handle more fine-grained updates).
In addition to @volatileLoad and @volatileStore, there also needs to be a way to tell the compiler not to optimize out "unused" variables that are needed for debugging in ReleaseSafe mode.
Currently I have to fight with the optimizer by doing stupid usages to prevent it from removing important things like the stack trace (which is unused until I halt and need to look at it).
For example, the err, stack, and bt in the fn below all get optimized out in release mode, making it hard to debug. None of these are "volatile" variables in the MMIO sense, but I should still be able to somehow mark that I want to always keep them.
pub fn panic(err: []const u8, stack: ?*std.builtin.StackTrace) noreturn {
var bt: *[]usize = if (stack) |st| &st.instruction_addresses else undefined;
while (true) {
LED3.toggle();
var i: u32 = 0;
while (i < 1000000) : (i += 1) {
asm volatile ("nop");
}
}
}
Maybe it would make sense to just replace volatile with different keywords that indicate the use case explicitly, like mmio, alignedio, and nooptimize?
After some discussion in Discord with GingerBill, Tetralux, Oskar and others, there were some insights:
- Having volatile as part of the type system prevents accessing MMIO as if it were normal memory.
- Having volatile as part of the type system is viral and infects all of your code.
Possible solutions for Zig:
Keep volatile as a pointer modifier
Status quo.
const ptr = @intToPtr(*volatile u32, 0xF00BA7);
ptr.* = 0xFF00FF00;
Pro:
Con:
Comment:
There should be no implicit casting between volatile and non-volatile pointers in any case. All need @ptrCast.
Have volatile similar to C as part of the type
var value: volatile u32 = undefined;
value = 0xFF00FF00;
Pro:
- Same benefits as volatile as a pointer modifier
- Works for extern declarations (extern foo: volatile u32;)
Con:
- It doesn't make sense in most cases (volatile local/global variables aren't really useful)
- Same drawbacks as volatile as a pointer modifier
Replace volatile with @volatileLoad and @volatileStore
const ptr = @intToPtr(*u32, 0xF00BA7);
@volatileStore(ptr, 0xFF00FF00);
Pro:
Con:
Have both volatile pointers and @volatileLoad and @volatileStore
const ptr = @intToPtr(*volatile u32, 0xF00BA7);
@volatileStore(ptr, 0xFF00FF00);
Pro:
Con:
Remove volatile as a language construct
volatile as a pointer qualifier could be removed completely from the language, relying on asm volatile for the same kind of work. This would require a type to be implemented in user space that implements a simple load/store for every supported platform:
fn Volatile(comptime T: type) type {
return extern struct {
const Self = @This();
backing_memory: T,
fn load(self: *Self) T {
return asm volatile("…");
}
fn store(self: *Self, value: T) void {
return asm volatile("…", value);
}
};
}
const ptr = @intToPtr(*Volatile(u32), 0xF00BA7);
ptr.store(0xFF00FF00);
Pro:
Con:
- load/store still need to prevent instruction reordering and such
Now that I think of it, I prefer one of the last two solutions. The user space implementation would make the language itself simpler by removing a whole feature, but would create some work in user space. As volatile loads/stores don't have to be atomic in general, this could be implemented pretty easily.
Also, having both pointer and load/store as an explicit MMIO operation conveys the Zig spirit by ignoring the amount of code to write, but preferring explicitness.
@frmdstryr Regarding preventing the optimizer from removing a variable: volatile or an equivalent nooptimize could be a little heavy-handed for that, and might be easy to accidentally leave in release builds or use for the wrong purpose. The more targeted way to prevent optimization of a certain value is with the technique shown in this Chandler Carruth talk:
https://www.youtube.com/watch?v=nXaxk27zwlk&t=2446
It translates pretty directly to Zig:
const assert = @import("std").debug.assert;
inline fn escape(ptr: var) void {
    comptime assert(@typeInfo(@TypeOf(ptr)) == .Pointer);
    asm volatile ("" : : [g] "r" (ptr) : "memory");
}
inline fn clobber() void {
asm volatile ("" : : : "memory");
}
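Applied to the panic handler above, a sketch of keeping bt alive would be:
var bt: *[]usize = if (stack) |st| &st.instruction_addresses else undefined;
escape(bt); // bt is a pointer, so it satisfies escape's comptime check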
If you don't want to do targeted de-optimization and just want all variables in a scope to remain available for debugging, @optimizeFor (https://github.com/ziglang/zig/issues/978) will handle that use case.
@MasterQ32 I don't think the last two are mutually exclusive. My preference would be to switch to @volatileLoad/Store but provide a library type for the common case of a volatile register, similar to std.Atomic. Something like this:
// using syntax from #5049 for now but whatever the eventual solution to this is
const CRDR_Reg = packed struct(u32) { cr: u9, dr: u9, bflag: bool, reserved: u13 };
const crdr = std.MMIO_Register(CRDR_Reg).init(CRDR_ADDRESS);
fn updateCRDR(cr: u9, dr: u9) void {
var value = crdr.read();
value.cr = cr;
value.dr = dr;
crdr.write(value);
}
This is a little more verbose but it prevents people from accidentally doing this:
const crdr = @intToPtr(*volatile CRDR_Reg, CRDR_ADDRESS);
fn updateCRDR(cr: u9, dr: u9) void {
// bad: generates two reads and two writes and is not obvious about it
crdr.cr = cr;
crdr.dr = dr;
}
This case that @frmdstryr points out is actually really important:
For toggleBit, setBit and clearBit the compiler already spits out a single optimized instruction for these in release mode; I'm not sure how those would map to a @volatileStore without first using a @volatileLoad.
We would probably also need a @volatileRmw builtin similar to @atomicRmw to handle this case properly, unless the optimizer is allowed to combine volatile loads and stores (which might already be the case, depending on the exact language semantics of +=).
Edit: add caveat about combining
It looks like the optimizer is allowed to combine a volatile load, modify, volatile store into a single instruction. See this godbolt. So @volatileRmw is not necessary.
This might be a terrible idea, but what about volatile blocks? If all loads and stores in a volatile block are treated as side effects, it becomes easy to make volatile types:
fn Volatile(comptime T: type) type {
return struct {
const Self = @This();
backing_memory: T,
fn load(self: *Self) T {
volatile {
return self.backing_memory;
}
}
fn store(self: *Self, value: T) void {
volatile {
self.backing_memory = value;
}
}
};
}
This gets the advantage of making volatile usage explicit everywhere, while still allowing the compiler to know not to reorder memory access, without requiring a whole bunch of processor-specific code in the standard library. Of course it still costs a language feature.
The big downside that I see is that it allows stuff like this:
extern var foo: u8 = undefined;
...
volatile { foo = 1; }
foo += 1;
It might be possible to detect that kind of thing with the compiler (memory accessed in volatile sections can only be accessed in volatile sections) but it's pretty weird to decouple it from the type like that.
Personally, I think the status quo strikes a decent balance. Maybe the best solution is a mix of this and the status quo. I think all that visual noise in @volatileLoad/@volatileStore makes the language harder to read by at least as much as it makes it harder to write.
I really like volatile blocks. IMHO they have the most advantages (remove visual noise, explicit, flexible) and remove the volatile information from the type completely (so no viral types). We should only restrict what is allowed to happen in a volatile block (for example: no function calls).
One downside of using the inline functions https://github.com/ziglang/zig/issues/4284#issuecomment-588235419 above is that it makes the comments in the disassembly difficult to read in release builds...
Using the inline operations directly avoids this but is error-prone.
Have volatile similar to C as part of the type ...
Con:
- It doesn't make sense in most cases (volatile local/global variables aren't really useful)
They would be very useful for interrupt handlers. Right now you would have to have both a global variable and a volatile pointer to that variable to safely communicate between an interrupt handler and the rest of the code.
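For illustration, the current workaround looks roughly like this (a sketch; the names and the interrupt hookup are made up):
var tick_count: u32 = 0; // the global variable
const tick_count_v: *volatile u32 = &tick_count; // volatile view of the same memory

// called from the interrupt handler
fn onTick() void {
    tick_count_v.* +%= 1;
}

// polled from the main loop
fn ticksElapsed() u32 {
    return tick_count_v.*;
}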
I just wanted to add my point of view: I'm considering trying out Zig for embedded work, and volatile is essential in that space.
Bit out of left field, but I wonder if volatile blocks could also be used for floating point arithmetic:
volatile { // Enforces reproducible results across targets
var x: f32 = 15.3;
var y = x * x + x * 1.5 - 0x1.0001p-3; // intermediate results stored in `f32` (not `f80`)
}
In this case, the meaning of volatile is slightly broader than just "does not reorder". Instead, it enforces the "literal meaning" of the source code, preventing re-ordering, implicit in-register type promotion, and any optimizations that would do anything to change the exact value/representation of the result. Dedicated hardware instructions that perform unspecified estimation, such as fast reciprocal square root (rcpps) and transcendentals (fsin, fcos, etc.), would also be unavailable.
If it's not possible to reproduce the literal meaning of the source code for a given target, that's a compile error.
Sounded like a good idea, what changed? @andrewrk
Realistically, the only valid use case for volatile is Memory Mapped Input/Output. That is, when memory loads and stores have side effects. Normally, memory loads and stores do not have side effects.
There are some edge cases to solve with volatile, as well as this talk by J.F. Bastien about deprecating volatile in C++.
This proposal is to remove the pointer attribute and keyword, in favor of two simple builtins:
- @volatileLoad
- @volatileStore
This would be equivalent to @atomicLoad and @atomicStore except with volatile semantics rather than atomic semantics.
The upsides of this would be a simpler language, and clear rules about what volatile loads and stores mean. For example, @volatileLoad or @volatileStore would be a compile error if it would tear. However, when it is a pointer attribute, Zig has to figure out what is supposed to happen with a tear.
The downside of this is that a volatile pointer to a packed struct is less useful. In practice it could be OK to have a packed struct which maps to MMIO structs in memory, where loads/stores tear. But I believe in practice this is rare, and the builtins would suffice. However, there is C code that uses packed structs to represent MMIO, and porting it to Zig would require using the less convenient @intToPtr + @volatileLoad/@volatileStore syntax.
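For example, porting a C-style packed-register definition under this proposal might end up looking something like this (a sketch; the peripheral name, base address, offsets, and widths are all made up for illustration):
const UART_BASE = 0x40011000; // hypothetical peripheral base
fn readStatus() u32 {
    return @volatileLoad(@intToPtr(*u32, UART_BASE + 0x00));
}
fn writeData(byte: u8) void {
    @volatileStore(@intToPtr(*u8, UART_BASE + 0x04), byte);
}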