ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

allowzero pointers don't work on wasm32-freestanding in release mode. #4668

Closed puzzleddev closed 4 days ago

puzzleddev commented 4 years ago

When compiling for wasm32-freestanding, writes to allowzero pointers that are known at compile time to point to 0x0000 result in unreachable instructions.

A minimal example tested on master #80ff549e2 (latest at the time of writing) can be found here: https://gist.github.com/puzzleddev/620c2ddd28cb9c7a1f1b0f01673364ff

This really isn't important, given that there is no real reason to write to 0x0000 in WASM. WASM does, however, support such writes, so either allowzero should work there or an explicit exemption has to be made. The WASM people tried to make such an exemption a while ago, as seen here: https://github.com/WebAssembly/design/issues/204

This seems to have gone nowhere, as today the memory instruction section mentions that accessible addresses start at 0 (https://webassembly.github.io/spec/core/syntax/instructions.html#syntax-instr-memory).

LemonBoy commented 4 years ago

> result in unreachable instructions

Do you have a stack trace? Is the problem inside the compiler or in the generated binary?

puzzleddev commented 4 years ago

It generates a program with no problem; the write to 0x0000 is just optimized as if it were invalid.

According to this IR dump, Zig generates the proper instructions, so it may be LLVM that's the problem here.

fengb commented 4 years ago

https://godbolt.org/z/FQk5t3 raw wasm dump shows this:

  (func $glue_startup (type 0) (result i32)
    unreachable
    unreachable)

andrewrk commented 4 years ago

I think we have to do some annoying things with "address spaces" in LLVM to make the semantics correct. If you search http://llvm.org/docs/LangRef.html for "address space" you can find a bunch of separate things talking about this special address space 0, and how LLVM makes all kinds of assumptions about address 0 in this address space.

joshgoebel commented 2 years ago

@puzzleddev I'm seeing the same thing you are... There is generated code, but it's marked as null by LLVM, and in the compiled WASM it seems to disappear entirely:

@FRAMEBUFFER = internal unnamed_addr constant [16320 x i8]* null, align 4
@FRAMEBUFFER2 = internal unnamed_addr constant [16320 x i8]* inttoptr (i32 1 to [16320 x i8]*), align 4

...

 WhileBody:                                        ; preds = %WhileCond
   store i8 -103, i8* null, align 1
   %2 = load i32, i32* %i, align 4
   %3 = getelementptr inbounds [16320 x i8], [16320 x i8]* null, i32 0, i32 %2
   store i8 86, i8* %3, align 1
   %4 = load i32, i32* %i, align 4
   %5 = getelementptr inbounds [16320 x i8], [16320 x i8]* inttoptr (i32 1 to [16320 x i8]*), i32 0, i32 %4
   store i8 103, i8* %5, align 1

joshgoebel commented 2 years ago

Oh, wait... yes, with a single discrete pointer I get the unreachable behavior, but with an array pointer the code just gets dropped...

pub const ZERO: *allowzero u8 = @intToPtr(*allowzero u8, 0);
pub const FRAMEBUFFER: *allowzero [16320]u8 = @intToPtr(*allowzero[16320]u8, 0);
pub const FRAMEBUFFER2: * [16320]u8 = @intToPtr(*[16320]u8, 1);
    // compiles to unreachable
    tic.ZERO.* = 0x99;

    // doesn't appear in the generated WASM at all
    tic.FRAMEBUFFER.*[i]=0x56;

    // works as expected
    tic.FRAMEBUFFER2.*[i]=0x67;

This is a problem when trying to use Zig/LLVM for TIC-80, since our VRAM starts at address 0 of the memory space.

joshgoebel commented 2 years ago

I fixed my problem by adding volatile, though I'm not 100% sure what voodoo this is doing...

pub const FRAMEBUFFER: *allowzero volatile [16320]u8 = @intToPtr(*allowzero volatile [16320]u8, 0);

But the behavior is now correct and I can address the VRAM from position 0 onward. Looks like the store in the generated LLVM IR is now marked volatile:

--- before.log  2022-01-03 20:41:23.000000000 -0500
+++ after.log   2022-01-03 20:41:29.000000000 -0500
@@ -157,7 +157,7 @@
   %2 = load i32, i32* %i, align 4
   %3 = load [16320 x i8]*, [16320 x i8]** @FRAMEBUFFER, align 4
   %4 = getelementptr inbounds [16320 x i8], [16320 x i8]* %3, i32 0, i32 %2
-  store i8 86, i8* %4, align 1
+  store volatile i8 86, i8* %4, align 1
   %5 = load i32, i32* %i, align 4
   %6 = add nuw i32 %5, 1
   store i32 %6, i32* %i, align 4

That's the whole diff...
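
For anyone landing here with the same problem, here is a minimal sketch of the workaround in context. It uses the old @intToPtr builtin as in the snippets above (newer Zig releases renamed it to @ptrFromInt, so this will need adjusting there). FRAMEBUFFER_LEN and fillFramebuffer are made-up names for illustration, and the code only makes sense on a target such as wasm32-freestanding where address 0 is actually mapped:

```zig
// Sketch only: illustrative names, old-style @intToPtr as used in this thread.
const FRAMEBUFFER_LEN = 16320;

// allowzero permits the pointer to hold address 0;
// volatile marks every access through it as a side effect.
pub const FRAMEBUFFER: *allowzero volatile [FRAMEBUFFER_LEN]u8 =
    @intToPtr(*allowzero volatile [FRAMEBUFFER_LEN]u8, 0);

// Hypothetical helper: fill the whole buffer with one byte value.
export fn fillFramebuffer(value: u8) void {
    var i: usize = 0;
    while (i < FRAMEBUFFER_LEN) : (i += 1) {
        // Emitted as `store volatile` in the IR, so the write to
        // address 0 is no longer treated as invalid and removed.
        FRAMEBUFFER.*[i] = value;
    }
}
```

This only sidesteps the optimization. It does not change how LLVM treats address 0 for non-volatile pointers.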

alexrp commented 4 days ago

Closing as duplicate of #15816 which has more information on how this should be resolved.

xdBronch commented 4 days ago

this is #4668 :p, guessing you meant #15816

alexrp commented 4 days ago

Oops, yes!