emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

WebGPU Regression? #22682

Closed ypujante closed 1 month ago

ypujante commented 1 month ago

I upgraded to emscripten 3.1.68 (from 3.1.66), and my project WebGPU Shader Toy, which was working fine, now fails.

I am seeing error messages like these:

[WebGPU] Validation error | Write range (bufferOffset: 0, size: 1887248) does not fit in [Buffer "Dear ImGui Vertex buffer"] size (119040).
 - While calling [Queue].WriteBuffer([Buffer "Dear ImGui Vertex buffer"], (0 bytes), data, (1887248 bytes))

I can see from the list of PRs that there was this one: #22494 (WebGPU: support Premultiplied canvas output)

I suppose my code could be doing something wrong (but I did not change my code, just the emscripten version). Does anybody know if this PR introduced a regression?

ypujante commented 1 month ago

I just wanted to add that I built the demo (example_glfw_wgpu) that comes with the latest ImGui version (1.91.3), and I am getting the exact same kind of error, so I have basically eliminated my code/layer.

(index):41 Validation error: Write range (bufferOffset: 0, size: 592656) does not fit in [Buffer "Dear ImGui Vertex buffer"] size (115240).
(index):41  - While calling [Queue].WriteBuffer([Buffer "Dear ImGui Vertex buffer"], (0 bytes), data, (592656 bytes))
(index):41 Validation error: Write range (bufferOffset: 0, size: 547616) does not fit in [Buffer "Dear ImGui Index buffer"] size (22528).
(index):41  - While calling [Queue].WriteBuffer([Buffer "Dear ImGui Index buffer"], (0 bytes), data, (547616 bytes))
> emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.68 (ceee49d2ecdab36a3feb85a684f8e5a453dde910)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

> cd examples/example_glfw_wgpu
> make -f Makefile.emscripten serve

SenorBlanco commented 1 month ago

I came here to log a similar regression in another project. I don't have a reduced test case, but a git bisect tells me the regression was introduced here:

https://github.com/emscripten-core/emscripten/commit/d0d9966daa064eef3281495d9168a5efbef3f692

SenorBlanco commented 1 month ago

@sbc100 FYI

SenorBlanco commented 1 month ago

In my case, the error I get is:

Device error: Write range (bufferOffset: 0, size: 4193888) does not fit in [Buffer (unlabeled)] size (48).

Note that 4193888 is 0x003FFE60, which is exactly the "data" pointer passed at the C++ call site. So it's as if the formal arguments were shifted by one 32-bit slot, with "data" falling into "size".

juj commented 1 month ago

The problem is with the i64 legalization passthrough stub that is generated on the Wasm side.

In WebGPU Shader Toy, the JS side gets generated as one would expect:

[screenshot]

I added the highlighted line manually to debug/troubleshoot, and the browser prints

[screenshot]

[screenshot]

Indeed, in the above prints the large size= numbers are not the sizes that the C++ code passed; they are the values of the data pointer parameter. The arguments have shifted as mentioned before.

The offending part here is the function $legalfunc$wgpuQueueWriteBuffer that gets generated in the Wasm module.

The user code function that calls wgpuQueueWriteBuffer does so rather straightforwardly:

[screenshot]

though then there is a legalization function that stands in between:

[screenshot]

which calls the five-parameter function

void wgpuQueueWriteBuffer(WGPUQueue queue, WGPUBuffer buffer, uint64_t bufferOffset, void const * data, size_t size);

with six parameters on the JS side, splitting the 64-bit bufferOffset integer into 32-bit high and low halves. (This has been our strategy for 64-bit parameters for a long while.)
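For illustration only, with invented parameter names (this is not the generated code): once the i64 is legalized, the import effectively carries six i32 arguments, so a JS implementation written against the five-parameter C prototype binds every later argument one slot too early.

// The JS implementation is written for the C prototype (five parameters):
function wgpuQueueWriteBuffer(queueId, bufferId, bufferOffset, data, size) {
  // ...
}

// But after i64 legalization the Wasm side calls it with six i32 arguments:
//   (queueId, bufferId, offset_lo, offset_hi, data_ptr, byte_size)
// so inside the body `bufferOffset` receives offset_lo, `data` receives
// offset_hi, and `size` receives the data pointer, which is exactly why the
// reported "size" equals the C++ data pointer value.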

If I manually change the JS-side function to expect the two 32-bit high and low parts, that is:

[screenshot]

then the writeBuffer calls work as expected. The run then errors out, possibly on another instance of the same glitch:

[screenshot]

I haven't followed the development of the recent automatic 64-bit value marshalling with foo__i53abi: true (PR #19711) and the j type in foo__sig, since I manage 64-bit pointers and integer values manually in my wasm_webgpu library, wanting to be precise about exactly what code is emitted. E.g. wgpu_queue_write_buffer.
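For readers unfamiliar with the manual approach, a common pattern, sketched here with invented names rather than the actual wasm_webgpu code, is to declare the offset as a double on the C side so that no i64 ever crosses the Wasm/JS boundary:

// Hypothetical hand-written binding (names invented; not the actual wasm_webgpu
// code). Because the C declaration uses `double offset`, no i64 legalization
// stub is generated and JS receives the offset as an ordinary Number.
addToLibrary({
  my_queue_write_buffer: (queue, buffer, offset, data, size) => {
    // getQueue()/getBuffer() stand in for whatever handle-to-object lookup the
    // library uses; HEAPU8 is the Wasm heap view.
    getQueue(queue).writeBuffer(getBuffer(buffer), offset, HEAPU8, data, size);
  },
});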

Reading the ChangeLog, it seems to me that this breakup into two 32-bit parts should not have happened, as it states:

[screenshot]

This suggests that instead of generating hi/lo parts as has happened here, adding foo__i53abi: true should have performed an i64 -> Number conversion at the boundary? @sbc100 Did the internal implementation of foo__i53abi: true fail here? What has occurred does not seem to match what the ChangeLog above says (or maybe I misinterpreted the extent/conditions of the behavior it describes?)

If this hi/lo generation was expected, should the foo__i53abi: true attribute mean that Emscripten generates a matching JS-side wrapper to combine the two i32 parts? Or should the developer do that manually with the receiveI64ParamAsI53() function (perhaps #if'd behind -sWASM_BIGINT=0 only)?
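A sketch of what such a hand-rolled JS-side combination could look like when BigInt support is off (illustrative only, not what Emscripten currently generates):

// With the i64 legalized into two i32 halves, the JS function must accept both
// halves and reassemble them into a Number before use.
function wgpuQueueWriteBuffer(queue, buffer, offset_lo, offset_hi, data, size) {
  // Reassemble the unsigned 64-bit offset; exact only while it fits in 53 bits.
  var bufferOffset = (offset_hi >>> 0) * 0x100000000 + (offset_lo >>> 0);
  // ... proceed with bufferOffset, data and size as before ...
}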


Nevertheless, I tried to quickly poke at the issue by enabling BigInt support via

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 2b92b2b..e514c9a 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -103,6 +103,7 @@ target_compile_options(wgpu_shader_toy PUBLIC

 target_link_options(wgpu_shader_toy PUBLIC
     "-fwasm-exceptions"
+    "-sWASM_BIGINT"
     "${wgpu_shader_toy_options}"
     "--js-library" "${CMAKE_CURRENT_LIST_DIR}/src/js/lib_wgpu_shader_toy.js"
 )

Though that doesn't directly work either, as the JS side then genuinely receives a BigInt (as expected), and library_webgpu.js is not geared up to handle these BigInts:

[screenshot]

Manually sidestepping that via an added line

[screenshot]

then runs into another BigInt error at

[screenshot]

So more work is needed in library_webgpu.js to support the -sWASM_BIGINT build mode.
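The kind of change needed is probably small; a minimal sketch, assuming the 64-bit bufferOffset now arrives as a BigInt under -sWASM_BIGINT and that the surrounding library code is otherwise unchanged:

// Sketch only: tolerate the BigInt calling convention at the JS boundary.
// GPUQueue.writeBuffer() expects an ordinary Number for bufferOffset.
if (typeof bufferOffset == 'bigint') {
  bufferOffset = Number(bufferOffset); // lossless while the offset stays below 2^53
}
queue.writeBuffer(buffer, bufferOffset, HEAPU8, data, size);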

Hope that helps triage the issue further.

ypujante commented 1 month ago

@juj Thank you for taking the time to do a deep dive into the issue. I was fairly confident that it was not an ImGui issue, but it is pretty clear now that it is a regression in Emscripten. It definitely affects a lot of projects (I guess all projects using WebGPU...)

The workaround for now is to go back to a prior version of Emscripten, such as 3.1.66, which works fine.

juj commented 1 month ago

Err, the bug here is simply that the regressing PR https://github.com/emscripten-core/emscripten/pull/22557 in Emscripten added this if():

[screenshot]

But the problem is that library_webgpu.js is processed before library_sigs.js, where all the WebGPU function signatures live, so at the time that logic runs the signatures don't exist yet, and the sig?.includes('j') check above is always false.
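Roughly speaking (this paraphrases the effect, not the literal Emscripten source), the check behaves like this:

// `sig` is the function's registered signature string (e.g. 'vppjpp').
// Because library_webgpu.js is processed before library_sigs.js, `sig` is
// still undefined at this point, so the optional-chaining check is false and
// the i64-aware path is never taken.
if (sig?.includes('j')) {
  // generate the wrapper that converts/reassembles the 64-bit argument
} else {
  // plain passthrough: wrong for wgpuQueueWriteBuffer's uint64_t bufferOffset
}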

As a local workaround, one can copy all the wgpu signatures from library_sigs.js to library_webgpu.js:

diff --git a/src/library_webgpu.js b/src/library_webgpu.js
index e7f7ba55d..966caffad 100644
--- a/src/library_webgpu.js
+++ b/src/library_webgpu.js
@@ -1643,6 +1643,197 @@ var LibraryWebGPU = {
     });
   },

+  wgpuAdapterEnumerateFeatures__sig: 'ppp',
+  wgpuAdapterGetInfo__sig: 'vpp',
+  wgpuAdapterGetLimits__sig: 'ipp',
+  wgpuAdapterGetProperties__sig: 'vpp',
+  wgpuAdapterHasFeature__sig: 'ipi',
...
+  wgpuTextureGetDimension__sig: 'ip',
+  wgpuTextureGetFormat__sig: 'ip',
+  wgpuTextureGetHeight__sig: 'ip',
+  wgpuTextureGetMipLevelCount__sig: 'ip',
+  wgpuTextureGetSampleCount__sig: 'ip',
+  wgpuTextureGetUsage__sig: 'ip',
+  wgpuTextureGetWidth__sig: 'ip',
+  wgpuTextureReference__sig: 'vp',
+  wgpuTextureRelease__sig: 'vp',
+  wgpuTextureSetLabel__sig: 'vpp',
+  wgpuTextureViewReference__sig: 'vp',
+  wgpuTextureViewRelease__sig: 'vp',
+  wgpuTextureViewSetLabel__sig: 'vpp',

and WebGPU Shader Toy starts to work again after that. @sbc100 maybe the jsifier needs a custom library processing order so that library_sigs.js gets processed first of all? (Also, I don't remember exactly why all these signatures couldn't live in the respective library_x.js files themselves? That would be cleaner.)

juj commented 1 month ago

Nice marble shader btw :)

[screenshot]

sbc100 commented 1 month ago

(Also, I don't remember exactly why all these signatures couldn't live in the respective library_x.js files themselves? That would be cleaner.)

This is because library_sigs.js is auto-generated; it is a lot simpler to keep all the generated content separate from the hand-edited content. For third-party libraries it is still perfectly valid to include __sigs in the same file.
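A minimal sketch of that, with an invented function name (the __sig/__i53abi keys follow the convention discussed above):

// Third-party JS library keeping its metadata next to the implementation.
addToLibrary({
  my_write_buffer__sig: 'vppjpp',  // 'j' marks the 64-bit bufferOffset
  my_write_buffer__i53abi: true,   // receive that i64 as a JS Number (i53)
  my_write_buffer: (queue, buffer, bufferOffset, data, size) => {
    // ... implementation ...
  },
});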