emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.37k stars 3.25k forks source link

Signed shift on memory access in library_webgl.js when using 4GB support #20137

Open landersonfigma opened 10 months ago

landersonfigma commented 10 months ago

When using 4GB support, my understanding is that memory accesses are rewritten to use unsigned shifts. In library_webgl.js, glUniform4fv is rewritten from:

value >>= 2;
for (var i = 0; i < 4 * count; i += 4) {
  var dst = value + i;
  view[i] = heap[dst];
  view[i + 1] = heap[dst + 1];
  view[i + 2] = heap[dst + 2];
  view[i + 3] = heap[dst + 3];
}

to:

value >>= 2;
for (var i = 0; i < 4 * count; i += 4) {
  var dst = value + i;
  view[i] = heap[dst >>> 0];
  view[i + 1] = heap[dst + 1 >>> 0];
  view[i + 2] = heap[dst + 2 >>> 0];
  view[i + 3] = heap[dst + 3 >>> 0];
}

The memory accesses inside the loop look correct but should value >>= 2 be value >>>= 2? With value >>= 2, I get memory accesses that read out of bounds. Possibly the same issue here. Let me know if I'm misunderstanding the expected output.

sbc100 commented 10 months ago

My guess is that the auto-rewriter is that not that smart and can only reason about expression inside of the HEAPXX[...] accessors.

The tools is here by the way: https://github.com/emscripten-core/emscripten/blob/0a6f54b1276481fab90e474f2c4fd35d00c5fa6b/tools/acorn-optimizer.js#L1235-L1238

@kripken is that right? I don't suppose there is any way to know that value >>= 2 happens to be a pointer value?

In general I think the solution here is to use value = {{{ getHeapOffset('offset', 'i32') }}}.

kripken commented 10 months ago

Yeah, the pass has a little magic but that's it:

https://github.com/emscripten-core/emscripten/blob/0a6f54b1276481fab90e474f2c4fd35d00c5fa6b/tools/acorn-optimizer.js#L1239-L1244

Aside from the standard heap names + those two there, it doesn't do any guessing.

Renaming view to heap would enable the magic, so that is an option to fix this. And more generally there is getHeapOffset as mentioned.