emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.84k stars 3.31k forks source link

Improved heap growth check in pthreads with memory growth enabled? #12783

Open juj opened 4 years ago

juj commented 4 years ago

Views to WebAssembly.Memory seem to have this peculiar behavior, that if one .grow()s a memory, all the existing views in the same thread do become zero-length, but all the existing views to the Memory in other threads preserve their unresized lengths. Is that something that is mandated by the Wasm/SAB spec? (CC @lars-t-hansen)

If so, it seems that we could do much better with the whole "need to check in JS if heap was resized before a heap access" scheme.

For example, if a function performs several heap accesses, like

  $writeI53ToI64: function(ptr, num) {
    HEAPU32[ptr>>2] = num;
    HEAPU32[ptr+4>>2] = (num - HEAPU32[ptr>>2])/4294967296;
  },

We would not need to replace all the accesses with GROWABLE_HEAP_U32() checks, just doing the check once in the function start would be enough.

Actually, in many cases, we would not need to do the check in the function writeI53ToI64() at all, but we could lift it to the calling function to have to do the check. That would be beneficial if the calling function does many writeI53ToI64() calls in a row.

Applying that logic transitively, we would need to only emit checks to a) in the beginning of each JS function that is an import to Wasm that does a heap access, and b) right after calling any Wasm export function from a JS function that does a heap access (e.g. _malloc())

Would this be something we'd be apply to the acorn optimizer pass? The tricky part is that we would have to transitively analyse program flow to find which functions do perform heap accesses.

Alternatively, we could apply annotation to JS function imports, like e.g.

  glCreateProgram__heapAccess: 'none',
  glCreateProgram: function() {
    var id = GL.getNewId(GL.programs);
    var program = GLctx.createProgram();
    program.name = id;
    GL.programs[id] = program;
    return id;
  }

where maybe we could support none (developer will manually insert any heap growth checks in the JS code if needed), or entry (add a heap check in the beginning of the function), or all (add a heap check to every heap access).

With either such an annotation scheme, or automatically doing control flow analysis, it seems that the performance impact of needing to do manual heap growth checks could be reduced to be negligible?

lars-t-hansen commented 4 years ago

I need to look at the spec + code to be sure, but that's really strange behavior. IIRC, existing views in the same thread should retain the original length. (Since you say "other threads" I assume we're only talking about shared memories here.)

juj commented 4 years ago

Hmm, thanks for clarifying. That prompted me to retry the behavior on the same thread again, and I do see the view staying the same:

test

Maybe I botched something up in the first test..

But good to hear that retaining the view to the unresized length is guaranteed, so we should be able to greatly improve the codegen for heap resizes. @kripken what is your thought?

kripken commented 4 years ago

It might be even simpler to instrument the wasm imports and exports to do this check once on each entrance/exit from the wasm. We have a few other such instrumentations, see e.g. library_async.js.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.