2D context zero copy from WebAssembly

penzn commented 4 years ago

Currently interfacing between Canvas and WebAssembly is done this way:

To Wasm
- get a CanvasRenderingContext2D from the Canvas
- get ImageData from the CanvasRenderingContext2D
- copy the array into Wasm memory
From Wasm
- get a CanvasRenderingContext2D from the Canvas
- create ImageData from an array slice of Wasm memory
- set the ImageData on the CanvasRenderingContext2D
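
The two directions above can be sketched roughly as follows. This is only an illustration, not code from the issue: the helper names copyPixelsToWasm and copyPixelsFromWasm are hypothetical, and a plain Uint8ClampedArray stands in for ImageData.data so that the snippet also runs outside a browser.

```javascript
// Hypothetical helpers; a Uint8ClampedArray stands in for ImageData.data.

// "To Wasm": copy RGBA bytes into Wasm linear memory at byteOffset.
function copyPixelsToWasm(pixels, memory, byteOffset) {
  new Uint8ClampedArray(memory.buffer, byteOffset, pixels.length).set(pixels);
}

// "From Wasm": copy a w*h RGBA region out of Wasm linear memory.
function copyPixelsFromWasm(memory, byteOffset, w, h) {
  // slice() allocates a fresh array, which is the per-call cost discussed here.
  return new Uint8ClampedArray(memory.buffer, byteOffset, w * h * 4).slice();
}

const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 255, 0, 255]); // 2x1 image
copyPixelsToWasm(pixels, memory, 16);
const roundTrip = copyPixelsFromWasm(memory, 16, 2, 1);
```

In a browser, pixels would come from ctx.getImageData(0, 0, w, h).data, and the array returned by copyPixelsFromWasm would be wrapped with new ImageData(roundTrip, w, h) and passed to ctx.putImageData. Each direction costs at least one allocation and one copy.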

There is a WebAssembly proposal for multiple memories, which would allow WebAssembly modules to access more than one memory object. If CanvasRenderingContext2D or Canvas exported a Wasm Memory object, much of the copying and instantiation described above could be eliminated.

WebAssembly/multi-memory#7

annevk commented 4 years ago

It seems more natural to me to offer a getImageData() variant that takes a view into which bytes can be written. At least thus far we've avoided a dependency on Memory for equivalent scenarios.

penzn commented 4 years ago

Can you please elaborate on "a view into which bytes can be written"? Would that be on the Canvas side? Memory is a byte array at its core and is accessed in a deterministic way.

JSmith01 commented 2 years ago

Is there any progress on letting getImageData() use a caller-provided buffer instead of allocating a new one? This could drastically reduce GC pressure, especially when combined with SharedArrayBuffer. It would be nice if it supported something like getImageData(sx, sy, sw, sh, buffer), with one more exception thrown when the provided buffer does not have enough space for the frame.
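
To make the suggested semantics concrete, here is a rough model of such a call. It is an assumption-laden sketch, not a real canvas API: getImageDataInto is a hypothetical function, the surface is a plain { width, height, data } object rather than a canvas, the rectangle is assumed to lie inside the surface, and RangeError is just one possible choice for the "buffer too small" exception.

```javascript
// Hypothetical model of getImageData(sx, sy, sw, sh, buffer); not a real canvas API.
// `surface` is a plain { width, height, data } object holding RGBA bytes.
// Assumes the requested rectangle lies entirely inside the surface.
function getImageDataInto(surface, sx, sy, sw, sh, buffer) {
  const needed = sw * sh * 4;
  if (buffer.length < needed) {
    throw new RangeError(`buffer holds ${buffer.length} bytes, ${needed} needed`);
  }
  for (let row = 0; row < sh; row++) {
    const start = ((sy + row) * surface.width + sx) * 4;
    // Copy one row of the requested rectangle into the caller's buffer.
    buffer.set(surface.data.subarray(start, start + sw * 4), row * sw * 4);
  }
  return buffer;
}

const surface = { width: 4, height: 4, data: new Uint8ClampedArray(4 * 4 * 4) };
surface.data.fill(7);
const frame = new Uint8ClampedArray(2 * 2 * 4); // allocated once, reused per frame
getImageDataInto(surface, 1, 1, 2, 2, frame);
```

The point of the design is that frame is allocated once and reused on every call, so no new ImageData object is created for the garbage collector to reclaim.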

domenic commented 2 years ago

/cc @whatwg/canvas . This seems like a pretty easy win if someone wants to take the time to do spec/tests/implementation.

junov commented 2 years ago

IMHO, it would be best to map the canvas's backing store to a buffer at context creation time. Otherwise it's hard for the browser to know from the start that it should use VM-accessible memory. Also this does not need to be tied to WASM. We could do something that also works well in JS.

Something like:

let buffer = new Uint8ClampedArray(w * h * 4);  // Any type of ArrayBuffer or ArrayBufferView should work
canvas.width = w;
canvas.height = h;
let ctx = canvas.getContext('2d', {storage: buffer});  // Throws if buffer is too small.

When the context is created in this way, the browser would know to stay away from asynchronous rendering to prevent synchronization issues between canvas commands and direct buffer access.

I would also suggest that the size of the canvas should be immutable after context creation.

WDYT?

junov commented 2 years ago

To kill two birds with one stone, the 'storage' context creation attribute could also point to a WebGL texture, for zero-copy render-to-texture. Anyway, that's a whole other feature, so let's not get too deep into that here.

JSmith01 commented 2 years ago

I suspect the storage-buffer approach would require much more effort to implement (at least for Chromium). It would also mean the canvas context behaves slightly differently, and it might hurt performance when the developer only rarely needs to access the frame buffer. So the idea is nice, but it is rather different from reusing a buffer in getImageData (which is not only about WASM; it could also be used from plain JS).

kdashg commented 2 years ago

These are very cool ideas! Indeed, this is what we end up doing in our internal CPU-side APIs for these same perf reasons! The lowest-hanging fruit here is adding a getImageData(..., buffer, offset) I think! That would give us one-copy downloads from canvas2d.

kenrussell commented 2 years ago

Could some more use cases be described?

Allowing getImageData to specify a destination buffer seems like a good direction to take regardless - it would eliminate an allocation every time the application wants to fetch the canvas's backing store.

Is it strongly desired to be able to write to the canvas's backing store from WebAssembly with zero copies? CanvasRenderingContext2D implementations are preferentially GPU-accelerated in multiple browsers nowadays, so specifying a backing store region would imply fallback to CPU rasterization. There would also be a necessary copy by the browser's compositor on the way to the screen, to avoid displaying incomplete rendering results. Is putImageData and the copy it implies too costly for some applications?

kdashg commented 2 years ago

There's definitely a desire for cpu-rasterized canvas work with zero-copy. We use something similar internally in Gecko fairly frequently. GPU-accel is mostly needed for screen-sized buffers, but smaller parts are often better suited to CPU.

junov commented 2 years ago

The main issue I see with the getImageData approach is that in Chromium we won't be able to implement a zero-copy version of it. By default, 2D canvas rasterization happens asynchronously, usually on the GPU, and soon it will even be out-of-process. We maintain a software rendering option that can be activated with canvas.getContext('2d', {willReadFrequently: true}). The 'willReadFrequently' attribute is a hint to the browser that it should optimize for readbacks (i.e. getImageData), and since GPU readbacks are painfully slow, this implies rendering on the CPU. However, even with synchronous in-thread CPU-based rendering, we cannot provide a zero-copy implementation of getImageData, because the backing store is in memory that is not accessible to script or WASM, so a copy is still required.

Providing a storage buffer at context creation time would allow for a true zero-copy implementation, but as Ken stated, we'd still need to make copies for display purposes. These copies would happen once per animation frame, only if the canvas contents have changed, and only if the canvas is visible on screen.

The big advantage I see with the storage buffer approach is that it simultaneously provides zero-copy read and zero-copy write access. User code could seamlessly interleave 2D canvas API calls with direct pixel reads and writes, without any need to ever call getImageData or putImageData (you just access the buffer directly). There would be less API call overhead, and no buffer copying until all drawing is done and it is time to display.
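
The interleaving of drawing calls with direct pixel access can be illustrated with a toy CPU rasterizer that draws straight into caller-provided storage. Everything here is hypothetical: BufferBackedSurface and its fillRect are stand-ins written for this sketch, not the proposed 'storage' attribute or any browser API, and the storage is placed in a WebAssembly.Memory only to show that the same bytes would be visible to a Wasm module.

```javascript
// Toy CPU "context" that draws straight into caller-provided storage.
// The class and its methods are hypothetical; this is not a canvas API.
class BufferBackedSurface {
  constructor(width, height, storage) {
    if (storage.length < width * height * 4) {
      throw new RangeError('storage is too small for width * height * 4 bytes');
    }
    this.width = width;
    this.height = height;
    this.storage = storage; // shared with the caller, never copied
  }

  // Fill a rectangle with an [r, g, b, a] color, clipped to the surface.
  fillRect(x, y, w, h, rgba) {
    const x1 = Math.min(x + w, this.width);
    const y1 = Math.min(y + h, this.height);
    for (let py = Math.max(y, 0); py < y1; py++) {
      for (let px = Math.max(x, 0); px < x1; px++) {
        this.storage.set(rgba, (py * this.width + px) * 4);
      }
    }
  }
}

const wasmMemory = new WebAssembly.Memory({ initial: 1 });
const storage = new Uint8ClampedArray(wasmMemory.buffer, 0, 8 * 8 * 4);
const toy = new BufferBackedSurface(8, 8, storage);

toy.fillRect(0, 0, 8, 8, [0, 0, 255, 255]);      // drawing call
storage[0] = 255;                                 // direct pixel write, no putImageData
toy.fillRect(4, 4, 2, 2, [0, 255, 0, 255]);      // another drawing call
const directRead = storage[(4 * 8 + 4) * 4 + 1]; // direct read, no getImageData
```

Because the drawing code and the caller share one buffer, no pixel data is copied between the calls; in the real proposal the only remaining copy would be the one made for display.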

If you only intend to call getImageData once per animation frame, then the two proposals (getImageData vs. script-provided storage) are equivalent in terms of the number of buffer copies. But as soon as you need to make more than one call to either getImageData or putImageData, the storage buffer approach wins in terms of reducing the number of buffer copies. Also, if the canvas is not displayed (e.g. used for background rendering), the storage buffer approach wins.
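
The comparison above can be written down as a small copy-count model. This is a deliberate simplification for illustration: the function names are made up, and it assumes exactly one buffer copy per getImageData or putImageData call and exactly one display copy per visible, changed frame for the storage approach.

```javascript
// Simplified copy-count model of the comparison above; the function names
// and the one-copy-per-call assumption are illustrative only.

// getImageData(..., buffer) / putImageData: one buffer copy per call.
function copiesWithImageDataCalls(callsPerFrame) {
  return callsPerFrame;
}

// Script-provided storage: pixel access needs no copy; the only copy is the
// per-frame display copy, made when the canvas is visible and has changed.
function copiesWithStorageBuffer(visible, changed) {
  return visible && changed ? 1 : 0;
}

// Each pair is [getImageData approach, storage buffer approach] for one frame.
const oneCallVisible = [copiesWithImageDataCalls(1), copiesWithStorageBuffer(true, true)];
const threeCallsVisible = [copiesWithImageDataCalls(3), copiesWithStorageBuffer(true, true)];
const oneCallOffscreen = [copiesWithImageDataCalls(1), copiesWithStorageBuffer(false, true)];
```

Under these assumptions the two approaches tie at one call per visible frame, and the storage buffer comes out ahead with more calls per frame or when the canvas is not displayed.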

Also worth noting, both of these approaches avoid creating large temporary objects that would need to be garbage collected (compared to the current API that creates ImageData objects). That's nice.

JSmith01 commented 2 years ago

For JS apps I think it would be OK to have a one-copy getImageData API. It solves only one problem: unnecessary memory allocation on every call (and, as a consequence, a higher load on the GC). It would be nice to have some kind of zero-copy API at some point, but as @junov stated, it might be very tricky even to design (it would be memory shared between browser internals and JS/WASM, which I find quite dangerous).

projektorius96 commented 1 year ago

Could this be achieved by starting from scratch, i.e. rewriting the specification so that it can later be used properly with WASM? I am thinking of Go + Skia (bindings are my issue) or C + Skia directly.

whatwg / html

2D context zero copy from WebAssembly #5173