stdout not suitable for heavy data streaming

jozefchutka commented 2 years ago

I would like to process GBs of data streamed by a wasm app (stdout, stderr, print, printErr) and consumed by JS. Preferably, the JS app receives a whole ArrayBuffer/bytes from stdout with each callback, so there is no overhead from loops, conversions, etc. However, Emscripten doesn't seem to be designed for such a flow, especially with performance in mind. Below is a little research I did on the current state of Emscripten 3.1.8.

Module.print/printErr

stdout -> memory (ArrayBuffer) -> read by byte in loop -> UTF8ArrayToString -> print:callback(string)

From what I observed, wasm stdout is first stored in memory, iterated byte by byte, converted to a UTF-8 string, and finally received: the callback eventually gets a whole line as a string. This is too much overhead for my case, not to mention that I am expecting bytes.
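Beyond the overhead, a string round trip is not byte-safe for binary output. The sketch below uses the standard TextDecoder/TextEncoder as a stand-in for the UTF-8 conversion in this path (an assumption; the exact behavior of UTF8ArrayToString on invalid sequences may differ) and the sample bytes are made up:

```javascript
// Four bytes a program might write to stdout; 0xff and 0xfe are not valid UTF-8.
const original = new Uint8Array([0x41, 0xff, 0xfe, 0x42]);

// Stand-in for the bytes -> string step of the print path.
const asString = new TextDecoder("utf-8").decode(original);

// What a consumer would have to do to get bytes back from the string.
const roundTripped = new TextEncoder().encode(asString);

// Each invalid byte became U+FFFD, which encodes to three bytes,
// so the result no longer matches the original data.
const lossless =
  roundTripped.length === original.length &&
  roundTripped.every((b, i) => b === original[i]);

console.log(original.length, roundTripped.length, lossless);
```

So even ignoring speed, print/printErr only suit output that is known to be text.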

Module.stdout/stderr

stdout -> memory (ArrayBuffer) -> read by byte in loop -> stdout:callback(singleByte)

From what I observed, wasm stdout is first stored in memory and iterated byte by byte, and the callback eventually receives a single byte at a time. This is still a lot of overhead, especially since I need to merge the bytes back into an ArrayBuffer before processing them.
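To make the re-assembly cost concrete, here is a minimal sketch of what the consumer ends up writing. The `createByteCollector` helper and the chunk size are hypothetical, and the loop at the bottom only simulates the per-byte calls; in a real build `collector.push` would be what gets registered as the stdout callback, which as far as I know may also be called with null to flush:

```javascript
// Hypothetical helper: gathers single bytes into Uint8Array chunks.
function createByteCollector(chunkSize, onChunk) {
  const chunk = new Uint8Array(chunkSize);
  let used = 0;

  function flush() {
    if (used === 0) return;
    onChunk(chunk.slice(0, used)); // copy out only the filled part
    used = 0;
  }

  function push(byte) {
    if (byte === null) { // treated as a flush request
      flush();
      return;
    }
    chunk[used++] = byte & 0xff; // normalize signed values to 0..255
    if (used === chunkSize) flush();
  }

  return { push, flush };
}

// Simulated driver standing in for the wasm program's writes.
const received = [];
const collector = createByteCollector(4, (bytes) => received.push(bytes));
for (const b of [104, 101, 108, 108, 111, 10]) collector.push(b);
collector.flush();
```

Every byte costs one JS function call plus one store, and the data is copied again when a chunk is handed over.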

Customized/Hacky Module.stdout/stderr

stdout -> memory (ArrayBuffer) -> stdout:callback(arrayBuffer slice)

By applying the modification from https://github.com/emscripten-core/emscripten/issues/16108#issuecomment-1100349079, a whole slice of the ArrayBuffer can be delivered at once.

This is as far as I managed to get in terms of performance optimization. I would like to propose the following:

Provide a new API so that one can attach a callback directly to the wasm stdout, so that writing data into memory is avoided completely. This would benefit performance and would also avoid consuming dedicated memory. I am not sure how suitable this is with threads etc.
Or, extend the current API (maybe through mounting a custom stdout?) so that the stdout callback works with an ArrayBuffer slice instead of a single byte per callback. This would improve performance and remove the need to post-process the generated JS file with a hack.

sbc100 commented 2 years ago

Indeed I don't think emscripten's stdin/stdout system was ever designed for anything other than simple debugging/logging.

If you want to do real work over stdin/stdout then some refactoring would be needed. However, I would suggest that if you want to shift large amounts of data into or out of emscripten you avoid stdio and libc file handles completely and use one of the many other ways to communicate between JS and native code. See https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html.

One could argue that stdin/stdout streaming is not the right model for passing large amounts of data between JS and wasm, since JS has direct access to the heap and stdio streams are designed for streaming data into and out of the address space. Having said that, if we can come up with unintrusive ways to make stdio streams work better for your purposes I don't see why we couldn't land them.

jozefchutka commented 2 years ago

Hi @sbc100 , thanks for the ideas. I will look into these.

What do you think of having stdout passing data directly into a JS callback without heap memory being involved at all?

sbc100 commented 2 years ago

> Hi @sbc100 , thanks for the ideas. I will look into these.
>
> What do you think of having stdout passing data directly into a JS callback without heap memory being involved at all?

The only data that C/C++ can access is the heap memory (and function arguments, but they can't be used to transfer anything but basic integers/floats), so the heap will always be involved. The best you can do is try to avoid copying by having the JS code directly access the heap data via typed-array views (e.g. the default HEAP32, HEAP16, HEAP8 views that emscripten provides).
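As a rough illustration of the difference between viewing and copying heap data, here is a sketch that uses a plain ArrayBuffer as a stand-in for the wasm memory. The `heap` variable plays the role of a HEAPU8-style view, and `ptr` and `len` are made-up values that native code would normally hand over:

```javascript
// Stand-in for the wasm linear memory and a byte view over it.
const memory = new ArrayBuffer(64);
const heap = new Uint8Array(memory);

// Pretend native code wrote 5 bytes at address 16 (hypothetical ptr and len).
const ptr = 16;
const len = 5;
heap.set([104, 101, 108, 108, 111], ptr);

// subarray() creates a new view on the same memory: no bytes are copied.
const view = heap.subarray(ptr, ptr + len);

// slice() allocates a new ArrayBuffer and copies the bytes into it.
const copy = heap.slice(ptr, ptr + len);

// If native code later overwrites the region, only the view sees it.
heap[ptr] = 72;
console.log(view[0], copy[0]);
```

The view is only safe to use while the native side leaves that region alone, and views can become stale if the memory is allowed to grow, so anything that must outlive the call usually still ends up being copied.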

jozefchutka commented 2 years ago

Hm, if the heap needs to be involved, I think what is described in the "Customized/Hacky Module.stdout/stderr" section is as fast as it can be. Please consider adding such an API to emscripten.

sbc100 commented 2 years ago

The fastest way is to avoid the copying completely and have your JS code access the heap directly.

Anything that uses stdio is most likely going to involve copying. For example, there is buffering in libc if you use fread and fwrite, and the lower-level read/write APIs are not designed for the zero-copy/heap-sharing data transfer that you want with JS/native interactions.

If you really want to use stdout/write to move data to JS efficiently and you have ideas for how to speed it up, feel free to propose a change.

jozefchutka commented 2 years ago

You are right. I cannot think of any other/faster way than direct access to the heap.

My API proposal is an alternative stderr/stdout callback which would provide access to the heap + pointers, so the dev can decide what to do with them (most likely copy). Something like the mentioned hack.

Do you think there is a chance for implementation? Where is the right place to propose such API change?

sbc100 commented 2 years ago

My inclination is still to say that write/fwrite to stdout is not the best API for communicating large amounts of data to JavaScript. I would suggest it would be easier just to write your own API. There are many options: https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html.

If you really want to push forward with using stdout then you can propose a new API for intercepting the output and we can discuss more there perhaps?
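For illustration only, the JS side of such a custom API could look roughly like the sketch below. Everything here is mocked and hypothetical: `Module` is a plain object with a HEAPU8-like array rather than a real emscripten module, `onData(ptr, len)` is an invented hook name, and `simulateNativeWrite` stands in for native code that fills a buffer and then calls into JS:

```javascript
// Mock of the pieces of a module this sketch relies on (not a real build).
const Module = { HEAPU8: new Uint8Array(1024) };

const chunks = [];

// Hypothetical hook: native code would call this with a pointer and a length.
Module.onData = (ptr, len) => {
  // Copy, because the native buffer may be reused after the call returns.
  chunks.push(Module.HEAPU8.slice(ptr, ptr + len));
};

// Stand-in for the native side: write bytes into the heap, then notify JS.
function simulateNativeWrite(ptr, text) {
  const bytes = new TextEncoder().encode(text);
  Module.HEAPU8.set(bytes, ptr);
  Module.onData(ptr, bytes.length);
}

simulateNativeWrite(128, "hello ");
simulateNativeWrite(128, "there"); // same buffer reused on purpose

// Join the received chunks into one contiguous Uint8Array.
const total = chunks.reduce((n, c) => n + c.length, 0);
const joined = new Uint8Array(total);
let offset = 0;
for (const c of chunks) {
  joined.set(c, offset);
  offset += c.length;
}
const text = new TextDecoder().decode(joined);
```

The point of the design is one JS call per chunk instead of one per byte, with the consumer deciding whether to copy or to process the heap view in place.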

jozefchutka commented 1 year ago

Related to this topic: with a similar solution ( https://github.com/emscripten-core/emscripten/issues/16108#issuecomment-1100349079 ) I managed to create and use "named pipes". Just in case someone is interested.

In the generated myModule.js, I replaced

try{for(var i=0;i<length;i++){stream.tty.ops.put_char(stream.tty,buffer[offset+i])}}

by:

try{if(Module.onTTY && Module.onTTY(stream,buffer,offset,length,pos))var i=length;else for(var i=0;i<length;i++){stream.tty.ops.put_char(stream.tty,buffer[offset+i])}}
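Since the patched line is minified, here is a readable restatement of the same control flow, with mock objects so it can run on its own. The `ttyWrite` wrapper, its return value, and the mocks are hypothetical; only the hook check and the per-byte fallback mirror the replacement above:

```javascript
// Readable restatement of the patched loop, wrapped in a hypothetical function.
function ttyWrite(mod, stream, buffer, offset, length, pos) {
  // If the hook exists and returns true, it consumed the whole slice.
  if (mod.onTTY && mod.onTTY(stream, buffer, offset, length, pos)) {
    return length;
  }
  // Otherwise fall back to the original per-byte path.
  for (let i = 0; i < length; i++) {
    stream.tty.ops.put_char(stream.tty, buffer[offset + i]);
  }
  return length;
}

// Mock tty and hook, standing in for the objects emscripten would pass.
const perByte = [];
const tty = { ops: { put_char: (t, c) => perByte.push(c) } };
const slices = [];
const mod = {
  onTTY: (stream, buffer, offset, length) => {
    if (stream.path !== "mypipe") return false;
    slices.push(buffer.slice(offset, offset + length));
    return true;
  },
};

const heap = new Int8Array([104, 105, 33, 10]);
ttyWrite(mod, { path: "/dev/tty", tty }, heap, 0, 2, 0); // falls back per byte
ttyWrite(mod, { path: "mypipe", tty }, heap, 0, 4, 0);   // hook takes the slice
```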

Now I can create and use pipes:

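const module = await createMyModule({
    // Called by the patched write loop; return true to consume the whole slice.
    onTTY:(stream,buffer,offset,length,pos) => {
        if(stream.path === "mypipe") {
            // slice() copies the range, so .buffer is an ArrayBuffer of exactly `length` bytes
            const data = buffer.slice(offset, offset + length).buffer;
            console.log(data); // bytes of "hello there"
            return true;
        }
        // Not our pipe: fall back to the default per-byte handling.
        return false;
    }
});

// Create a character device node that reuses the mode and device id of /dev/tty.
const tty = module.FS.stat("/dev/tty");
module.FS.mknod("mypipe", tty.mode, tty.rdev);
module.callMain(...); // program that writes into "mypipe" file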

Interestingly, stdout and custom pipes (like mypipe) need to be intercepted in different places in myModule.js.

It would be nice if emscripten could provide an elegant built-in API for such interceptions.

If there is already a better solution that I am missing, please share.
