emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

emrun stdio performance #15839

Open arsnyder16 opened 2 years ago

arsnyder16 commented 2 years ago

I am compiling a legacy native library that also includes a large number of gtests. When the gtests run under emrun they write a lot to stdio, and this is a performance bottleneck.

The bottleneck is created by emrun writing stdio to 3 places: the DOM, the console, and an HTTP request posted to the server.

Even eliminating the DOM writes doesn't help, because the writes can come so quickly that Chrome will error because of so many requests being created.

In my example, running the gtest wasm with --gtest_list_tests, which does not run anything but only lists the names of all the tests in the executable, takes ~3 minutes with default emrun behavior.

By adding a simple --extern-post-js I am able to make this take 1-2 seconds.

I am not sure whether others have run into the stdio performance problem, but it could be a useful feature to allow users to specify a stdio buffer limit.

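// Wrap an output callback so lines are buffered and delivered in batches.
var process = cb => {
   var lines = [];
   var limit = 1000; // flush once more than this many lines are buffered
   return (text, flush) => {
      lines.push(text);
      if (flush || lines.length > limit) {
         cb(lines.join('\n'));
         lines = [];
      }
   };
};
out = process(out);
err = process(err);
// Flush whatever is still buffered when the runtime exits.
addOnExit(() => {
   out('', true);
   err('', true);
});
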
sbc100 commented 2 years ago

I've recently been thinking it would be nice to have the stdout and stderr of our browser tests proxied back to the test harness (currently you need to look in the browser console to see the stdout/stderr).

In general, where possible, I would strongly advise running your unit tests under node. For example, we run a fair number of the emscripten tests under node on the command line, avoiding the browser completely.

However, it would be good to do better here.

When you run --gtest_list_tests and it takes 3 minutes (!), is that because you have thousands of tests to list? (i.e. how many lines of stdout are being generated exactly?) From your description it sounds like it's really the proxying back to the server via HTTP requests that is the issue? Perhaps we could/should address that one in particular, while still allowing realtime console.log and HTML DOM updates?

arsnyder16 commented 2 years ago

--gtest_list_tests in my case produces ~4100 lines of output (241 KB of data). I prefer to run my tests in the browser to replicate the production environment. The HTTP request is certainly the biggest bottleneck; keep in mind that with lots and lots of concurrent requests Chrome will only send 6 at once. The console/DOM processing does complete first, but it still takes ~2-2.5 minutes to finish, and then the server lags another ~30 seconds until everything is sent.

If I eliminate the DOM/console writing, the process takes <5 seconds to complete, but with so many requests queued up that fast, Chrome automatically errors a lot of them and doesn't send them through.

So the DOM/console behavior actually slows the process down enough to let all requests through. I think the main bottleneck is the DOM updating, because the page forces a DOM and scroll update on every line written:

element.value += text + "\n";
element.scrollTop = element.scrollHeight; // focus on bottom
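
One way to avoid a DOM and scroll update per line is to collect lines and apply them in a single update per scheduled tick. The following is only a minimal sketch, not code from emscripten: `makeBatchedWriter` and the `schedule` parameter are hypothetical names, and the element id in the usage comment is an assumption.

```javascript
// Hypothetical helper (not part of emscripten): collects lines and writes
// them to the element once per scheduled tick instead of once per line.
function makeBatchedWriter(element, schedule) {
  var pending = [];
  var scheduled = false;

  function flush() {
    scheduled = false;
    if (pending.length === 0) return;
    // One value update and one scroll update for the whole batch.
    element.value += pending.join('\n') + '\n';
    element.scrollTop = element.scrollHeight; // focus on bottom
    pending = [];
  }

  return function (text) {
    pending.push(text);
    if (!scheduled) {
      scheduled = true;
      schedule(flush);
    }
  };
}

// Possible browser wiring (the element id 'output' is an assumption):
// var write = makeBatchedWriter(
//   document.getElementById('output'),
//   function (cb) { requestAnimationFrame(cb); });
```

Passing the scheduler in as a function keeps the helper independent of `requestAnimationFrame`, so a `setTimeout`-based scheduler could be substituted where that fits better.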

In my case I am running headless Chrome on a CI machine, so I couldn't care less about console/DOM updates, but that is obviously specific to my case.

sbc100 commented 2 years ago

Presumably it's up to you whether you want to display anything in the DOM? I guess you are using the default shell.html, which has a little console display? Would supplying a simpler/smaller/headless HTML file via --shell-file solve at least part of this issue?

Regarding the HTTP requests, it does sound like we need some kind of buffer option for emrun.
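
One possible shape for such a buffer keeps at most one request in flight and coalesces every line written in the meantime into the next request, so the browser never has a large queue of pending requests. This is a sketch under stated assumptions, not emrun's implementation: `makeCoalescingSender` is a hypothetical name, and `send(body, done)` stands in for whatever performs a single HTTP POST and calls `done` when it finishes.

```javascript
// Hypothetical helper (not part of emrun). `send(body, done)` is assumed to
// perform one HTTP POST and call `done` when it completes or fails.
function makeCoalescingSender(send) {
  var pending = [];
  var inFlight = false;

  function pump() {
    if (inFlight || pending.length === 0) return;
    inFlight = true;
    // Everything buffered so far goes out as a single request body.
    var body = pending.join('\n');
    pending = [];
    send(body, function () {
      inFlight = false;
      pump(); // send whatever accumulated while the request was in flight
    });
  }

  return function (text) {
    pending.push(text);
    pump();
  };
}
```

With this shape the batch size adapts to how fast the server responds, rather than depending on a fixed line limit.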

sbc100 commented 2 years ago

If it's useful enough, perhaps we could include a non-logging, non-displaying headless shell.html in emscripten itself? I imagine it could be just a couple of lines. Or do your tests rely on access to a canvas?

(Regarding testing strategy, I agree it's good to run tests in the browser, but it's also a lot slower and more flaky (in our experience anyway), so for development and fast iteration I think it's good to have some tests that can run under node too (at least some smoke tests that you can run in your tight development cycle). But this is not something that we want to dictate, of course.)

arsnyder16 commented 2 years ago

I think everything you proposed here sounds great; with a few options, things can be mixed and matched. One problem the buffer does create is that it interferes with silence_timeout, but IMO it would make more sense to make that behavior unrelated to stdio and instead have the JS use a setInterval that periodically pings the server, maybe every (silence_timeout / 2). Another option would be to have the setInterval flush the io buffer too.
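
The second option, where the interval both flushes the buffer and acts as the ping, might look roughly like the sketch below. `makeTimedBuffer` is a hypothetical name, `cb` is whatever delivers text to the server, and sending an empty message as a heartbeat is an assumption about what the server would accept.

```javascript
// Hypothetical helper (not part of emrun): buffers lines and hands them to
// `cb` whenever flush() is called, typically from a timer.
function makeTimedBuffer(cb) {
  var lines = [];
  return {
    write: function (text) {
      lines.push(text);
    },
    flush: function () {
      // Call cb even when nothing is buffered, so the flush doubles as a
      // heartbeat (assumption: the server treats an empty message as a ping).
      cb(lines.join('\n'));
      lines = [];
    }
  };
}

// Possible wiring, where silenceTimeoutSeconds is a hypothetical variable
// holding the silence_timeout value given to emrun:
// var buf = makeTimedBuffer(out);
// setInterval(buf.flush, (silenceTimeoutSeconds / 2) * 1000);
```

Flushing at half the timeout means that at least one message reaches the server within every timeout window, even if the program itself prints nothing.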

arsnyder16 commented 2 years ago

Also keep in mind that depending on what features of WebAssembly are being used node may not be an option since it depends on feature support level (exception handling etc)

sbc100 commented 2 years ago

What about streaming over a websocket instead of making repeated requests? I wonder if that would be performant enough?
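
As a rough illustration of the websocket idea, the sketch below sends every line over one long-lived connection instead of one request per line. It assumes a server-side websocket endpoint that emrun does not currently provide; `makeSocketWriter`, the `/stdio` path and the JSON message shape are all hypothetical.

```javascript
// Hypothetical helper: streams stdio lines over a single WebSocket-like
// object. Lines written before the connection opens are held in a backlog.
function makeSocketWriter(socket) {
  var OPEN = 1; // same value as WebSocket.OPEN
  var backlog = [];

  socket.addEventListener('open', function () {
    backlog.forEach(function (msg) { socket.send(msg); });
    backlog = [];
  });

  return function (stream, text) {
    // Message shape is an assumption, not an existing emrun protocol.
    var msg = JSON.stringify({ stream: stream, text: text });
    if (socket.readyState === OPEN) {
      socket.send(msg);
    } else {
      // Not open yet; this sketch does not handle closed sockets.
      backlog.push(msg);
    }
  };
}

// Possible browser wiring (the endpoint path is hypothetical):
// var write = makeSocketWriter(new WebSocket('ws://' + location.host + '/stdio'));
// out = function (text) { write('stdout', text); };
// err = function (text) { write('stderr', text); };
```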

sbc100 commented 2 years ago

> Also keep in mind that depending on what features of WebAssembly are being used node may not be an option since it depends on feature support level (exception handling etc)

For those cases, in emscripten core anyway, we rely on the d8 shell to do the testing rather than node (this is easy to install via jsvu and gives you features that are even more recent than any browser, I think). But of course there are some features, such as DOM and audio, that we simply can't test at all outside of the browser. I personally dislike working on those because it makes debugging and testing so much more painful.

arsnyder16 commented 2 years ago

A websocket is a great idea; reducing the number of connections that need to happen should solve this issue, I would imagine.