paritytech / bench-bot

ISC License
9 stars 19 forks source link

Decrease command execution overhead #47

Closed joao-paulo-parity closed 3 years ago

joao-paulo-parity commented 3 years ago

In https://github.com/paritytech/bench-bot/pull/46 I had introduced output streaming so that the commands' outputs could be seen in the log; this is useful for following the progress of a given command. In that PR I switched away from shelljs' execution because it was throwing nonsensical errors related to temporary files. The execution was refactored to use spawn which seemingly adds a lot of overhead to the commands (this issue could to be related).

shelljs seems to perform much better than spawn; however, we would like to have a way of streaming the output into the logs without hindering performance. How to have both?

1. From the shelljs source we can see that they're using execFileSync which apparently performs much better; through this PR we'll execute the commands the same way. 2. To avoid having Node.js process the commands' streams we'll use stdio: "inherit" which will forward them to the parent process (i.e. they'll be output to the console, therefore showing up in the logs). 3. Since streams are not captured according to 2. but we still want to collect stdout for parsing the benchmarks' results, the output is redirected to a file with tee and then read after the command finishes.

shawntabrizi commented 3 years ago

thanks for looking into this <3

joao-paulo-parity commented 3 years ago

So far I've been unable to pinpoint exactly what is causing those wild variations in the result.

I've triggered two runs with ee7bf897d75d90dae0e3d7fcd8d68c85edd4fefc and the first one looks better but then it gets bad again.

Ultimately I don't see a good argument for no reverting the code to that state without first documenting what exactly is causing this problem, because it'll come up again in the future. I'll continue the work here a bit later.

joao-paulo-parity commented 3 years ago

At the moment it's inconclusive how much this matters. From the comparisons in https://github.com/paritytech/substrate/pull/9332 it seemed like going away from spawn was on the right track, although most of the overhead might be (also) be coming from running the bot as a service or starting the command through a shell (source) since it seems to perform better when the command is executed manually in the machine.

I'll close this as unresolved and come back to it later. The bot will be running again through @shawntabrizi 's session.