beeware / batavia

A JavaScript implementation of the Python virtual machine.
http://pybee.org/batavia
Other
1.39k stars 424 forks source link

Run a node server to execute tests #783

Closed martica closed 5 years ago

martica commented 5 years ago

The existing test infrastructure takes long time to complete the suite. Profiling the python side showed most of the time was spent in the process call to JS. It launches the node executable for each test. Launching an executable comes with some overhead.

This PR launches a single node executable at the start of the suite. That node executable runs a local server that waits for POST requests with the content of the JS code to be tested and runs it in a node sandbox.

My initial benchmarking shows significant improvement. A test subset of ~3500 tests drops from 312s to 110s, a reduction of about 65%.

PR Checklist:

martica commented 5 years ago

The 3.4 smoke test here took 2 minutes to run instead of 8 in my previous pull requests.

phildini commented 5 years ago

I think this is a fascinating direction, and look forward to seeing the final version when you're ready. Did a first pass, I didn't see any major red flags, and I'll take a closer look when everything is passing.

martica commented 5 years ago

Some of the failures are in master currently. I have another PR to mark things that started failing when danyeaw updated the node modules as expected failures for now to keep other PRs green.

There are a couple tests that act differently under this harness than before. I'm just isolating those at the moment to inspect them.

A full test suite on my local machine that used to take over half an hour just finished in seven minutes.

martica commented 5 years ago

So, the issue with this approach is that when a test is run with additional JS modules, loading those modules modifies the object that was created with require('...batavia.js').

I'm looking at alternatives:

  1. find a way to have node drop the batavia.js module from the require cache and reload it
    • Using the approaches people suggest (delete require.cache[path]) appears to crash node.
  2. namespace the modules in every test call so that they don't collide
    • Ugly and just hides the problem
  3. ensure that any caching that batavia does happens in the batavia.VirtualMachine and not in the imported node module.
    • Each batavia.VirtualMachine could have a local modules cache that is checked before the global one and the global one would stay constant.

I think 3 is probably the right answer. If I understand what is happening correctly, currently, when you require batavia and create two batavia.VirtualMachine instances, they will interact with each other.

martica commented 5 years ago

Using vm.runInNewContext changes something about the environment so that the calling conventions are different. When I run tests that create a native module with a JS class, and call methods on instances of that class with multiple parameters, all the parameters end up as an array in the first of the JS methods, the second gets a dictionary and any later ones are undefined.

If I reset batavia.modules.sys.modules to {} before each test and use eval() instead of the vm.runInContext, everything that was passing at head still passes.

martica commented 5 years ago

@phildini

I think I'm at a good point to take a look. Everything failing now is already failing at head.

martica commented 5 years ago

Ugh. The CI tests do a different trick to work around the compiling of batavia.js being a race condition when using multiple threads. I think I can fix that too.

martica commented 5 years ago

Everything is passing now that my other PR landed. All the CI stuff ran in about five minutes after the last push.