codefrau / SqueakJS

A Squeak Smalltalk VM in JavaScript
https://squeak.js.org
MIT License

High-Performance JIT #121

Open · codefrau opened 3 years ago

codefrau commented 3 years ago

I outlined some ideas here https://squeak.js.org/docs/jit.md.html

Comments or help welcome.

ccrraaiigg commented 3 years ago

Reading...

pavel-krivanek commented 1 year ago

A silly idea: what about using loops with break/continue to emulate GOTO? It won't work in all cases, but probably for most methods it will, and it should be easier for JS engines to optimize. The complex methods could then still go through the current JIT. For example:

loop1:
while (true) {
    // 0 <00> pushInstVar: 0
    // 1 <D0> send: #selector
    // 2 <A8 02> jumpIfTrue: 6
    if (condition) break loop1;
    // 4 <A3 FA> jumpTo: 0
    continue;
}
// 6 <21> pushConst: 42
// 7 <7C> return: topOfStack

codefrau commented 1 year ago

It would be interesting to benchmark the difference; my guess is that it won't matter much, if at all. We also have to weigh the simplicity of the current JIT (which basically has a static template string for each bytecode) against the complexity of an optimizing one. But if it turns out to be a good speedup without adding too much complexity, it might be worth it.
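
To make "static template" concrete, the general idea is roughly this (a minimal sketch with invented names and bytecodes, not the actual generator code):

// Hypothetical sketch of a template-per-bytecode compiler: each bytecode
// contributes a fixed snippet of JS source, and a method is compiled by
// concatenating those snippets and handing them to the Function constructor.
// vm.send and rcvr.pointers are assumed VM interfaces, invented for this sketch.
const templates = {
    pushInstVar: i => `stack[++sp] = rcvr.pointers[${i}];\n`,
    pushConst:   c => `stack[++sp] = ${c};\n`,
    send:        s => `sp = vm.send(${JSON.stringify(s)}, sp);\n`,
    returnTop:  () => `return stack[sp];\n`,
};

function compileMethod(bytecodes) {
    let source = "";
    for (const [op, arg] of bytecodes) source += templates[op](arg);
    return new Function("vm", "rcvr", "stack", "sp", source);
}

// e.g. a jump-free method:
const compiled = compileMethod([
    ["pushInstVar", 0], ["send", "#selector"], ["pushConst", 42], ["returnTop"],
]);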

What really would make a huge difference is optimizing sends, which is what my doc talks about. I should update it some time.
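
For sends, the classic tool is an inline cache. Purely as an illustration (invented helpers, not necessarily the scheme the doc ends up proposing):

// Hypothetical monomorphic inline cache: each send site remembers the last
// receiver class and the method it resolved to, so repeated sends to the
// same class skip the full method lookup.
const classOf = obj => obj.sqClass;                        // assumed object layout
const lookupMethod = (cls, selector) => cls.methods[selector];  // assumed lookup

function makeSendSite(selector) {
    let cachedClass = null;
    let cachedMethod = null;
    return function send(receiver, args) {
        const cls = classOf(receiver);
        if (cls !== cachedClass) {                         // cache miss: full lookup
            cachedMethod = lookupMethod(cls, selector);
            cachedClass = cls;
        }
        return cachedMethod(receiver, args);               // cache hit: direct call
    };
}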

I have done some more experiments, e.g. I extended the exception-based JIT mockup with blocks and non-local returns: https://codepen.io/codefrau/pen/YzNWpxO
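
Stripped to its essence, the non-local return part works like this (a minimal sketch, not the codepen's actual code):

// A block's ^-return throws a marker object that only its home method's
// activation catches; everything in between simply unwinds.
class NonLocalReturn extends Error {
    constructor(home, value) {
        super("non-local return");
        this.home = home;
        this.value = value;
    }
}

function homeMethod(callee) {
    const home = {};                                   // identifies this activation
    const block = () => { throw new NonLocalReturn(home, 42); };  // like [^42]
    try {
        callee(block);                                 // block may run arbitrarily deep
        return "fell off the end";
    } catch (e) {
        if (e instanceof NonLocalReturn && e.home === home) return e.value;
        throw e;                                       // someone else's return: keep unwinding
    }
}

// homeMethod(block => block()) evaluates to 42, even though the block
// was invoked from inside the callee.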

That JIT still fully unwinds when there is a context switch, though, and has to fall back to the interpreter to resume a method. I'm excited about the technique outlined in this paper (https://arxiv.org/abs/1802.02974), which would allow recreating the native JS stack when resuming. Building a mockup of this would be exciting, but I have no cycles to spare for it in the near future.
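
The gist of that technique, very roughly (a speculative sketch, not what the paper implements verbatim): compile each method as a switch over the program counter, so it can be re-entered in a "restore" mode that reloads its locals from the saved context and jumps back to where it was suspended, rebuilding the native JS stack one call at a time.

// Speculative sketch only; frame layout and vm flags are invented.
function compiledOuter(vm, frame) {
    let pc = 0, temp = null;
    if (vm.restoring) {
        ({ pc, temp } = frame.saved);          // reload locals and resume point
    }
    switch (pc) {                              // fall-through is intentional
        case 0:
            temp = 1;
        case 1:
            // during restore this call rebuilds the callee's frame the same way
            temp = compiledLeaf(vm, frame.callee || {});
        case 2:
            return temp;
    }
}

function compiledLeaf(vm, frame) {
    let pc = 0;
    if (vm.restoring) {
        ({ pc } = frame.saved);
        vm.restoring = false;                  // innermost frame: execution resumes here
    }
    switch (pc) {
        case 0:
            return 42;
    }
}

// Normal run:  compiledOuter({ restoring: false }, {}) === 42
// Resuming:    compiledOuter({ restoring: true }, savedFrames) re-creates both
//              activations on the JS stack before continuing where it left off.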

Would be great to get some feedback on the mockups I built though :)

ccrraaiigg commented 1 year ago

Sorry for the long delay! I think those mockups are very promising, but I've changed my focus to WebAssembly. As a proof of concept, I wrote a WASM version of the SqueakJS interpretOne(), which worked. I'm currently writing a JS-to-WASM translator for the rest of the VM. Since WASM modules can import functions from other WASM modules, I plan to use WASM for just-in-time compilation too. The overhead of calling JS from WASM is high; I'm measuring the overhead of WASM-to-WASM calls. (However slow it might be now, though, I'm expecting web browsers to make it faster over time.)
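
The linking itself is straightforward from JS; a minimal sketch (file and export names made up) of a jitted-method module importing functions that the core VM module exports:

// One module's exports become another module's imports, so the call can stay
// on the WASM side instead of bouncing through a hand-written JS function.
async function linkModules() {
    const vm = await WebAssembly.instantiateStreaming(fetch("vm-core.wasm"));

    const jitted = await WebAssembly.instantiateStreaming(fetch("method.wasm"), {
        // method.wasm declares: (import "vm" "send" (func $send (param i32 i32) (result i32)))
        vm: { send: vm.instance.exports.send },
    });

    return jitted.instance.exports.run();  // run() calls send() WASM-to-WASM
}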

codefrau commented 9 months ago

I've added some more detail to that document, in particular performance mockups with various optimizations (search for "jit-perf.js" if you can't find the section). You can compare these to benchFib in Cog.

It seems quite promising; I think speeds within an order of magnitude of Cog are possible.
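
For reference, benchFib in plain JS (assuming the usual Squeak definition, ^ self < 2 ifTrue: [1] ifFalse: [(self-1) benchFib + (self-2) benchFib + 1]) is just:

// benchFib transcribed to plain JS as a rough upper bound for comparison;
// the return value approximates the number of sends the Smalltalk version does.
function benchFib(n) {
    return n < 2 ? 1 : benchFib(n - 1) + benchFib(n - 2) + 1;
}

const t0 = performance.now();
const sends = benchFib(30);
console.log(Math.round(sends / (performance.now() - t0) * 1000), "sends/sec");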

ErikOnBike commented 7 months ago

Hi Vanessa @codefrau, you have been quite active on SqueakJS recently. Could we pick the JIT up as well? I'm eager to participate and to get high performance on Sista images (so preferably going for the unsafe, i.e. non-type-checked, version for my use case). I'm not sure whether you have decided which optimisation variant to use (or whether to investigate the options further, incl. WASM) and how to start implementing it. Hi Craig @ccrraaiigg, is the WASM route an even better alternative, or is its support/performance (incl. the mentioned overhead of switching between JS and WASM) not 'up to speed' yet? I'm mostly interested in getting higher performance right now and don't mind a little investment that will later be made obsolete by better options altogether. (That's what programming is all about, ain't it? ;-)

ccrraaiigg commented 7 months ago

Hi @ErikOnBike -- I do still think WASM will be the better option. Will you be calling out to JS libraries a lot? What's your current benchmark test? Cheers!

ErikOnBike commented 7 months ago

Hi @ccrraaiigg, it is for CodeParadise. It does a lot of calling out to JS because it is constantly manipulating the DOM. That's pretty much the only thing CodeParadise does, or should do, being a UI framework :-).

ccrraaiigg commented 7 months ago

@ErikOnBike Aha, if it's only manipulating the DOM (or other built-in browser functionality) and not actually calling some third-party JS library, then the story may be a lot better. There's still a lot of work left to do for a usable general-purpose OpenSmalltalk virtual machine, though. But I'm always interested in early benchmarks to give an idea of likely near-term outcomes. I imagine I'll add a test of just adding and removing a div...
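
Something as simple as this sketch would do as a first data point (invented code, not a committed benchmark):

// Trivial DOM micro-benchmark: repeatedly add and remove a div and report
// how many add/remove pairs per second we get.
function benchDivChurn(iterations) {
    const parent = document.body;
    const t0 = performance.now();
    for (let i = 0; i < iterations; i++) {
        const div = document.createElement("div");
        parent.appendChild(div);
        parent.removeChild(div);
    }
    return iterations / (performance.now() - t0) * 1000;
}

console.log(Math.round(benchDivChurn(100000)), "add/remove pairs per second");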

codefrau commented 7 months ago

Hehe, yes, I had a lot of fun getting OpenGL to work, and then got frustrated trying to implement a TCP/IP stack. It does work though.

Re JIT, I’m still more excited about getting closer to JS while maintaining full ST semantics. Compiling the Stack interpreter to WASM is certainly possible but I personally don’t really find that very interesting.

As for SISTA, it is currently a lot slower than the traditional bytecodes because it doesn't even use my simple JIT. I just opened #157 for that; it should be fairly simple to build. Implementing the high-performance JIT I'm imagining in this ticket is a lot more involved.

codefrau commented 3 months ago

I implemented the simple SISTA JIT a while ago: 2362a227eceb7d6d9f1742cdad0eab0e61fcb902

codefrau commented 3 months ago

The high-performance JIT requires major changes to the system. Follow pull request #168 and the v2 branch for progress.