
The myth of JIT compilers #19


JIT compilers can be as fast as C

I have to admit, I've been swallowing this blatant lie whole for far too long. To be fair (to me), there are plenty of reasons why someone might fall into this trap. One is the ubiquitous human bias of wishful thinking: you just want it to be true, because coding in a dynamic language is so much more fun and easy than in any static language, so wouldn't it be nice if you never had to touch those static languages again? Another is that everybody on the Internet is doing micro-benchmarks and showing those beautifully melted traces, and what more evidence do you need, right? It's empirical, so it's proven.

To be clear, I'm not talking about LLVM-based JIT compilers like Julia's -- that's merely doing type specialization at runtime. I'm talking about JITs for real dynamic languages like Lua and JavaScript, and specifically tracing JITs, because those are the only ones that claim these speeds anyway.

What nobody mentions about tracing JITs (and it's the first thing you'd think they'd tell you about, because you'd surely like to know) is that they are chock-full of heuristics that control when and what to compile, when to bail out, and what to blacklist so that compilation is never attempted again. These heavily probabilistic compilers are very easy to trip up, handing you a 10-100x speed drop in the middle of your program, and good luck fixing that by staring at IR traces.
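To make that concrete, here's a minimal sketch of one well-known V8 trap (my illustration, not from the post): the exact same function slows down dramatically depending on the shape history of the objects flowing through it, i.e. on how the program runs, not on how it's written.

```js
function getX(p) { return p.x; }

// Warm up with a single object shape: the property load stays monomorphic
// and the optimizing compiler can reduce it to a single memory access.
for (let i = 0; i < 1e6; i++) getX({ x: i });

// Feed the same call site dozens of distinct shapes: the inline cache goes
// megamorphic, the optimized code deoptimizes, and the identical loop can
// run several times slower -- without a single change to getX itself.
for (let i = 0; i < 1e6; i++) getX({ x: i, ["pad" + (i % 32)]: 0 });
```

The exact slowdown varies by V8 version, which is rather the point: it's governed by internal inline-cache thresholds, not by anything visible in the source.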

There are two distinct problems here. The first is that performance can drop dramatically based not only on how your program is written but also on how it runs. The second is that when it does, you don't know how to fix it. And that's the crux of the matter: you're using a performance tool to get a handle on performance, yet performance is not part of the API. There's no official guide to all the transformations, optimizations and heuristics and how they interact with each other, from which you could build a mental model of how all this works. It's all tribal knowledge scattered around in blog posts and presentations.

But you need more than just documentation: if you really want to make performance controllable, you have to design for that, which means solving the first problem too to the extent possible (i.e. making performance more reliable) and giving the user enough tools to do the rest. And by tools I don't mean diagnostic tools, I mean hints and directives to control the compiler and how it interprets the code. And no, that doesn't have to mean a type system.
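Ironically, some of those knobs already exist in V8 -- just not as API. They're debug intrinsics hidden behind a flag, unstable across versions and explicitly unsupported. A hedged sketch (the intrinsic names are real V8 natives, but their behavior is version-dependent):

```js
// Run with: node --allow-natives-syntax hints.js
function hot(a, b) { return a + b; }

hot(1, 2);                         // run once so V8 collects type feedback
%OptimizeFunctionOnNextCall(hot);  // directive: optimize on the next call
hot(3, 4);                         // this call triggers optimized compilation

// Returns a version-dependent bit field -- even the introspection half of
// this unofficial "API" comes with no stable contract.
console.log(%GetOptimizationStatus(hot));
```

That these exist only as debugging backdoors rather than supported hints is the mentality problem described below.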

It's the same with SQL

This problem is shared by all probabilistic optimizers, not just JIT compilers. Take SQL, for instance. The query planner is not smart enough to make all your queries run optimally every time, so there are thousand-page books about how to turn your queries into an unmaintainable mess just to change how the optimizer sees them so they can run fast -- think of tricks like PostgreSQL's old `OFFSET 0` fence for stopping the planner from flattening a subquery. (I had a colleague at a previous job who specialized in just that.) Does that look like a good solution? Because that's exactly what JavaScript and Lua programmers do to satisfy their respective JIT compilers today.

A particularly ridiculous example

V8 decides at parse time whether a function is eligible for inlining based on its source length: less than 600 characters, whitespace and comments included. So if you put long comments inside your sugar/utility functions, they might not get inlined.
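A hedged illustration (mine, not from the post; the 600-character cutoff is a Crankshaft-era heuristic and may not match current V8):

```js
// A trivial helper: comfortably under the limit, a sure inlining candidate.
function scale(x) { return x * 2; }

// The same helper with a conscientious doc comment. If the comment pushes
// the function's total source -- body, whitespace and comments included --
// past the ~600-character threshold, the old heuristic refuses to inline
// it, silently adding a call per iteration to hot loops like:
//   for (let i = 0; i < n; i++) sum += scaleDocumented(data[i]);
function scaleDocumented(x) {
  /* Scales its input by two. Imagine a 600-character licence header or
     generated API doc pasted here -- that is all it takes. */
  return x * 2;
}
```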

It's not the technology, it's people

There's something about this situation that is worse than just some tech needing improvement: there's a prevailing mentality among programmers that actively prevents this particular kind of tech from improving. It's a mentality shared by JIT compiler people, SQL query planner people (PostgreSQL devs are particularly dogmatic about this), LLVM optimizer people, CPU microarchitecture people, and probably everyone who has ever designed a predictive tool for anything. It's the "sufficiently smart compiler" fallacy, and it stems from the unstated belief that if their tool ever includes instructions or hints to control the optimizer in any significant way, that is somehow an admission of failure of the tool and, by extension, of them.


More reading: GDC 2017 Web Tools Postmortem -- check out the slide about V8.