microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0

Ideas for faster cold compiler start-up #25658

Open DanielRosenwasser opened 6 years ago

DanielRosenwasser commented 6 years ago

Background

For some users, cold compile times are getting to be a bit long - so much so that it's impacting people's non-watch-mode experience, and giving people a negative perception of the compiler.

Compilation is already a hard sell for JavaScript users. If we can get some speed wins, I think it'd ease a lot of the pain of starting out with TypeScript.

Automatic skipDefaultLibCheck

lib.d.ts is a pretty big file, and it's only going to grow. Realistically, most people don't ever declare symbols that conflict with the global scope, so we made the skipDefaultLibCheck (and also the skipLibCheck) flags for faster compilations.

We can suggest this flag to users, but the truth is that it's not discoverable. It's also often misused, so I want to stop recommending it to people. 😄

It'd be interesting to see if we can get the same results of skipDefaultLibCheck based on the code users have written. Any program that doesn't contribute a global augmentation, or a declaration in the global scope, doesn't really need to have lib.d.ts checked over again.

@mhegazy and I have discussed this, and it sounds like we have the necessary information after the type-checker undergoes symbol-merging. If no symbols ever get merged outside of lib files, we can make the assumption that lib files never need to get checked. But this requires knowing that all lib files have already had symbols merged up front before any other files the compiler is given.
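As a rough illustration of the idea above, here is a toy model of the check. It is only a sketch: the `GlobalDeclaration` type and the `canSkipLibCheck` function are hypothetical names made up for this example and do not mirror the checker's real data structures. It assumes lib declarations are merged first, and then treats any user declaration that lands on a lib symbol as a merge outside of lib files.

```typescript
// Toy model only: hypothetical types, not the compiler's actual symbol tables.
interface GlobalDeclaration {
  name: string; // global symbol name, e.g. "Array"
  file: string; // file that contributes the declaration
}

function canSkipLibCheck(
  libDecls: GlobalDeclaration[],
  userDecls: GlobalDeclaration[],
): boolean {
  // Step 1: merge every lib declaration up front. Lib files merging with
  // each other (e.g. "Array" in several lib files) is expected and fine.
  const libSymbols = new Set<string>();
  for (const decl of libDecls) {
    libSymbols.add(decl.name);
  }

  // Step 2: a user declaration that hits an existing lib symbol means a
  // symbol was merged outside of lib files, so the libs must be checked.
  for (const decl of userDecls) {
    if (libSymbols.has(decl.name)) {
      return false;
    }
  }
  return true;
}

const libDeclarations: GlobalDeclaration[] = [
  { name: "Array", file: "lib.es5.d.ts" },
  { name: "Array", file: "lib.es2015.core.d.ts" },
  { name: "Promise", file: "lib.es2015.promise.d.ts" },
];

// A new global that does not collide with anything in the libs.
const harmless = canSkipLibCheck(libDeclarations, [
  { name: "MyAppConfig", file: "globals.d.ts" },
]);

// A user file that augments the global Array interface.
const augmenting = canSkipLibCheck(libDeclarations, [
  { name: "Array", file: "array-extensions.d.ts" },
]);

console.log({ harmless, augmenting });
```

The ordering matters in this sketch for the same reason it matters in the proposal: the set of lib symbols has to be complete before any user declaration is looked at, otherwise a collision could be missed.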

Pros

Cons

V8 Snapshots

~3 years ago, the V8 team introduced custom startup snapshots. In that post, they wrote:

Limitations aside, startup snapshots remain a great way to save time on initialization. We can shave off 100ms from the startup spent on loading the TypeScript compiler in our example above (on a regular desktop computer). We're looking forward to seeing how you might put custom snapshots to use!

Obviously my machine's not the same as that aforementioned desktop, but I'm getting just a bit over 200ms for running tsc -v, so we could possibly minimize a decent chunk of our raw startup cost. Maybe @hashseed or @bmeurer would be able to lend some insight for how difficult this would be.

Minification

@RyanCavanaugh and I tried some offhand loading benchmarks with Uglify and managed

  1. to reduce typescript.js's size by about half
  2. to reduce import time by about 30ms

I don't know how impactful 30ms is, but the size reduction sounds appealing.

hashseed commented 6 years ago

The numbers for the blog post were, as you noted, from 3 years ago. I tested it on the TypeScript compiler that was part of the Octane benchmark. I'm sure these numbers are totally outdated, but I'm also sure that the benefits are still very significant.

Unfortunately, the steps I used only work on vanilla V8. Node.js doesn't support startup snapshots yet. There are efforts underway to use startup snapshots, but until that is done, custom startup snapshots for e.g. TypeScript are not possible.

kitsonk commented 6 years ago

deno is currently using V8 snapshots of TypeScript for the runtime. At the moment it isn't possible to compare a non-snapshotted version, but it certainly appears to have increased startup time from the old architecture.

DanielRosenwasser commented 6 years ago

Thanks for getting back to us @hashseed! We'll be keen to see any progress there, but there's certainly no rush. 🙂

tinganho commented 6 years ago

You can also make the compiler even more incremental. You add a step for indexing symbols in header/declaration files ahead of time. The index file contains the locations of all symbols in one file. When the parser parses a source file, it parses the symbols as it encounters them. If a symbol lies in a declaration file, it does a lookup in the index, parses only that specific part, and resolves that type.

In this way, only code that is used is parsed.
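A very rough sketch of what such an index could look like. Everything here is hypothetical (`SymbolRange`, `buildIndex`, `lookupDeclaration` are names invented for the example), and the scanner is deliberately naive: it assumes one top-level `declare` statement per line, which real declaration files do not guarantee.

```typescript
// Hypothetical index entry: where a symbol's declaration lives in the file text.
interface SymbolRange {
  start: number;
  end: number;
}

// Ahead-of-time step: record the text range of each top-level declaration.
// Naive on purpose: assumes each declaration fits on a single line.
function buildIndex(declText: string): Map<string, SymbolRange> {
  const index = new Map<string, SymbolRange>();
  let offset = 0;
  for (const line of declText.split("\n")) {
    const match = /^declare\s+(?:function|const|class)\s+(\w+)/.exec(line);
    const name = match?.[1];
    if (name !== undefined) {
      index.set(name, { start: offset, end: offset + line.length });
    }
    offset += line.length + 1; // +1 accounts for the newline
  }
  return index;
}

// Lookup step: return only the slice for the requested symbol, so a real
// parser would only need to parse that part of the file.
function lookupDeclaration(
  declText: string,
  index: Map<string, SymbolRange>,
  name: string,
): string | undefined {
  const range = index.get(name);
  return range ? declText.slice(range.start, range.end) : undefined;
}

const sampleDeclarations = [
  "declare function foo(): void;",
  "declare const bar: number;",
  "declare class Baz {}",
].join("\n");

const sampleIndex = buildIndex(sampleDeclarations);
console.log(lookupDeclaration(sampleDeclarations, sampleIndex, "bar"));
```

The index only pays off if building and loading it is cheaper than parsing the declaration files it describes, which this sketch does not attempt to measure.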

hashseed commented 6 years ago

This is already being done to a certain degree with preparse data. We store function ranges and captured variables to avoid re-parsing.

tinganho commented 6 years ago

You mean in node.js? I was more referring to the TS compiler. Haven't been following this project as closely as before, but I think TS doesn't do this.

timocov commented 6 years ago

If you have a lot of TypeScript files, caching some fs results (fileExists, directoryExists and so on) may speed up compilation (see https://github.com/TypeStrong/ts-loader/issues/825#issue-354725524).
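A minimal sketch of that kind of caching, assuming the lookups are pure functions of the path for the duration of one compilation. `memoizeByPath` is a hypothetical helper written for this example; in practice it would wrap whatever fileExists / directoryExists functions the build tool passes to the compiler.

```typescript
// Hypothetical helper: cache the result of a path-based lookup.
// Assumes results do not change while the cache is alive, so it is
// unsuitable for watch mode unless the cache is invalidated.
function memoizeByPath<R>(lookup: (path: string) => R): (path: string) => R {
  const cache = new Map<string, R>();
  return (path: string): R => {
    if (cache.has(path)) {
      return cache.get(path) as R;
    }
    const result = lookup(path);
    cache.set(path, result);
    return result;
  };
}

// Stand-in for a real file system call, counting how often it is hit.
let diskHits = 0;
const knownFiles = new Set(["/src/index.ts", "/src/util.ts"]);
function slowFileExists(path: string): boolean {
  diskHits++;
  return knownFiles.has(path);
}

const cachedFileExists = memoizeByPath(slowFileExists);

cachedFileExists("/src/index.ts");
cachedFileExists("/src/index.ts"); // served from the cache
cachedFileExists("/src/missing.ts");
cachedFileExists("/src/missing.ts"); // negative results are cached too

console.log(`underlying lookups: ${diskHits}`);
```

Caching negative results matters here, since module resolution tends to probe many paths that do not exist.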

mweststrate commented 5 years ago

The ncc package recently published by Zeit might help reduce the initial startup time as well: https://zeit.co/blog/ncc. It bundles the compiler into a single JS file, which can drastically improve performance.

andykais commented 5 years ago

Not sure if this is the right place to talk about this, but I looked into tsconfig.buildinfo and saw it's creating hashes for all the files in node_modules.

{
  "program": {
    "fileInfos": {
      "/home/andrew/Build/dev/scrape-pages/node_modules/typescript/lib/lib.es5.d.ts": {
        "version": "c8665e66018917580e71792b91022bcaf53fb946fab4aaf8dfb0738ed564db88",
        "signature": "c8665e66018917580e71792b91022bcaf53fb946fab4aaf8dfb0738ed564db88"
      },
      "/home/andrew/Build/dev/scrape-pages/node_modules/typescript/lib/lib.es2015.d.ts": {
        "version": "7994d44005046d1413ea31d046577cdda33b8b2470f30281fd9c8b3c99fe2d96",
        "signature": "7994d44005046d1413ea31d046577cdda33b8b2470f30281fd9c8b3c99fe2d96"
      },
      "/home/andrew/Build/dev/scrape-pages/node_modules/typescript/lib/lib.es2016.d.ts": {
        "version": "5f217838d25704474d9ef93774f04164889169ca31475fe423a9de6758f058d1",
        "signature": "5f217838d25704474d9ef93774f04164889169ca31475fe423a9de6758f058d1"
      },
      "/home/andrew/Build/dev/scrape-pages/node_modules/typescript/lib/lib.es2017.d.ts": {
        "version": "459097c7bdd88fc5731367e56591e4f465f2c9de81a35427a7bd473165c34743",
        "signature": "459097c7bdd88fc5731367e56591e4f465f2c9de81a35427a7bd473165c34743"
      },
    ...
  }
}

Given that

The "exclude" property defaults to excluding the node_modules, bower_components, jspm_packages and <outDir> directories when not specified

I am guessing this is not currently configurable, and I'm guessing the compiler just defaults to creating a hash for every single file it takes in, but if we assume node_modules won't change, then we could remove all that signature creation and checking on every build. That would definitely lead to some speed-ups.

The only problem I could see is what happens when a node module is updated or removed, but I don't even know if typechecking is relevant to the .buildinfo file, or if it is purely for deciding which files need to be re-compiled.

TimvdLippe commented 3 years ago

@hashseed The issue linked to the RFC (https://github.com/nodejs/node/issues/17058) was closed and https://github.com/nodejs/node/issues/35711 was opened as continuation. Is the continuation issue a requirement for the changes proposed in this issue or is the original RFC sufficient for improving TS startup performance? The startup performance is something we are attempting to tackle for Chrome DevTools (#40721) and snapshots could potentially help in this regard.

hashseed commented 3 years ago

The continuation is required, in particular the part "Enabling user land snapshot" is necessary to bundle a pre-loaded TSC so that we can save the time spent on loading TSC into memory upon startup.

frank-dspeed commented 2 years ago

Build-time user-land snapshots are available (experimentally) as of Node 18: https://nodejs.org/en/blog/announcements/v18-release-announce/#build-time-user-land-snapshot-experimental

kurtextrem commented 5 months ago

Node v22 ships NODE_COMPILE_CACHE, so e.g.

"typescript.tsserver.nodePath": "NODE_COMPILE_CACHE=node_modules node",

could turn that on, right? Would it be worth adding an option for this, or making it the default in VS Code?

jakebailey commented 4 months ago

I've closed my attempt at using snapshotting: https://github.com/microsoft/TypeScript/pull/55830

The new NODE_COMPILE_CACHE works a lot better than us writing code to perform the hashing to determine if a preexisting snapshot is valid, see: https://github.com/microsoft/TypeScript/pull/55830#issuecomment-2027789386

@kurtextrem Not quite; that config is not a shell command, but a path passed to exec. You could write a wrapper script which does that, though.

Unfortunately, we can't just set NODE_COMPILE_CACHE from say, tsc.js or tsserver.js; the path is read right at Node's startup so that's too late, unlike v8-compile-cache. Maybe there's a world in which we immediately reexec a process (like I did in #55830), though. But executing processes is kinda slow on Windows, so I almost bet it'd be a net negative there.

jakebailey commented 4 months ago

@joyeecheung Do you see a world in which the compile cache can be enabled from within a running program, or where this option is just "always on" and benefits all programs?

kurtextrem commented 3 months ago

@jakebailey Thank you for making me aware! I tried the following wrapper (nodePath: "foo.sh"):

#!/bin/bash

node

but in that case the TS IntelliSense status never finishes loading. Passing just nodePath: "node" works. Any ideas?

jakebailey commented 3 months ago

I would think you'd need to pass in some sort of relative path or stick that on your PATH.


Another update is https://github.com/nodejs/node/issues/53639 which would enable our entrypoints to enable caching themselves.

joyeecheung commented 2 months ago

Do you see a world in which the compile cache can be enabled from within a running program, or where this option is just "always on" and benefits all programs?

Sorry, missed the ping - https://github.com/nodejs/node/issues/53639 would allow a script to enable caching for another script/module, so technically, it can be enabled from within the same running program. To enable caching of a script by itself I don't have good ideas - maybe some directive would be possible to implement, but it's another can of worms about how acceptable it would be to effectively add Node.js-specific directives to the JS language, or whether parsing the directives would itself add overhead...