nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Use the compilation cache when running TypeScript files through `--experimental-transform-types` #54741

Open ShenHongFei opened 1 week ago

ShenHongFei commented 1 week ago

What is the problem this feature will solve?

The flags --experimental-strip-types and --experimental-transform-types enable Node.js to run almost all TypeScript files. https://github.com/nodejs/node/pull/54283

This feature of running .ts files directly is great. I recently removed the step of compiling .ts files to .js and started the project directly through node entry.ts, and then directly imported other .ts modules.

However, in a large project with many files, compiling the .ts files accounts for much of the startup time: startup takes about 200ms when all files are .js and about 700ms when they are .ts. A compilation cache that skips recompiling an unchanged .ts file and directly reuses the previously generated .js output would be much more efficient.

What is the feature you are proposing to solve the problem?

After I got this idea, I tried a simple modification to lib/internal/modules/esm/translators.js that adds a local file cache. With it, running a .ts file for the second time is exactly as fast as running the equivalent .js file.

I hope someone can combine a .ts compilation cache with the current .js compilation cache in Node.js.

diff --git a/lib/internal/modules/esm/translators.js b/lib/internal/modules/esm/translators.js
index b1e7b860..8eb172b1 100644
--- a/lib/internal/modules/esm/translators.js
+++ b/lib/internal/modules/esm/translators.js
@@ -24,8 +24,9 @@ const {

 const { BuiltinModule } = require('internal/bootstrap/realm');
 const assert = require('internal/assert');
-const { readFileSync } = require('fs');
-const { dirname, extname, isAbsolute } = require('path');
+const { readFileSync, writeFileSync } = require('fs');
+const { dirname, extname, basename, isAbsolute } = require('path');
+const crypto = require('crypto')
 const {
   assertBufferSource,
   loadBuiltinModule,
@@ -484,11 +485,38 @@ translators.set('commonjs-typescript', function(url, source) {
   return FunctionPrototypeCall(translators.get('commonjs'), this, url, code, false);
 });

+
+// --- patch: Compile .ts esm files with typescript and cache
+const filepath_cache = process.env.NODE_COMPILE_CACHE + '/'
+
 // Strategy for loading an esm TypeScript module
 translators.set('module-typescript', function(url, source) {
-  emitExperimentalWarning('Type Stripping');
-  assertBufferSource(source, false, 'load');
-  const code = stripTypeScriptTypes(stringify(source), url);
-  debug(`Translating TypeScript ${url}`);
-  return FunctionPrototypeCall(translators.get('module'), this, url, code, false);
-});
+    emitExperimentalWarning('Type Stripping');
+    assertBufferSource(source, false, 'load')
+    
+    const str_source = stringify(source)
+    
+    const [
+      code = stripTypeScriptTypes(str_source, url),
+      filepath_js
+    ] = (() => {
+        const filepath_js = filepath_cache +
+            url.slice('file:///'.length).replaceAll(':', '_').replaceAll('/', '_').slice(0, -2) +
+            crypto.hash('sha256', str_source).slice(0, 8) +
+            '.js'
+        
+        try {
+            return [readFileSync(filepath_js, 'utf-8')]
+        } catch { }
+        
+        return [undefined, filepath_js]
+    })()
+    
+    if (filepath_js)
+        try {
+            writeFileSync(filepath_js, code)
+        } catch { }
+    
+    return FunctionPrototypeCall(translators.get('module'), this, url, code, false)
+})
+
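
The caching idea in the patch above can also be sketched outside of Node.js internals. The following is a minimal, standalone illustration and not the patch itself: `cachedTransform` is a hypothetical helper, and `transform` is a caller-supplied stand-in for the internal `stripTypeScriptTypes` function, which user code cannot call directly. The cache entry is keyed on a SHA-256 hash of the file name and the full source text, so any edit to the source produces a new entry.

```javascript
const { createHash } = require('node:crypto');
const { mkdirSync, readFileSync, writeFileSync } = require('node:fs');
const { join } = require('node:path');

// Hypothetical helper, not part of Node.js: return the cached result of
// transforming `source` if one exists, otherwise run `transform` and store it.
function cachedTransform(cacheDir, filename, source, transform) {
  // Hash the file name and the source together; the NUL byte separates them.
  const digest = createHash('sha256')
    .update(filename)
    .update('\0')
    .update(source)
    .digest('hex');
  const cachePath = join(cacheDir, `${digest}.js`);

  try {
    return { code: readFileSync(cachePath, 'utf8'), hit: true };
  } catch {
    // Cache miss or unreadable entry: fall through and transform.
  }

  const code = transform(source);
  try {
    mkdirSync(cacheDir, { recursive: true });
    writeFileSync(cachePath, code);
  } catch {
    // A failed write only costs a cache miss on the next run.
  }
  return { code, hit: false };
}
```

Unlike the patch, this sketch creates the cache directory if it is missing and does not depend on `NODE_COMPILE_CACHE` being set; it still has no integrity check on the cached file.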

@nodejs/typescript @marco-ippolito @joyeecheung

What alternatives have you considered?

No response

marco-ippolito commented 1 week ago

This is interesting; we definitely should cache. I think we already have a cache in place for JS files, but I'm not 100% sure.

joyeecheung commented 1 week ago

I think the idea is great though the implementation definitely needs more polishing:

  1. It will need to work with the compile cache directory structure (documented here https://github.com/nodejs/node/blob/01c88f913601bc43c149d76a70e95df77d23dc1b/src/compile_cache.cc#L359)
  2. There needs to be some checksum in the file to defend against changes to the source code / cache (this is what we have for JS code caches https://github.com/nodejs/node/blob/01c88f913601bc43c149d76a70e95df77d23dc1b/src/compile_cache.cc#L286)
  3. I am not sure whether it should be a two tier TS -> JS + JS -> code cache, or just a TS -> code cache - I suppose if we expect people to commonly use custom loaders to do anything before or after TS -> JS transformation, then it needs to be two tier, because the TS -> JS phase is exposed to loaders.
  4. Also, the checksum should probably include typescript-specific flags (e.g. whether transform is done). Switching from type-stripping to transform-types or vice versa should result in cache misses.
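
Points 2 and 4 above can be illustrated with a small sketch. This is not the format used in src/compile_cache.cc; the mode labels `'strip'` and `'transform'`, the entry layout (checksum on the first line), and all function names are assumptions made for the example. It shows a cache key that changes when the TypeScript mode changes, and a stored checksum that lets a reader reject a modified or corrupted entry.

```javascript
const { createHash } = require('node:crypto');

function sha256(text) {
  return createHash('sha256').update(text).digest('hex');
}

// Hypothetical cache key: the mode is hashed together with the source, so
// switching between type-stripping and transform-types is a cache miss.
function cacheKey(source, mode) {
  return sha256(`${mode}\0${source}`);
}

// Hypothetical entry layout: the first line is the checksum of the code below it.
function encodeEntry(code) {
  return `${sha256(code)}\n${code}`;
}

// Return the cached code, or undefined if the entry fails verification.
function decodeEntry(entry) {
  const newline = entry.indexOf('\n');
  if (newline === -1) return undefined;
  const code = entry.slice(newline + 1);
  return entry.slice(0, newline) === sha256(code) ? code : undefined;
}
```

A loader using this scheme would treat an `undefined` result from `decodeEntry` the same as a missing file and redo the transformation.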

RobsonTrasel commented 1 week ago

Wow, the proposed solution is pretty cool

marco-ippolito commented 1 week ago

I started playing with it, and I realized it probably doesn't make sense. Users should transpile to JS with a compilation step and then cache. There is no point in persisting transpiled TS on disk, since that is what tsc does. You can already cache the compiled JS module.
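
For reference, the existing JavaScript compile cache is the one enabled through the `NODE_COMPILE_CACHE` environment variable used in the patch above; newer Node.js releases also expose `module.enableCompileCache()`. A minimal sketch, assuming a Node.js version that may or may not provide that function (hence the feature check); `tryEnableCompileCache` and the directory name are hypothetical.

```javascript
const nodeModule = require('node:module');
const { tmpdir } = require('node:os');
const { join } = require('node:path');

// Hypothetical helper: enable the built-in compile cache when the running
// Node.js version supports it, and report what happened.
function tryEnableCompileCache(directory) {
  if (typeof nodeModule.enableCompileCache !== 'function') {
    return { supported: false };
  }
  const result = nodeModule.enableCompileCache(directory);
  return { supported: true, status: result.status, directory: result.directory };
}

// The directory below is an arbitrary example location.
const outcome = tryEnableCompileCache(join(tmpdir(), 'example-compile-cache'));
console.log(outcome);
```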

ShenHongFei commented 1 week ago

The main benefit of a compilation cache is running speed. The cache saves processor resources by trading space for time. After all, there is no obvious disadvantage to adding a cache.

TypeScript is an interpreted language like JavaScript, not a compiled language. Compiling all .ts files in advance and then running them is not very user-friendly; it always feels like two separate steps. Compilation and type checking should be a separate, optional process rather than a requirement.

In a Node.js project, I want to write import { xxx } from 'b.ts' in a.ts to import a module, so the following four options are required in tsconfig.json:

{
    "compilerOptions": {
        "module": "ESNext",
        "moduleResolution": "Bundler",
        "noEmit": true,
        "allowImportingTsExtensions": true
    }
}

Because "allowImportingTsExtensions": true requires "noEmit": true, I can't generate any files with tsc, and TypeScript will not automatically rewrite the extension of the imported file.

Many other JavaScript runtimes, such as Deno and Bun, now support running .ts files directly instead of compiling them first, and their performance is very good. I don't know whether they cache.