microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
100.64k stars 12.44k forks source link

TSC watch / Out of memory errors with recursive template literal types #47419

Closed silviogutierrez closed 2 years ago

silviogutierrez commented 2 years ago

Bug Report

When using watch and template literal recursion, TSC tends to run out of memory. How soon depends on the platform and memory size, but on a powerful Silicon Mac , it takes 2-3 modifications to the code base. On a memory-limited Ubuntu VM, a single save after the initial compilation will error out.

šŸ”Ž Search Terms

watch recursion template literal types memory heap

šŸ•— Version & Regression Information

Tested with 4.4, 4.5, and nightly. Happens in all, consistently.

Tested on MacOS Silicon and Ubuntu.

Please keep and fill in the line that best applies:

āÆ Playground Link

This uses tsc watch, so a link to the playground is not possible. I provide sample code below.

šŸ’» Code

https://github.com/silviogutierrez/tsc-recursion-bug

Using node 14 and yarn.

To run into the issue:

  1. git clone https://github.com/silviogutierrez/tsc-recursion-bug.git
  2. yarn tsc --noEmit --watch
  3. Note the initial compilation is very fast.
  4. Open an editor and materially change client/routes.tsx. Change the name, etc.
  5. Further compilations are slow, and eventually you'll get a massive out of memory error.

šŸ™ Actual behavior

Out of memory error with a long stack trace. Very little info on why.

šŸ™‚ Expected behavior

Quick compilation that takes about the same time as the initial compilation. Not only is it much slower when it does work, but it crashes eventually.

More background

The issue is caused by ExtractRouteParams in monoroutes.tsx. If you replace the definition with:

type ExtractRouteParams<T> = any

The problem immediately goes away. Obviously this isn't what we want. You can review routes.tsx to see the intent of this type. That is, to strongly type params between string and numbers.

Moreover, the problem is related to route being a recursive function. See routes.tsx for an example of how it's used to create sub routes that share paths.

Other things that "fix" the error:

Happy to provide more context. Normally I would say this is a real edge case, but the fact that it only affects watch functionality and not the initial compile leads me to believe it needs to be addressed. Thanks for all the work on typescript!

jakebailey commented 2 years ago

An investigation update: I ran a profile of this and noticed that it's OOMing in typeToTypeNode while recursively cloning nodes. I thought (based on where it was) that #43973 would be related, but after noting that old versions crash nearly immediately with:

E:\work\TypeScript\built\local\tsc.js:110178
                throw e;
                ^

RangeError: Maximum call stack size exceeded
    at getSymbolChain (E:\work\TypeScript\built\local\tsc.js:49643:41)
    at lookupSymbolChainWorker (E:\work\TypeScript\built\local\tsc.js:49629:51)
    at lookupSymbolChain (E:\work\TypeScript\built\local\tsc.js:49622:24)
    at symbolToExpression (E:\work\TypeScript\built\local\tsc.js:49953:29)
    at E:\work\TypeScript\built\local\tsc.js:48667:106
    at withContext (E:\work\TypeScript\built\local\tsc.js:48707:37)
    at symbolToExpression (E:\work\TypeScript\built\local\tsc.js:48667:28)
    at symbolToStringWorker (E:\work\TypeScript\built\local\tsc.js:48586:30)
    at Object.usingSingleLineStringWriter (E:\work\TypeScript\built\local\tsc.js:13575:13)
    at symbolToString (E:\work\TypeScript\built\local\tsc.js:48584:73)

I bisected to #43733 where this seems to have turned from a stack overflow into an OOM, so this isn't new (but the OOM is).

Of course, this is interesting, but doesn't explain why the first compilation is fine but the next dies (one way or another).

silviogutierrez commented 2 years ago

@jakebailey : thanks for the update. Let me know if there's any additional information I can provide to help.

As I mentioned in the ticket, if you simplify ExtractRouteParams to only extract uncasted params, then the problem goes away and re-compilation is near instantaneous. Maybe that helps narrow down the issue.

jakebailey commented 2 years ago

Narrowed this down further; it seems like this is a combo of two things (so far):

  1. There seems to be a bug in visitAndTransformType; it marks the current type as visited, recurses, then marks the type as unvisited, but doesn't consider that it may be currently visiting it in a caller (so it's already marked, then unmarked, potentially breaking the in progress caller). Forcing it to return elided info when it happens stops the infinite growth, but it's not clear to me that is correct.
  2. Even if the growth is halted, each change adds a few hundred MB of memory usage that is never released. It's possible that this is the "real" bug, and the recursion I'm stopping is valid (and the allocations not).
jakebailey commented 2 years ago

For (2), it turns out that nothing is holding onto things other than the GC itself in its "old space", so once it hits the watermark (typically 2GB) it frees all of that memory, so I think it's just that there's an extra limit that needs to be applied, i.e. (1) but not instantly (which I've verified to work). I likely experience this because my machine has a large amount of memory so it's not uncommon for GC'd languages to not bother to GC anything as there's lots more memory still free.

As for why this triggers only on after the first change in watch mode, I'm unsure, and working on that (as it may offer a better explanation).

EDIT: while my first part is mostly correct, I was testing on a revision which had a much smaller limit than what it should, so after the first change the memory usage jumps up to nearly 4GB, so the limit itself is probably not enough.

silviogutierrez commented 2 years ago

@jakebailey : is there any way I can help diagnose or fix the issue? I would love to see this fixed in 4.7 as it really bogs down the development experience.

jakebailey commented 2 years ago

I'm not sure; my investigations into this weren't very fruitful and I sort of got to the point where it seemed like the type was just "really big". I know that I tried introducing some extra limits, but I think my implementation of them I was testing wasn't quite enough and broke a number of existing tests which test this sort of recursion.

I did at least figure out why it only broke on the watch event but not startup; there are some optimizations in the watcher that defers a lot of emit work until the first event. Before that change, your code blows up the compiler immediately.

I'll try and dig up my branch (or recreate it) and see if some limit is viable for 4.7, I've just been busy working on another large work item.

silviogutierrez commented 2 years ago

@jakebailey : understood, thank you for all the time you've put in.

I suppose it's entirely possible this is just "too big" a type. In those cases, I would imagine the compiler could just stop with a friendlier message, maybe about unlimited recursion, or the like. But I imagine this is hard to detect till its too "late".

As always, if you need me to test anything or can help out in any way, let me know.

jakebailey commented 2 years ago

My limiter change does work in that it prevents a crash, though thanks to how node's GC works, the process will appear to be using a lot of memory (depending on when the GC decides it wants the space back). Maybe that's sufficient to "fix" this, but the compilation takes quite a while so I wouldn't expect the type to become useful. I'll keep poking it, though.

I will say that a lot of the problem seems to be that it's big in both directions; normally we would hit a limit, but the type is also really "wide" in that each type here contains quite a few more types that are also recursed on (which I don't think we have much in the way of limits to).

jakebailey commented 2 years ago

@silviogutierrez I sent #48581, which should have an NPM package produced for you to test shortly. The PR's not complete (I have to make a test from your repro), but it doesn't OOM for me anymore. It's not fast, either, but... šŸ˜…

EDIT: here https://www.npmjs.com/package/@typescript-deploys/pr-build/v/4.7.0-pr-48581-7

jakebailey commented 2 years ago

I don't think my fix is actually "correct" enough; importing your test case as one of our tests, the .types baseline is a whopping 52MB, and the test takes some 45 seconds to complete, worse than any other test.

silviogutierrez commented 2 years ago

@jakebailey : tested it in the repro and it appears to work! It's not really any slower than the initial compilation, which I imagine is slow just because of the complexity of the type.

So I think the speed is an orthogonal issue: this is just a slow type, and the fact that initial + watch take about the same time seems acceptable. Even if it's not ideal.

Let me know if you need any other tests or feedback. Thanks again!

jakebailey commented 2 years ago

In writing the test for this, I noticed that this only reproduced when the test was split into two files; having them in the same file was always fast.

It turns out that the reason why this is so hard is because the ExtractRouteParams is unexported; during emit, there's no easy way to reference the type, so it has to recurse and try to print out the type as you've instantiated it.

If you just export ExtractRouteParams, it can be named in another file and everything becomes instant.

Does that workaround work for you? I think that may be more desirable for now until we can come up with some limits that work (adding new limits inevitably breaks someone who relied on them).

silviogutierrez commented 2 years ago

Oddly enough, it does go faster. And compiles fine the first time. But second+ times all fail with type errors and things further down the chain (basically, things that consume routes.tsx) have the params typed as any.

I will setup a repro if I can.

jakebailey commented 2 years ago

A repro would help, because in your original case, I was able to add export and then start modifying the routes.tsx file without issue.

The types show up correctly; the name applies:

image

silviogutierrez commented 2 years ago

Ok, here's a repro, and it's almost certainly a completely unrelated bug. But it deals with recursion too.

Please pull in https://github.com/silviogutierrez/tsc-recursion-bug and do git checkout maybe-different-bug to get the branch showing it.

You can use the tsc in package.json. No need to use your fix or the later versions.

Now: run tsc --noEmit --watch and notice it'll compile just fine.

Now update routes.tsx as before, change something trivial, like the path of a route. Notice now inference breaks and you'll get:

[2:13:55 PM] File change detected. Starting incremental compilation...

client/consumer.tsx:4:32 - error TS7006: Parameter 'props' implicitly has an 'any' type.

4 routes.UpdateTrinket.withHooks(props => {

So it seems watch is somehow stateful with inference and it breaks on recompiles.

Notes

Again, this seems like a different bug so I'm happy to file a new issue. Thanks again!

jakebailey commented 2 years ago

I'm not sure if it's different or the same, but I think it's interesting to note that if you set "declaration": true, then you'll see that it emits an error for routeFactory saying the type is too big to serialize, and the editor will complain about props having an implicit any, likely because the type is again too big for it to do anything with, it's just not crashing (probably due to the export?).

jakebailey commented 2 years ago

The same happens for the original case, too.

I feel like there must be a way to construct these types without hitting so many recursion limits. The intersections + conditional types seem to be a recipe for hitting limits; maybe defining more intermediary types would give the compiler more places to stop trying to clone entire type trees?

silviogutierrez commented 2 years ago

That's helpful, thanks. Would intermediary types inside a function still help? or will TS only use intermediary types at the top level scope? But then again, I'm not quite sure what exactly an intermediary type would look like here.

I don't want to take up your time and turn this into a tech support session just for my project. But I do wonder why others don't run into this more often. Am I pushing the type system too hard?

jakebailey commented 2 years ago

Probably not inside, but I haven't checked.

The thing to consider here is that for any type that is in an emitted d.ts, there must be a way to write that type. When you don't name types, TS has to write out a copy of that type. If the type is recursive, it will recurse. If it's too big, it'll stop and give an error saying "I can't write this type for you". If you were to want to write this d.ts file yourself, you'd probably name some of the types and reference them in other files, which is where the workaround fixes things.

It's probable that you're pushing it too hard, yes. You're doing intersections + conditional types + string template types. The conditional types are long and nested, intersections often mean that they need to be preserved, etc. It shouldn't be the case that we crash, that's still not good, but there may also be ways to write what you're intending without getting so fancy that bad things happen.

silviogutierrez commented 2 years ago

@jakebailey : I'm closing this out as the factoring out fixed things. Indeed the recursion was causing the issues. Thanks again for all your time and help.