dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.09k stars 1.56k forks source link

`dart:analysis_server` takes 19GB memory #52447

Closed fzyzcjy closed 10 months ago

fzyzcjy commented 1 year ago

Well, I have reported here https://github.com/dart-lang/sdk/issues/41793#issuecomment-1554378872, but I am not sure whether it is more reasonable to create a new issue, since the memory size is really really a lot IMHO. The original issue complains when it takes <2GB, while here it is nearly 20GB!

image

incendial commented 1 year ago

@fzyzcjy you have dart_code_metrics plugin enabled, right? The plugin API is the problem.

incendial commented 1 year ago

If you want to check what is the difference in memory with the standalone, let me know.

fzyzcjy commented 1 year ago

@incendial Hi thanks for the reply, but indeed no. I do not enable the dart_code_metrics plugin when using the analysis server; indeed I run dart_code_metrics separately to check lint problems. See the screenshot below (no plugins).

image

incendial commented 1 year ago

Ah, my bad then, my assumption was wrong. 93 contexts though... 🥲

fzyzcjy commented 1 year ago

93 contexts though... 🥲

Well... I know it looks like a nontrivial number. But is that too many? I have not heard of documents saying that I should use packages as few as possible... (I have checked my code, it seems not very easy to merge those packages...)

The total number of Dart lines is at the quantity of 100kloc (e.g. 200kloc, I forget the exact numbers). There are a few packages that are huge, and many are small.

keertip commented 1 year ago

/cc @scheglov @srawlins

scheglov commented 1 year ago

@bwilkerson @jacob314 as an anecdotal evidence that users with many analysis contexts exist :-D

jensjoha commented 1 year ago

From the screenshot you're running an old version of dart. The new version should have a "Collect Report" option at the top of the "Analysis Server Diagnostics" page which could help us diagnose this problem. (It also contains some memory optimizations so updating might even help the problem in itself).

fzyzcjy commented 1 year ago

@jensjoha Thanks for the information! Yes I am using Flutter 3.7 (and dare not update to 3.10.0 since many people are saying it is buggy (though exciting)... so may have to wait for something like 3.10.3). I will update this post with the collected report once I upgrade.

srawlins commented 1 year ago

Well... I know it looks like a nontrivial number. But is that too many? I have not heard of documents saying that I should use packages as few as possible...

We've been meaning to publish something along these lines, but we don't have the data to support a number. But 93 is a lot. Our analytics are showing that the vast majority of IDE sessions are opened to less than 5 contexts. And there is (currently) a linear correlation between # of contexts and memory usage.

fzyzcjy commented 1 year ago

@srawlins Thanks for the information! Then may I know how is this possible to only have 5 contexts? My usage is as follows (happy to provide more information when needed):

  1. Totally at the quantity of 100kloc lines (e.g. 200kloc, I forget the exact numbers)
  2. Have many packages which each one does one thing, just like this -> https://github.com/flutter/packages , instead of one huge giant package
  3. Seems inevitable to split into multi packages for one (group of) things. Take my open source lib https://github.com/fzyzcjy/flutter_convenient_test/tree/master/packages as an example. To allow execution from pure-dart (e.g. command line), I have to have *_dart packages. To have some code to in testing, I added the *_dev packages.
srawlins commented 1 year ago

Then may I know how is this possible to only have 5 contexts?

A separate context is created for every pubspec.yaml file (technically a .dart_tool/package_config.json file) and for every analysis_options.yaml file (except a directory with a .dart_tool/package_config.json file and an analysis_options.yaml file only results in one context, not two). It would be interesting to know how many pubspec.yaml are found in the root where you run your IDE. And how many analysis_options.yaml files.

One way to reduce the number of contexts is to open the IDE window to a more narrowly focused directory, somewhere inside of the directory containing 93 contexts.

I expect most users have only one context in their DAS instance because their IDE is open to just one Flutter app.

Take my open source lib https://github.com/fzyzcjy/flutter_convenient_test/tree/master/packages as an example. To allow execution from pure-dart (e.g. command line), I have to have _dart packages. To have some code to in testing, I added the _dev packages.

Can you explain a little better why so many packages are needed to support one feature?

All of the READMEs point to the first one, but I don't think the first one explains what any of the others are. Is there a page describing these different packages? In particular, I'm curious to know why a *_dart package is needed, and why a *_dev package is needed.

fzyzcjy commented 1 year ago

@srawlins Hi thanks for the reply!

It would be interesting to know how many pubspec.yaml are found in the root where you run your IDE. And how many analysis_options.yaml files.

I checked the name of my 93 contexts listed in Analysis Server webpage. Seems that each name corresponds to a real Dart/Flutter package that I wrote. So it seems to be 93 pubspec.yaml and 93 analysis_options.yaml.

One way to reduce the number of contexts is to open the IDE window to a more narrowly focused directory, somewhere inside of the directory containing 93 contexts.

Well, I need all those packages and very frequently navigate here and there among them... So I cannot just throw away some of them inside the IDE :/

Can you explain a little better why so many packages are needed to support one feature?

Sure!

srawlins commented 1 year ago

Thanks for the explanation. I don't have any further advice on reducing the number of contexts in play.

fzyzcjy commented 1 year ago

@srawlins Thank you for the reply. Then maybe this is a legitimate scenario to have this many packages.

incendial commented 1 year ago

@srawlins how many people you need to upvote the issue to address the problem with monorepos and multiple contexts? I've spoken with too many people already that really suffer from the analyzer perf (without plugins) on a scale. I believe it's a small percentage of the overall Dart devs, but the thing is with time this number will only grow.

srawlins commented 1 year ago

A great question! Now that we have analytics, we don't need upvotes as a signal as much any more. Upvotes are still good, and analytics can't answer all the questions.

This is largely a question of ROI. Some of the ideas we've thrown around are very high effort and complexity (bug-prone) and maintenance cost, in order to serve users with many analysis contexts. So we weigh that ROI against other work in the queue.

One item I'd like to accomplish, which is not too complicated, is to not create new analysis contexts for each analysis_options.yaml file, unless that file specifies language experiments. However, in @fzyzcjy 's case, that would literally help 0. 93 pubspec files yield 93 contexts. So that idea might not have good ROI even though it's not too complex.

We have other ideas, and the priority is informed largely by analytics.

fzyzcjy commented 1 year ago

@srawlins I see. But I am a bit confused and curious: I am just an (individual) developer so my codebase is quite small. There must be (a lot of) teams and groups who are using Flutter to develop apps. On the other hand, you mentioned most people have few contexts. So I guess it is either because the large teams only have small number of packages (which I do not know how... since I have explained my scenario above and it seems it is hard to be reduced), or because most people use Flutter for small apps?

incendial commented 1 year ago

@srawlins so we basically come to a situation when people with actual performance problems can't get any attention to their problems because on a scale their problems are insignificant to you.

I'm asking about numbers, because I can easily find 200+ people with the same problem as @fzyzcjy has without any effort.

I understand that large codebases are locked into using the analyzer the way it works now (with no alternatives and no way to migrate off Dart / Flutter in a reasonable time), but are various hacks you suggest (like opening a subfolder) really the best option they can have in a long run?

Can you also give more details how exactly this analytics is collected? Do people who open hello-worlds count?

@fzyzcjy I believe it's the latter.

srawlins commented 1 year ago

I am just an (individual) developer so my codebase is quite small. There must be (a lot of) teams and groups who are using Flutter to develop apps.

CC @bwilkerson and @jacob314 for any insights and I don't know what user stories we can derive from just the analytics (it's pretty limited data), but as a hunch, I would guess that many users literally open their VS Code window to a single Flutter app, with no internal packages 🤷 .

Someone posited on a different issue that ~99% of Dart developers do not write "packages" which are published to pub; they are writing Flutter apps. It would be really interesting to get analytics that help us understand that.

(Most of my VS Code windows are a single analysis context but I won't claim to be a typical flutter developer by any stretch. I usually have 3-5 VS Code windows open to different packages I'm working on within a "few days" timespan, like linter, dartdoc, analyzer, markdown, mockito. I'm very happy with 3-5 VS Code windows 🤷 .)

@srawlins so we basically come to a situation when people with actual performance problems can't get any attention to their problems because on a scale their problems are insignificant to you.

I apologize if it feels that way. I haven't closed this issue as "Won't Fix." It's just a matter of binning into priorities. So I cannot promise a timeline for a better many-packages-in-workspace Developer Experience.

Can you also give more details how exactly this analytics is collected? Do people who open hello-worlds count?

CC @bwilkerson I imagine we publish somewhere what data we collect, right? But yes, I think even hello worlds count. I'm not sure how often the data is collected for each user, which might distinguish hello world users from large app / multi-context users.

incendial commented 1 year ago

they are writing Flutter apps.

Even a flutter app can be separated into packages. For example, if you have several apps with the same reusable core blocks (have discussed at least two cases like this).

Most of my VS Code windows are a single analysis context but I won't claim to be a typical flutter developer by any stretch.

Yep, me neither, so I can't judge from my exp, but talking to people that get no help upsets me a lot. DCM itself has 13 contexts which is fine for now (rarely get more than 1.5 GB).

linter, dartdoc, analyzer, markdown, mockito. I'm very happy with 3-5 VS Code windows 🤷

They all are not in a single monorepo (and not very related to each other in terms of code changes, at least not all of them), so you basically have no options 🙂.

I apologize if it feels that way. I haven't closed this issue as "Won't Fix." It's just a matter of binning into priorities.

How to say that... considering the overall planning approach, low priority thing can become a won't fix if there is always something more important. And as I know, the number of people with many contexts is low relative to the total number in the analytics data you have.

nhannah commented 1 year ago

I work on a project with a much greater number of LoC (but less contexts at 49) and we used to have a similar issue to this, in the end our fix was to remove almost all of the barrel files. We had a lot of circular references that both slowed down analysis greatly and caused memory pressure. This also put a stop to our ambition to further split our app into more packages in our monorepo as this would further push our use of barrel files. If you have a package A that 3 other packages use and you make a change to A any file referencing that package will re-analyze, as will any file referencing those files, and so on down the line. Our analysis time had become a crawl, unusable at times, this change (removing most barrel files) made it something we don't any longer think about. In addition we saw our memory usage drop from over 20GB that would continually leak higher as we switched branches and worked to a more stable ~2GB that barely grows.

bwilkerson commented 1 year ago

I imagine we publish somewhere what data we collect, right?

I'm not aware of anywhere that this information is published. That got me curious, so I tried to find out what Flutter documents, and the only thing I could find was information about crash reporting and how to disable analytics: https://docs.flutter.dev/reference/crash-reporting. I might have missed something, but I don't think Flutter provides that data either.

That said, I can confirm that we're collecting analytics for all projects no matter how large or small, so yes, if a disproportionate number of users (that have enabled analytics) are opening small hello-world sized packages that might impact the numbers.

incendial commented 1 year ago

Thanks, could you also please share how does it work for forks, for example? Or for projects copied from a template like it usually happens for job interviews? No differentiation?

bwilkerson commented 1 year ago

There's no differentiation. In part, at least, because the analyzer has no idea whether it's analyzing either a fork or something copied from a template.

srawlins commented 1 year ago

in the end our fix was to remove almost all of the barrel files. [...] In addition we saw our memory usage drop from over 20GB that would continually leak higher as we switched branches and worked to a more stable ~2GB that barely grows.

This is a very useful data point and new to me! @scheglov @bwilkerson this is really interesting data. Presumably in this case, the same number of files with the same number of elements is being analyzed, but we get a savings of 90% of memory usage, by changing the shape of the import graph.

This too might bump the ROI of some of our ideas about invalidation of the element model. Previously I thought this would help the really bad cases of DAS coming to a slow crawl, e.g. during a build_runner build or a git checkout. But if a different library invalidation model can greatly reduce memory consumption, that's a great win.

Alternatively, this data point may just speak to some serious memory leak in our code related to file handles or something similar. :/

bwilkerson commented 1 year ago

I understand why barrel files could lead to a spike in memory usage (by increasing the number of libraries in a cycle). But I don't understand why we'd be leaking memory as a result of them.

nhannah commented 1 year ago

One of my other co-workers caught this at one point when we were first looking into this: https://github.com/flutter/flutter/issues/95092#issuecomment-1246839977

srawlins commented 1 year ago

I understand why barrel files could lead to a spike in memory usage (by increasing the number of libraries in a cycle). But I don't understand why we'd be leaking memory as a result of them.

My guess was that barrel files increase the number of library invalidations, which increase the number of file reads. And if there is a memory leak found in reading a file, parsing it and resolving it, then that memory leak can thus be exacerbated.


I honestly don't even know the answer to this simple question, but I can guess: If foo.dart imports bar.dart, and bar.dart changes, then isn't it true that we must:

  1. Re-parse bar.dart (either from an overlay sent by and IDE, or from disk).
  2. Resolve bar.dart and update (replace) the LibraryElement for bar.dart and its enclosing elements.
  3. Update the LibraryElement, and enclosing elements for foo.dart. In order to do this, we can't use the existing element model, because it would point to dead elements like the previous bar.dart LibraryElement, etc. And we don't have the AST for bar.dart lying around. So... we have to read foo.dart from disk? And re-parse and re-resolve? I think yes. According to the comment below, we have a cached copy of the text, which we can re-parse and re-solve.
scheglov commented 1 year ago

MemoryByteStore was supposed to be used for tests. We also use it for (as we thought) short lived AnalysisContextCollection instances. Maybe we should use MemoryCachingByteStore around NullByteStore instead, so that it does not grow infinitely. Anyway, the analysis server itself does not use MemoryByteStore.

A reason why large library cycles might cause growth of heap is that when you link a library cycle, you have to materialize other libraries as you access elements from them. Materialized libraries take much more heap than just the serialized bytes with the libraries. It still should not be unlimited growth, just might be large.

And of course we can never dismiss a possibility of a bug that causes a memory leak.

FYI, we read files only when we are told that the file changed. Otherwise, we always keep the last read content, and parse, link, resolve it.

fzyzcjy commented 1 year ago

Someone posited on a different issue that ~99% of Dart developers do not write "packages" which are published to pub; they are writing Flutter apps

My 80 packages are not published to pub mostly as well. Indeed the packages serve for my main app. Btw, I have heard some big companies even have their own private registry (like a pub.dev).

I usually have 3-5 VS Code windows open to different packages I'm working on

Ah I see the problem: I am using Intellij IDEA, which Dart also officially supports. This IDE does not support opening multiple windows at all if I understand correctly!

I haven't closed this issue as "Won't Fix." It's just a matter of binning into priorities. So I cannot promise a timeline for a better many-packages-in-workspace Developer Experience.

I can totally understand, take your time and thank you all the same! Dart is great and evolving all the time. I am just hoping that the memory issue can be solved, but it is totally OK if it is not prioritized since I know there are a lot of work to be done for the Dart team.

https://github.com/dart-lang/sdk/issues/52447#issuecomment-1561662213 in the end our fix was to remove almost all of the barrel files

Interesting solution to the memory problem and thanks! I may try that later when refactoring and report results here.

julemand101 commented 1 year ago

@fzyzcjy

Ah I see the problem: I am using Intellij IDEA, which Dart also officially supports. This IDE does not support opening multiple windows at all if I understand correctly!

I don't want this issue to be about what Intellij IDEA can or cannot do but I just want to add that Intellij does absolutely support multiple windows and support opening multiple projects at the same time. E.g. just try open another project and it should ask you if you want to open the project in another window:

image

But maybe you have selected "Don't ask again" at some point.

fzyzcjy commented 1 year ago

@julemand101 Interesting, thanks for pointing out! I will try that and see whether it works, and the memory overhead of intellij idea windows outweighs the memory of dart analyzer

jensjoha commented 1 year ago

For anyone who gets the many GB memory usage: Is any of it open source so we can try to reproduce it?

Also: has anyone with these problems tried the new version of dart to see what difference that makes?

It includes several fixes related to memory, especially with many contexts. For instance https://github.com/dart-lang/sdk/commit/9306752ba71352bd8ac39594ad151c34c8ab128a in my tests (with 8 contexts) reduced memory usage between 10-33%.

It also contains the "Collect Report" feature that I mentioned in https://github.com/dart-lang/sdk/issues/52447#issuecomment-1556635461 which could really help us out (though an open source reproduction would obviously be better).

PS: Dart 3 should also include https://github.com/dart-lang/sdk/commit/2b8ca6f2f5b48bf74a839222066c5b9c9f50d2eb (see https://github.com/dart-lang/sdk/issues/51126 for details) will likely better allow the VM to release (unused) memory.

PPS: I'd also like to give some attention to https://github.com/dart-lang/sdk/issues/51127 which would likely help a lot if fixed.

olexale commented 1 year ago

For those encountering performance issues, I would strongly suggest activating the "Only analyze project with open files" option if you're using Visual Studio Code (VSC), or the "Scope analysis to the current package" option if you're using AS/IntelliJ. Without these settings, in our project which consists of ~270 packages, the analyzer consumes around 16.88 GB of memory (Dart 3.0.2); however, with these settings activated, the memory usage was reduced to a maximum of 1 GB.

There's a trade-off here, as enabling these settings will limit your visibility on how modifications in one package may potentially affect others. This means you'll need to lean more heavily on CI checks. The advantage, however, is that your IDE's autocompletion feature becomes usable again, and you can save files without having to endure a lengthy wait (~1 minute) for the formatter to finish its task.

incendial commented 1 year ago

@olexale do you also heavily rely on barrel files and part of directives?

olexale commented 1 year ago

@incendial, we use a single barrel file per package to expose classes that are hidden under the src folder, and we heavily rely on code generation, so we use part of directives a lot.

fzyzcjy commented 1 year ago

@jensjoha From the screenshot you're running an old version of dart. The new version should have a "Collect Report" option at the top of the "Analysis Server Diagnostics" page which could help us diagnose this problem. (It also contains some memory optimizations so updating might even help the problem in itself).

Hi, I upgraded to Dart 3.0 and here is the report. (It is 16-18GB at the time I create the report, but sometimes it again becomes 19GB. I also removed and added some packages after the first post of the issue, so the number of packages will not exactly match what I said before)

dart_analyzer_diagnostics_report.json.zip

spkersten commented 1 year ago

We also have many package for our project (an app), about ~120. Running all tests and codegen takes ~1h for the whole code base. So to be able to qualify pull requests in an acceptable time and to be able to run tests/codegen while developing, we split the code into packages for individual features, so we can limit test/codegen to only packages that are affected by the work you're doing. Over time, the number has grown with the number of features (and their interactions) and to avoid circular dependencies.

jensjoha commented 1 year ago

Thanks @fzyzcjy! That's an interesting report.

It has 40,611,996 _Entry objects directly taking ~1.2GB of ram. Looking at a local running instance of the analyzer I also have (much much fewer) _Entry classes --- which has a Map in it. Considering the report also has 40,991,788 _Map objects directly taking ~2.4GB of ram it seems likely that this in itself uses ~3.6GB.

(looking)

So I seem to be able to reproduce this in Windows (though Mac seems to be the same, Linux isn't) --- with a session with now 40,434,365 _Entrys... The "Retained size" according to Observatory is 7.39GB (though, to be fair the memory usage now is also 17.4GB --- so maybe there are more leaks and/or the observatory number is off). Anyway, I should probably fix that.

fzyzcjy commented 1 year ago

@jensjoha Thanks for the information!

A bit more context (for the zip I provided): It goes to the huge memory after using for a while (instead of being there in the first minute), so I also expect it can be some kind of leaks.

polina-c commented 1 year ago

I recently enabled auto-snapshotting for analyzer, in case of memory overuse. Here is instruction how to configure it: go/analyser-memory-troubleshooting-with-auto-snapshotting

polina-c commented 1 year ago

Public document with instructions how to setup auto-snapshotting for analyzer in case of huge memory usage: https://docs.google.com/document/d/1mxCKhvyHUtGVhIH5Apx9y7V7itqCxAQPse6ghbjaDMM

polina-c commented 1 year ago

@fzyzcjy @spkersten

Does the issue still reproduces for you? May I ask you to configure auto-snapshotting in your IDE? Contact me directly at https://twitter.com/polinalinac if you have issues.

fzyzcjy commented 1 year ago

@polina-c Yes and no. It is ~10GB today (though it is still quite huge, it is much better than ~19GB). I am happy to configure auto snapshotting, but before that I need to confirm - I am on stable channel, is it OK, or do I need master Flutter?

jensjoha commented 1 year ago

FYI I would expect the issue to be fixed* on Flutter master --- the likely fix (https://github.com/dart-lang/sdk/commit/5aee0f61b6301099bb25a36d2b2ff197541d3c5b) (and https://github.com/dart-lang/sdk/commit/5e08794bbfc2c57b5b0522c6e2b22b419f6aa115 though that "leak" is only temporary) has landed a good while back. The fix seems to have been chery-picked in stable and landed (https://github.com/dart-lang/sdk/commit/7475c2c6c2df2c0a882a28a0e84606202f3151a8) which I think means it will be included in the next stable release (though I don't know when that will be).

* At least I'd expect it to get way down from 19/10 GB --- we previously saw that @fzyzcjy had lots of _Entry objects.

fzyzcjy commented 1 year ago

That looks great! I will wait for next stable and report the updates here.

polina-c commented 1 year ago

@polina-c Yes and no. It is ~10GB today (though it is still quite huge, it is much better than ~19GB). I am happy to configure auto snapshotting, but before that I need to confirm - I am on stable channel, is it OK, or do I need master Flutter?

Auto-snapshotting is available starting Dart 3.1.0-200.0.dev. Current Flutter stable has 3.0.5. So, master has it, but stable does not yet.

gabrielgarciagava commented 1 year ago

@fzyzcjy The fix was released on flutter version 3.10.6. You can give it a try to see how is the memory usage now.

fzyzcjy commented 1 year ago

@gabrielgarciagava I will do that soon (when focusing on frontend again)!