mraleph opened this issue 2 years ago
/cc @jensjoha @scheglov
Any update on this please @mraleph
I would like to add that personally, I've been observing this issue for quite some time and that there is a direct link between using barrel files and performance.
Context: I'm working on a project which is fairly compartmentalised. The structure of the project can roughly be described as:
- feature1/
  - subfeature/
    - file.dart
  - subfeature.dart (exports everything from subfeature/*)
- feature2/
  - ...
Features reference one another through these export files.
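To make this layout concrete, here is a minimal sketch of the barrel-file pattern described above (file and class names are illustrative, not from the actual project):

```dart
// feature1/subfeature/file.dart
class SubfeatureThing {}

// feature1/subfeature.dart - the barrel: re-exports everything in subfeature/
export 'subfeature/file.dart';

// feature2/feature2.dart - another feature's barrel, re-exporting feature1's
export '../feature1/subfeature.dart';

// Any file importing feature2's barrel now (transitively) depends on
// every library that feature1's barrel exports.
```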
This works really well for organising code and initially when the project was small the analyzer performance was great.
Now, a few thousand files later, the performance can get quite bad (10+ seconds for completions) when editing files that are imported by many other files.
Sometimes, when I'm editing a file that is directly or transitively used by many other libraries, the performance gets so bad that I just comment out its export definition in the barrel file, and the analyzer's responses are lightning fast again.
This issue has got me thinking...
Would it be possible for the analyzer to be made smart enough to detect boundaries of effect and only analyze the code that would be directly affected by a given change?
For instance, suppose we have:
```dart
class A {
  void doB() {
    print('C');
  }
}
```
If I edit the body of `doB`, for instance change `print('C');` to `print('A');`, the analyzer should only have to reanalyze the body of `doB`, since nothing outside the body is affected by the change.
Similarly, if I edit a private class, even adding methods to it, the only affected code would be within that library.
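For illustration (file names are hypothetical), a private class is only visible inside its own library, so no code elsewhere can reference it directly:

```dart
// lib/feature.dart
class _Cache {
  final _entries = <String, int>{};
  void put(String key, int value) => _entries[key] = value;
}

// The public API exposes nothing about _Cache's shape.
Object makeCache() => _Cache();

// lib/other.dart
import 'feature.dart';

// `_Cache` is not accessible here; adding methods to it
// cannot change anything this library can see.
final cache = makeCache();
```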
@scheglov Do you perhaps know, does the analyzer do any of this?
When you edit just a method body, the analyzer:

1. Invalidates the `LibraryElement`s for this library and for any library that depends on it (this might be 1000 files in a single library cycle). This is fast (breaking is faster than building).
2. Rebuilds the `LibraryElement` for it. Because all API signatures are the same, we reuse the cached serialized `LibraryElement` (the summary). But deserializing costs something. This has low cost; we load mostly just the header to know which libraries we have in the summary.
3. We don't have the ability to analyze individual method bodies, only whole files. We do this for completion, but in those cases we skip computing diagnostics.
But when you edit a class, even private, then API signatures do change, and this causes a lot of work to relink all libraries, and reanalyze all files that depend on the changed file.
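The distinction can be sketched as follows (a hypothetical example, not from the analyzer's code): the first edit leaves every API signature intact, while the second changes the library's API signature and forces relinking of the whole library cycle:

```dart
// lib/a.dart
class A {
  int count() {
    return 1; // Edit 1: change this body to `return 2;`.
    // No signature changes, so dependent libraries keep their
    // cached summaries; only this file needs reanalysis.
  }
}

class _Helper {
  // Edit 2: add `void reset() {}` here. Even though _Helper is
  // private, the library's API signature changes, so every
  // library in the same library cycle must be relinked.
}
```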
Optimizations are not impossible, but not trivial to implement and keep robust. I suspect that this will require more effort than we can afford right now.
Thank you for the explanation.
@scheglov If it helps with diagnosing the root cause, I was able to write up a reproducer for this issue.
```dart
// generate.dart
import 'dart:io';

const _core = 'core.dart';

final _folders = <String>[
  ...'abcdefghijklmnop'.split(''),
];

final _subfolders = [
  ...'qrstuvwxyz'.split(''),
];

final _filenames = [
  for (var i = 0; i < 10; i++) 'f$i',
];

void main(List<String> args) async {
  for (var folder in _folders) {
    final batch = <Future<void>>[];
    for (var subfolder in _subfolders) {
      for (var file in _filenames) {
        batch.add(putFile(folder, subfolder, file));
      }
      batch.add(_writeToFile(_rooted([folder, subfolder]) + '.dart', (sink) {
        sink.writeExportFileContents(libraries: _filenames.map((file) {
          return '$subfolder/$file.dart';
        }));
      }));
    }
    batch.add(_writeToFile(_rooted(['$folder.dart']), (sink) {
      sink.writeExportFileContents(libraries: _subfolders.map((subfolder) {
        return '$folder/$subfolder.dart';
      }));
    }));
    await Future.wait(batch);
  }
  await _writeToFile(_rooted([_core]), (sink) {
    sink.writeCoreFileContents(folders: _folders);
  });
}

Future<void> putFile(String folder, String subfolder, String file) async {
  final baseClass = '${folder.ucFirst}${subfolder.ucFirst}${file.ucFirst}';
  final parts = <String, String>{
    'part1': 'Impl1 implements ${baseClass}Source',
    'part2': 'Impl2 implements ${baseClass}Source',
    'part3': 'Impl3 implements ${baseClass}Source',
  };
  final fileBase = _rooted([folder, subfolder, file]);
  final lib = '$file.dart';
  await Future.wait([
    _writeToFile(fileBase + '.dart', (write) {
      write.writeFileContents(
        baseClass: baseClass,
        filename: file,
        parts: parts.keys,
      );
    }),
    for (var MapEntry(key: partName, value: suffix) in parts.entries)
      _writeToFile(fileBase + '.part.$partName.dart', (write) {
        write.writePartFileContents(
          library: lib,
          className: '$baseClass$suffix',
        );
      }),
  ]);
}

Future<void> _writeToFile(
  String filePath,
  void Function(StringSink sink) write,
) async {
  final file = File(filePath);
  if (!await file.parent.exists()) await file.parent.create(recursive: true);
  final sink = file.openWrite();
  write(sink);
  await sink.flush();
  await sink.close();
}

extension on StringSink {
  void writeFileContents({
    required String baseClass,
    required String filename,
    required Iterable<String> parts,
  }) {
    writeln('import "../../core.dart";');
    writeln();
    for (var partName in parts) {
      writeln('part "$filename.part.$partName.dart";');
    }
    writeln();
    writeln('class $baseClass {');
    writeln(' static const useOfCore = Core.hello;');
    writeln('}');
    writeln('class ${baseClass}Source implements $baseClass {}');
  }

  void writePartFileContents({
    required String library,
    required String className,
  }) {
    writeln('part of "$library";');
    writeln();
    writeln('class $className {}');
  }

  void writeExportFileContents({
    required Iterable<String> libraries,
  }) {
    for (var file in libraries) {
      writeln('export "$file";');
    }
  }

  void writeCoreFileContents({
    required List<String> folders,
  }) {
    for (var folder in folders) {
      writeln('export "$folder.dart";');
    }
    writeln('class Core {');
    writeln(' static const hello = "Hello, World!";');
    writeln('}');
  }
}

String _rooted(List<String> parts) {
  return (['lib']).followedBy(parts).join(Platform.pathSeparator);
}

extension on String {
  String get ucFirst {
    return this[0].toUpperCase() + this.substring(1);
  }
}
```
Steps:

1. Save the script above as `generate.dart`.
2. Run `dart run generate.dart` from the root of the project. (This will generate the necessary source code.)
3. Open `lib/a/z/f0.dart` and change

   ```dart
   class AZF0 {
     static const useOfCore = Core.hello;
   }
   ```

   to

   ```dart
   class AZF0 {
     static const useOfCore = Core.hello + Core.hello; // or AZF3.useOfCore
   }
   ```

4. Remove `export "z/f0.dart";` from `lib/a/z.dart`.
5. Open `lib/a/z/f0.dart` and attempt to perform the same edits as before.

While this example is contrived, the dependencies between different libraries and the general structure closely resemble those of the real-life project I'm working on.
One other note I'd like to add is that in this example (and our live project) we have a circular dependency: there is a `core.dart` file which exports all files and is subsequently used by the leaf libraries. This might seem like something that wouldn't happen in the real world.
However, in our project we have a genuine situation like this. We have code generators that pull specific configurations from different parts of the project and collect them in one library. That library is then also used by other parts of the app, which probably leads to certain files transitively importing most of the application simply because they use the joint configuration.
I hope this helps.
Environment:

- `dart --version`: Dart SDK version 3.5.0 (stable) (Tue Jul 30 02:17:59 2024 -0700) on "macos_arm64"
- Freshly deleted `~/.dartServer/.analysis-driver/`
- Editor: version 1.93.0-insider, commit 4693ac3d4dce852b747d08d4bc54db5e35afe268
- Device/OS: MacBook Air M2 2022, macOS Sonoma 14.6.1
Thank you for the code that generates the test case.
The reason we don't fix this is not that we don't know about the issue; implementing incremental resolution will simply require more effort than we can afford right now.
OK. So, this isn't a case of "the analyzer has a small bug in resolving libraries", but rather "the way the analyzer resolves libraries is suboptimal".

The thing about the test case that made it seem like there could be a simple answer is the fact that if I removed `export "z/f0.dart";` (the file I was editing), the performance was restored.
I hope at least the additional test case helps.
Sorry for the noise.
Yes, the way the analyzer works is not performant in this case.
The analyzer builds element models for Dart libraries. But it does it for whole library cycles, where a single library cycle consists of libraries that import or export each other. And when you change any file in a library cycle, the whole cycle has to be relinked.
When you remove the `export`, this probably moves `z/f0.dart` out of the library cycle, so editing it does not invalidate the (large) library cycle, and so it is much faster.
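A minimal illustration of such a cycle, reduced from the generated files (the direct export is a simplification; in the real setup it goes through the intermediate barrels):

```dart
// lib/core.dart
export 'a/z/f0.dart';

class Core {
  static const hello = 'Hello, World!';
}

// lib/a/z/f0.dart
import '../../core.dart'; // imports the barrel that (transitively) exports this file

class AZF0 {
  static const useOfCore = Core.hello;
}

// Because core.dart exports f0.dart and f0.dart imports core.dart,
// both libraries are in one cycle: editing either relinks both.
```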
The generated files look like this:
```dart
// lib/core.dart
export "a.dart";
export "b.dart";
export "c.dart";
export "d.dart";
export "e.dart";
export "f.dart";
export "g.dart";
export "h.dart";
export "i.dart";
export "j.dart";
export "k.dart";
export "l.dart";
export "m.dart";
export "n.dart";
export "o.dart";
export "p.dart";

class Core {
  static const hello = "Hello, World!";
}

// lib/a.dart
export "a/q.dart";
export "a/r.dart";
export "a/s.dart";
export "a/t.dart";
export "a/u.dart";
export "a/v.dart";
export "a/w.dart";
export "a/x.dart";
export "a/y.dart";
export "a/z.dart";

// lib/a/z.dart
export "z/f0.dart";
export "z/f1.dart";
export "z/f2.dart";
export "z/f3.dart";
export "z/f4.dart";
export "z/f5.dart";
export "z/f6.dart";
export "z/f7.dart";
export "z/f8.dart";
export "z/f9.dart";

// lib/a/z/f0.dart
import "../../core.dart";

part "f0.part.part1.dart";
part "f0.part.part2.dart";
part "f0.part.part3.dart";

class AZF0 {
  static const useOfCore = Core.hello;
}
class AZF0Source implements AZF0 {}
```
Wouldn't removing `z/f0.dart` from `lib/a/z.dart` still cause a computation of a large library cycle? The file `lib/a/z/f0.dart` still imports `lib/core.dart` and, by extension, all other libraries.
> Wouldn't removing `z/f0.dart` from `lib/a/z.dart` still cause a computation of a large library cycle?
It makes `z/f0.dart` no longer a (transitive) dependency of the `core.dart` file, which removes it from the cycle.

The issue really is not barrel files themselves; it is barrel files being imported by the libraries they export. That creates a cycle, because now every library depends on every other library (transitively, through the `core.dart` exports).

If you removed the `core.dart` import from the top of the generated `z/fx.dart` files, it would also fix the issue.
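In other words, a hypothetical before/after of that fix (the file split is assumed, not part of the generated example):

```dart
// lib/core_class.dart - hypothetical: Core moved out of the barrel
class Core {
  static const hello = 'Hello, World!';
}

// lib/core.dart - the barrel still exports everything, including Core
export 'core_class.dart';
export 'a.dart';
// ... remaining exports ...

// lib/a/z/f0.dart - imports the narrow library instead of the barrel,
// so it no longer participates in the barrel's library cycle
import '../../core_class.dart';

class AZF0 {
  static const useOfCore = Core.hello;
}
```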
Removing the export of `z/f0.dart` will cause computation of a large library cycle, once. This will cost you something, but at least after that, typing in `z/f0.dart` during an attempt to get code completion will not immediately invalidate this same library cycle and cause recomputation during code completion.
> Removing the export of `z/f0.dart` will cause computation of a large library cycle, once. This will cost you something, but at least after that, typing in `z/f0.dart` during an attempt to get code completion will not immediately invalidate this same library cycle and cause recomputation during code completion.
After my previous comment, I realised that what you were probably communicating was that once the export of `z/f0.dart` was removed, no other library was using it, and as such edits to it didn't cause reanalysis of other libraries.
Thank you for putting up with me.
I just have one other idea, which isn't necessarily related to this issue but might help ease it.
What if the analyzer had an option to only cause reanalysis of depending libraries when a given library is free of syntax errors?
If I understand correctly, while I'm editing a file the analyzer only needs to know about the definitions of other files I use in this file in order to provide me with completions and other language features.
The analyzer only needs to reanalyze other files in order to prove their correctness as a result of changes I made to this file i.e. the depending files.
The annoying part of this issue isn't that analysis is slow, but rather that editing is.
For instance, if I want to write a piece of code, then for most of the time I spend writing it, the code will be in an invalid state (have syntax or analysis errors). Due to this issue, the entire writing process will be slow.
However, if full analysis happened only after I completed the piece of code I wanted to write, which is presumably when it is free of syntax errors (or if we want to be more extreme all analysis errors), then I wouldn't mind the wait. I probably wouldn't even notice, since I'd be thinking what to write next.
Personally, I would probably use something like this even if this issue didn't exist. I'd love it if I could configure the analyzer to perform shallow analysis while editing a file and on file save perform a project-wide / deep analysis of how my edits affected other parts.
However, I don't know just how difficult something like this would be to implement. And if it is generally useful enough. And if the implementation is actually the incremental compilation being talked about in this issue.
To summarise: currently, when editing a library, two types of analysis are going on:

1. Analysis of libraries the current library uses.
2. Analysis of libraries that use the current library.

Analysis 1 provides the information required for completions and correctness checks in the library the user is writing. Analysis 2 informs the user of general program correctness.
However, only analysis 1 actually helps the developer when writing out a piece of library code. Analysis 2 ensures the entire program is correct, but while editing, the developer might not care about this as much.
For instance, if I'm modifying a class, I'm not really thinking about all the places this class is being used. I'm just focused on writing out the code I need to write. And only after writing everything out, I'd like to know how my changes affect the program as a whole.
If this summary is correct (both technical and what developers value), it seems, the analyzer might be doing a lot of work for results that don't really help the programmer write new code. (If I'm editing a file and not looking at all the errors being highlighted in other files, I'm not leveraging the analyzer's full potential.)
So if we could delay analysis 2 until the user finishes, we might be able to make editing as a whole smoother, even if the actual project-wide analysis is still suboptimal for cases of large library cycles.
You are right that when you edit a library, the performance of editing (completion in particular) should not depend on how many uses of this library there are.
I have a couple of comments about this.
The hypothesis that "analysis of libraries the current library uses" is fast might be false. I can imagine that if the current library is part of a large library cycle, then in order to get the element model for this library cycle, so that we can resolve the current library, we need to parse and link all the libraries of the cycle. We don't resolve method bodies, but it is still not necessarily fast.
It is possible that, because of the way work scheduling inside the Dart analysis server is implemented, we pick up another, non-priority operation, e.g. analysis of a user of the current library, after we finish processing the current library. The user in the IDE might type one character and get a completion; we start something else that is heavy; the user types again, and we don't respond because we are busy. Dart has only a single thread of execution. We have an idea to push all non-priority work into a separate isolate, but again it is all limited by the availability of hands to do it.
Based on comments from @scheglov and @jakemac53, I was able to refactor our code base such that analyzer performance was greatly improved. Thank you for giving me the missing pieces and helping me solve our problem.
The key was in breaking the self-referencing library cycle.
In our case, this was achieved by moving the configurations that pulled in most of the application aside and injecting them into the main program, rather than statically referencing them across the application. (This means we provide the configuration to the application in `main()`. This file is never used by other parts of the app, so it's OK that it pulls in everything.)
We got to keep all the barrel files and the desired structure, we simply needed to shift some imports around.
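A rough sketch of the shape of that refactor (all names are hypothetical):

```dart
// Before: lib/feature/widget.dart statically imported the aggregated
// configuration, making it depend on most of the app:
//   import '../app_config.dart';

// After: lib/feature/widget.dart depends only on a small abstraction.
abstract class FeatureConfig {
  String get apiUrl;
}

class FeatureWidget {
  FeatureWidget(this.config);
  final FeatureConfig config;
}

// bin/main.dart is now the only file that imports the heavy aggregate.
// Nothing imports main.dart, so its huge dependency set is harmless:
//   import 'package:app/app_config.dart';
//   void main() => runApp(FeatureWidget(appConfig));
```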
Some libraries that many other libraries depend on still aren't as performant to edit as I'd like; however, the performance is miles better than what it was at the start of the refactor. (And in these cases I can still use the trick of commenting out the export, and full performance is back again.)
Just out of curiosity, are these analyzer limitations documented anywhere? I'd imagine if developers knew what patterns to avoid, they could leverage the analyzer much better.
Note that this issue is much more exaggerated when editing a macro - auto completion in the macro itself is always scheduled behind the re-compilation and re-application of the macro. This results in a poor experience when editing macros, even if they are only imported/applied in a couple libraries.
I don't know how to fix this other than by somehow changing the task scheduler in the analysis server so that events could have some sort of priority, and slower tasks that had already started could yield to higher-priority events when they come in. However, that is likely a very large amount of work.
> Note that this issue is much more exaggerated when editing a macro - auto completion in the macro itself is always scheduled behind the re-compilation and re-application of the macro.
I have to question how useful it is to a macro author to have the macro run while they're in the process of editing it. What's the motivation for that decision?
> I don't know how to fix this other than somehow changing the scheduler of tasks in the analysis server such that events could have some sort of priority ...
Currently there isn't any explicit scheduler, the analysis server just processes requests as they arrive. This needs to change for other reasons, and the scheduler that we've designed ought to be able to handle at least part of this proposal.
The ability for lower priority tasks to yield to higher priority tasks is an interesting idea, but will also require some work on the "handlers" (the objects that perform the tasks). Fortunately, supporting some kind of yielding would be an incremental piece on top of the work we're already planning to do, so we can return to this issue after there's a scheduler in place.
@keertip
> I have to question how useful it is to a macro author to have the macro run while they're in the process of editing it. What's the motivation for that decision?
When making a macro it is very handy to have the augmentation open for some library where the macro is applied, so you can see the actual output changing as you update the macro.
I think the current behavior is mostly just the result of not doing anything special as opposed to an explicit decision.
> When making a macro it is very handy to have the augmentation open for some library where the macro is applied, so you can see the actual output changing as you update the macro.
Absolutely! But not when the code is incomplete and the results are known to not be interesting. Seems like it would be a better UX if there were a way to signal when it's useful to rerun the macro. I'm not sure what the right signal would be, but it sounds like compiling and executing the code less often might improve completion performance, which might produce a better UX overall.
Quite possibly even just only re-compiling/applying macros on save, instead of for each character typed, would be enough. If that is feasible to do.
We do get notifications when a file is saved, but many IDEs, including IntelliJ and VS Code, have an auto-save feature that makes 'saved' a less interesting signal.
> We do get notifications when a file is saved, but many IDEs, including IntelliJ and VS Code, have an auto-save feature that makes 'saved' a less interesting signal.
Depends on how long the delay is... honestly even a very short delay here would dramatically improve the situation, so that if you are actively typing on a line you would get a more instant response for auto complete and such. Once you stop typing you are less likely to notice if the analyzer is doing a bunch of other work.
It isn't a perfect solution to be sure, but should be better than the status quo.
We could certainly try it and see whether it's enough of an improvement. (After we have the ability to schedule work, of course.)
Barrel files are files which repackage multiple libraries into a single library by exporting them. https://github.com/flutter/flutter/blob/master/packages/flutter/lib/material.dart is a good example of a barrel file. Developers usually introduce barrel files to streamline their import management.
Barrel files have an obvious drawback: they introduce false dependency edges between libraries. Consider for example the code below:
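The example code is missing from this copy of the issue; a minimal reconstruction of the layout described in the next paragraph (file names `a.dart` through `d.dart` taken from the text, contents assumed) might look like:

```dart
// b.dart
class B {}

// a.dart - the barrel, re-exporting b.dart (and possibly more)
export 'b.dart';

// c.dart - imports the barrel, not b.dart directly
import 'a.dart';

// d.dart - likewise imports only the barrel
import 'a.dart';

// Neither c.dart nor d.dart may actually use B, yet both now have a
// dependency edge to b.dart through the barrel's export.
```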
In this situation, when a developer changes `b.dart`, incremental tooling which naively follows import edges will end up invalidating both `c.dart` and `d.dart`. In a graph with many libraries this can have disastrous cascading consequences.
There is some evidence from users that eradicating barrel files considerably reduces reanalysis times for them. It would be nice to analyze this problem and understand what we can do to mitigate it.
This is both an analyzer and a CFE issue.