dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License

Analyzer is very slow for large files #56247

Open liamappelbe opened 1 month ago

liamappelbe commented 1 month ago

When generating ObjC bindings using ffigen, it's quite common to have generated files with >10k, or even >100k lines. The analyzer does fine at these sizes, but I'm trying to build a large integration test that generates ~500k lines, and the analyzer starts to choke.

Library          Lines of code   Time to analyze
Small library    15k             1.8 sec
Medium library   277k            1.6 sec
Large library    469k            1.6 hours

The files are here. (Couldn't use a gist as they're too big)

bwilkerson commented 1 month ago

@scheglov @mraleph

Thank you for the test case!

srawlins commented 1 month ago

Hours makes this almost a P1. That's pretty wild. I wonder if @scheglov can do an initial diagnosis / triage.

liamappelbe commented 1 month ago

My guess is that it used up all the memory on my laptop and started thrashing like crazy.

scheglov commented 1 month ago

I cannot reproduce it. But this might be because I don't have all the necessary libraries. I tried adding the following pubspec.yaml and running flutter pub upgrade:

name: issue56247

environment:
  sdk: ^3.5.0

dependencies:
  ffi: any
  objective_c: any

This helps a bit, but for large_library.dart there are still errors like these (with many cuts):

/Users/scheglov/tmp/2024-07-17/issue56247
    /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart
      errors:
        /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart(846756..846778): Undefined class 'NSComparisonResult'.
        /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart(5498764..5498787): Undefined class 'ObjCProtocolBuilder'.
        /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart(5706255..5706281): The name 'NSFastEnumerationState' isn't a type, so it can't be used as a type argument.
        /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart(5706748..5706771): Undefined class 'ObjCProtocolBuilder'.
        /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart(5706828..5706854): The name 'NSFastEnumerationState' isn't a type, so it can't be used as a type argument.
        /Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart(5750087..5750129): Undefined class 'NSItemProviderRepresentationVisibility'.
<cut>

It takes 21467 ms to analyze, no cache.

The heap usage in DevTools does not look too bad.

liamappelbe commented 1 month ago

> Which help a bit, but for large_library.dart there are still errors like (with many cuts)

Yeah, this generated code is using some unreleased features from package:objective_c. I plan to release it later this week, but if you want to try to repro now, add a git dep:

dependencies:
  objective_c:
    git:
      url: git@github.com:dart-lang/native.git
      path: pkgs/objective_c

There are 9 analysis errors when I run it locally (bugs in ffigen I haven't fixed yet).

scheglov commented 1 month ago

Indeed, only 9 errors, but the performance is still the same, 24357 ms with --observe, and the same heap usage.

mraleph commented 1 month ago

@liamappelbe can you still reproduce this? If you can, we should figure out whether you can capture some profiles for @scheglov.

liamappelbe commented 1 month ago

Yep, I can still repro. @scheglov just let me know the most useful thing I can capture for you.

scheglov commented 1 month ago

Here is the script that I used to measure timings and to profile. To profile, run it with --observe:5003 and open DevTools using the printed URI. There you can inspect the heap or the CPU profile. If necessary, wrap the analysis in a while (true) loop so it runs continuously while you watch.

import 'package:analyzer/dart/analysis/results.dart';
import 'package:analyzer/file_system/overlay_file_system.dart';
import 'package:analyzer/file_system/physical_file_system.dart';
import 'package:analyzer/src/dart/analysis/analysis_context_collection.dart';

void main() async {
  var resourceProvider = OverlayResourceProvider(
    PhysicalResourceProvider.INSTANCE,
  );

  var collection = AnalysisContextCollectionImpl(
    resourceProvider: resourceProvider,
    includedPaths: [
      '/Users/scheglov/tmp/2024-07-17/issue56247/lib/large_library.dart',
    ],
  );

  var timer = Stopwatch()..start();
  for (var analysisContext in collection.contexts) {
    print(analysisContext.contextRoot.root.path);
    var analysisSession = analysisContext.currentSession;
    for (var path in analysisContext.contextRoot.analyzedFiles()) {
      if (path.endsWith('.dart')) {
        var libResult = await analysisSession.getResolvedLibrary(path);
        if (libResult is ResolvedLibraryResult) {
          for (var unitResult in libResult.units) {
            print('    ${unitResult.path}');
            var ep = '\n        ';
            print('      errors:$ep${unitResult.errors.join(ep)}');
          }
        }
      }
    }
  }
  print('[time: ${timer.elapsedMilliseconds} ms]');

  await collection.dispose();
}
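
The "wrap the analysis into while (true)" suggestion can be sketched as a small timing harness. This is only an illustrative sketch: `timeRepeatedly` is a hypothetical helper name, the loop is bounded by an `iterations` parameter rather than running forever, and the delayed future is a placeholder for the real `getResolvedLibrary` loop from the script.

```dart
/// Runs [body] [iterations] times and returns each run's wall-clock time
/// in milliseconds. Hypothetical helper, not part of package:analyzer.
Future<List<int>> timeRepeatedly(
  Future<void> Function() body, {
  int iterations = 3,
}) async {
  final timings = <int>[];
  for (var i = 0; i < iterations; i++) {
    final timer = Stopwatch()..start();
    await body();
    timer.stop();
    timings.add(timer.elapsedMilliseconds);
    print('[run ${i + 1}: ${timer.elapsedMilliseconds} ms]');
  }
  return timings;
}

Future<void> main() async {
  // Placeholder workload. In the script, this closure would contain the
  // loop over collection.contexts that resolves each library.
  await timeRepeatedly(() async {
    await Future<void>.delayed(const Duration(milliseconds: 10));
  });
}
```

Repeating the run keeps the process alive long enough to attach DevTools, and comparing the first timing with later ones gives a rough sense of how much warm-up contributes.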
liamappelbe commented 1 month ago

I tried running that script, and I think it is just memory thrashing. If I run it after a fresh reboot, it finishes in a reasonable amount of time (~30 sec). But if I open a bunch of Chrome tabs, code editor tabs, etc., the analyzer takes forever. If I look at the observer on one of the slow runs, memory use is still about the same as what you're seeing (about 2 to 2.5 GB), though the observer page runs very slowly and the memory graph refreshes much less frequently.

There's probably not anything actionable here. When I run this integration test on github CI, I'll just have to make sure the bot has enough memory.
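
One way to check whether a CI bot has enough headroom is to log the process's resident set size around the analysis. The following is a minimal sketch, assuming the analysis runs in the same Dart process. `toMiB` and `describeMemory` are hypothetical helper names, while `ProcessInfo.currentRss` and `ProcessInfo.maxRss` come from `dart:io`.

```dart
import 'dart:io';

/// Converts a byte count to whole mebibytes.
int toMiB(int bytes) => bytes ~/ (1024 * 1024);

/// Formats the current and peak resident set size of this process.
/// Hypothetical helper for logging; the label is free-form.
String describeMemory(String label) =>
    '[$label] rss: ${toMiB(ProcessInfo.currentRss)} MiB, '
    'peak rss: ${toMiB(ProcessInfo.maxRss)} MiB';

void main() {
  print(describeMemory('before analysis'));
  // ... run the analysis here ...
  print(describeMemory('after analysis'));
}
```

Resident set size only counts pages currently in physical memory, so on a machine that is already swapping, the reported value can understate what the analyzer actually needs. The peak figure from an unconstrained run is probably the more useful number to compare against the bot's RAM.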