dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.23k stars 1.58k forks source link

A directory which is "excluded" via analysis options should not be searched for other analysis options. #56471

Open srawlins opened 2 months ago

srawlins commented 2 months ago

TL;DR What use case is there for excluding a directory from analysis, but desiring to still respect any analysis options files found inside an excluded directory?

This idea primarily comes from a chat @matanlurey and I had. That chat followed a chat thread with @bwilkerson, @keertip, @scheglov, @pq, and @jakemac53.

Matan notes that the Flutter engine repo was changed recently such that all third party code which is vendored in, is now located in the repository, at third_party/. It was previously located outside the repository, at ../third_party/. And it comes with a lot of static analysis reports, errors, warnings, TODOs, lints, etc. When Matan tried to exclude ./third_party from analysis, via the root analysis options, it didn't work. (This vendored directory is sort of a pub cache, managed by gclient rather than dart pub.)

We determined that this is sort of working as intended: the analyzer knows about the excluded files, and will not report static errors to an IDE (or stdout when invoked via dart analyze). But the analyzer still chooses to walk the directory tree in the analysis root, looking for analysis_options.yaml files (and I believe pubspec.yaml files and .dart_tool/package_config.json files), which all inform analyzer about how to analyze files in the tree under an analysis options files (and how to resolve code under a pubspec or package config file). Any such deep analysis options files do not inherit any options from a parent analysis options file (that only happens with an explicit include: field). The deep options are fresh, and may not exclude the same files. In the case of the Flutter engine repo, of course none of the analysis options files in third_party/ exclude their entire contents; it's just vendored code.

I think the team had all agreed on this intended functionality. But Matan and I then tried to figure out why that might be important.

The question is: what use case is there for excluding a directory from analysis, but desiring to still respect any analysis options files found inside an excluded directory?

Matan and I couldn't think of any. Certainly, for this case of a directory of vendored code, if it is acting like a pub cache, then you don't want to see any static issues contained (in order to follow our behavior of not analyzing code outside the analysis root, like code in the global pub cache). Even for the general case: if the developer has specified: "exclude directory X: don't analyze inside directory X" and then an analysis options file is found inside directory X (maybe deep), that doesn't mean we should start analyzing files there.

Not respecting analysis options files in excluded directories would be a breaking change, which I think is fine.

srawlins commented 2 months ago

@scheglov had pointed out an interesting implementation consideration: I think the gist was that today, when analyzer is opened at an analysis root, we do not need to open and understand any files in order to know where we are allowed to search for files. We just go forth.

With this new feature, it seems like we will need to walk directories BFS-style:

  1. Found a directory, does it have an analysis options file? a. Yes? Read it and parse it and determine any directories we should not walk into. b. No? Continue.
  2. Analyze each file in the directory (or whatever we do with files).
  3. For each directory in the directory, only recurse into it, with step 1, if it is not excluded. Hang onto all parent exclude lists as you recurse.

(Modulo excluding globs and files and whatever else we support.)

I think at a minimum we would also need to enforce that exclude: lists cannot start with ../ (or contain ../, whatever); it would get non-sensical if the analyzer was directed by a file to not read itself.