dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License
5.22k stars 1.35k forks source link

Analyzers prototyping - Loading and acquisition #9633

Closed JanKrivanek closed 3 months ago

JanKrivanek commented 8 months ago

Context

9627

Built-in analyzers will be part of MSBuild (but separate assembly probably?), custom will be possible to plug-in post-installation. Analyzer management module will need to be able to discover and load analuzers - to provide them to the infractructure (that can then register and configure them based on the default and user configuration)

Things to be investigated and decided

rainersigwald commented 8 months ago

Brain dump

I think we agree that the ideal situation for .NET projects is to get MSBuild analzyers via NuGet package reference, just like acquiring additional build logic or Roslyn source analyzers/generators.

For C++ NuGet isn't pervasive so we might need to coordinate with them for their ideal desires.

NuGet can have assets in the project.assets.json file that get rendered into items at build time, and can also set up build logic that will be imported by the project evaluation that happens at build time.

Any registration of analyzers within the project will run into similar problems: some of the analysis rules we'd like to enable would be best run during evaluation--things like rules on condition expressions, rules about defining or not defining certain properties in certain locations, and so on. If we haven't discovered and enabled all of the relevant analyzers already before evaluation starts, we either can't have rules like that or have to unconditionally buffer all the relevant analysis state or events until we do, then replay them through the analyzers.

Roslyn analyzers don't have this problem because they can't be discovered within the compiler--they're discovered by the MSBuild layer and then a flat list is passed to the compiler at its initialization/command line start time.

That's an option for MSBuild analyzers too but has two big problems:

Analyzers could be entirely offline--working on a binlog replay, so you can assemble the analyzers during build somehow and then analyze afterward. This could even be transparent to the user (if build discovers analyzers automatically replay the binlog after the build completes but before we exit).

There could also be special handling for analyzers in NuGet restore--some kind of special file that we look at before starting to evaluate a project that has analyzer registrations in it. However, this would ideally be in obj, just like project.assets.json, and the location of obj is configurable via property--so we might have to (partially?) evaluate the project to find the analyzers then evaluate the project!

There's a possibility of making some kind of analyzer configuration note in memory during restore and using it during build, but that wouldn't work when the restore and build steps are in separate processes/invocations.

JanKrivanek commented 8 months ago

C++ is a great point! - I haven't thought about that much. Lets return to it once we have some stronger opinion on the net.sdk

What is your thoughts on offering standalone CLI for handling analyzers? Something along the lines of

> git clone <your repo> && git checkout <branch> && git pull
...
> dotnet build install-analyzer <package1> <package2> <package3>
...
> dotnet build <proj>

While not leveraging current execution paths (so partially 'reinventing the wheel'), it doesn't suffer the disadvantages of current restore execution path.

We can reuse (or even generalize and import) the logic and experience that's part of templating engine (dotnet new install, dotnet new search, dotnet new list) - allowing users to discover and use analyzers with low friction. We can still allow to reference them as standard nugets - but for the ones that need evaluation time context we can e.g. issue warning pointing the partiall blindness of the analyzer installed that way and option to overcome this with dedicate CLI.

All that wouldn't of course be a V1 thing.. but if seen as viable way, we'd know that for V1 we can rely on fixed location containing analyzers.

rainersigwald commented 8 months ago

As a flow I don't mind that too much but how would it work concretely? What would dotnet build install-analyzer do and how would subsequent dotnet build invocations pick it up?

YuliiaKovalova commented 8 months ago

As a flow I don't mind that too much but how would it work concretely? What would dotnet build install-analyzer do and how would subsequent dotnet build invocations pick it up?

dotnet build install-analyzer will allow to restore custom analyzers (nuget package?) - basically it downloads and unpacks them.

JanKrivanek commented 8 months ago

Maybe first 1 important question I didn't put nowhere in this item or discussion: Where can we afford to have the declaration of used analyzer?

I guess you know where I'm heading :-) - the acquisition as part of restore will bring complications - can we avoid this alltogether? That's how I was imagining the separate CLI would work

If we absolutely need to follow the suit of PackageReferences - then we're brainstorming with @YuliiaKovalova a two-staged acquisition:

It would be nice to brainstorm (not necessarily stick to) some solution(s), that do not require sacrifice in a form of over-buffering or over-emitting data, because we do not know yet if they will be needed at some point.

rainersigwald commented 8 months ago

Can we afford to have something of our own?

I would rate this as a big con of an approach that requires it, but not necessarily rule it out entirely.

rainersigwald commented 8 months ago

basically it downloads and unpacks them.

That part makes sense, but how does a subsequent build invocation know where they are and that they should be hooked up?

YuliiaKovalova commented 8 months ago

basically it downloads and unpacks them.

That part makes sense, but how does a subsequent build invocation know where they are and that they should be hooked up?

Options provided by @JanKrivanek:

  1. Hardcoded well known folder (e.g. '.analyzers' next to projectfile or in any folder up the folder structure - similar to how .git, .editorconfig etc. work), everything in that folder is loaded;
  2. Custom configuration file (e.g. analyzers.json) - again, similar location logic as .editorconfig;
  3. Mentioned in msbuild project files (same as roslyn)
KalleOlaviNiemitalo commented 8 months ago

Could it be a command-line option read from Directory.Build.rsp? I expect this would be read early enough. MSBuild apparently supports %MSBuildThisFileDirectory% in there (implementation PR https://github.com/dotnet/msbuild/pull/4499, documentation request https://github.com/MicrosoftDocs/visualstudio-docs/issues/9971), so the option could even reference an analyzer at a directory relative to the location of Directory.Build.rsp.

YuliiaKovalova commented 7 months ago

Existing constraints:

  1. VS doesn't rely on the command line, and we may need to provide an alternative way of communication if analysis is required for the build.
  2. Modifying the Restore target can be problematic since it is a feature request for the NuGet team.
  3. The existing mechanism adds information about packages to the .props file in the obj folder too late; some evaluation data may not be available anymore.
  4. Creating a separate CLI is conforming to the current standard setup by Roslyn analyzers.
  5. Hooking using .rsp file is possible, but analyzer packages may not be restored yet. .rsp capabilities are restricted now and don't allow to do package restore.
  6. Adding sources to the repo (the plan to have a folder with all rules) isn't a good practice and can cause pushback from customers (perhaps we can put the sources in the .vs folder).
  7. Currently, handling Roslyn rules is managed by the SDK itself, which goes through package.json files and searches for dedicated packages using pattern matching - we don't plan to rely on this experience.

Subproblems to the mentioned above: 1 - "checked into the repo" == how do we express what analyzers are wanted (probably the package reference, but would we need something else as well?). 2 - "at or after restore" == this is the part where we'll need to somewhere (obj? .vs? elsewhere?) store information that analyzers are wanted and where to find them. 3 - "for each analyzed build invocation" == this is the part where we need to be able to 'wake up analyzer' and if we detect that it's not woken up early enough, then give some meaningful error with good workaround.

KalleOlaviNiemitalo commented 7 months ago

If MSBuild is building multiple projects in the same invocation (via the MSBuild task, or a graph build ProjectReference), then is it desired that each project be able to have different analyzers?