microsoft / MSBuildLocator

An API to locate MSBuild assemblies from an installed Visual Studio location. Use this to ensure that calling the MSBuild API will use the same toolset that a build from Visual Studio or msbuild.exe would.
Other
212 stars 83 forks source link

Add MSBuildLocator.RegisterLatest method #184

Closed yaakov-h closed 4 months ago

yaakov-h commented 1 year ago

MSBuildLocator.RegisterDefault() is recommended by the Microsoft documentation but is problematic as it simply takes the FirstOrDefault() instance.

Exactly which instance is the FirstOrDefault() is not particularly clear to developers when there is more than once instance of Visual Studio installed on the computer.

Instances appear to be returned by the Visual Studio Setup API in alphabetical order, which is the order that Windows returns when enumerating the C:\ProgramData\Microsoft\VisualStudio\Packages\_Instances directory.

Therefore, the instance registered by RegisterDefault() is not neccesarily the oldest, nor the newest, nor the one that was installed first, nor the one that was installed most recently. Unless there is some determinism to the instance ID that I am not aware of, it appears to be randomly generated and thus entirely down to luck.

At my company we have a large number of machines with both VS2019 and VS2022, and can get inconsistent results across different identically-configured machines which can cause crashes and errors that are particularly difficult to diagnose. As a workaround, in some of our tools we have been manually registering the instance sorted by version number, to great success.

In this PR I have added a new method, RegisterLatest(), which enumerates all of the instances and registers the one with the highest version number. This will provide a far greater level of determinism and consistency, and gives the user some level of control over which version is actually selected.

ghost commented 1 year ago

CLA assistant check
All CLA requirements met.

rainersigwald commented 1 year ago

MSBuildLocator.RegisterDefault() is recommended by the Microsoft documentation

Where? We should fix that immediately.


I wonder if we should consider changing RegisterDefault to do this, instead.

I regret ever making RegisterDefault. I would like to remove it in an upcoming release--it does a very poor job of hiding the complexity of selecting an instance, and I think it'd be better to push that directly to the Locator-using developer. I don't think RegisterLatest really helps much on that front, because it doesn't account for workloads or possible version incompatibilities.

yaakov-h commented 1 year ago

Where? We should fix that immediately.

In the code blocks here: https://learn.microsoft.com/en-us/visualstudio/msbuild/updating-an-existing-application?view=vs-2022#register-instance-before-calling-msbuild

It also states (bold emphasis mine):

If you would like finer-grained control over the loading of MSBuild, you can select a result of MSBuildLocator.QueryVisualStudioInstances() to pass to MSBuildLocator.RegisterInstance() manually, but this is generally not needed.

Additionally, the commits linked from here as sample code show the use of RegisterDefault.


I wouldn't mind changing or removing RegisterDefault tbh. That would solve #81 too, and if a new interface is added that exposes the underlying VS components/workloads available that could also solve #22/#109.

jzabroski commented 1 year ago

@rainersigwald I created a documentation issue for your "fix immediately" suggestion.

In the mean time, I am going top submit a fix for this behavior, which I expect to supercede @yaakov-h PR, since this awful default behavior wasted my Saturday morning debugging docfx.console spurious behavior

MSBuild 18 should ship with this behavior by default. If .NET 7 doesn't ship with this behavior, it is a mistake, in my eyes.

jzabroski commented 1 year ago

Some thoughts:

  1. What does "latest" mean?
    1. In the nuget world, there is latest including prereleases, and there is latest stable. I dont think MSBuildLocator has that meaning.
    2. Sam Harwell and Jason Malinowski likely prefer latest prerelease for Roslyn, for example;
    3. Whereas most users of docfx.console actually probably prefer latest stable;
    4. Yet others probably just prefer a pinned version (which would be very hard to support due to packages continually shifting from out-of-box on nuget.org to part of framework core libs and the general lack of process isolation between MSBuild and MSBuild Tasks)
  2. The current code is hard to read because it uses a lot of compile-time symbols to determine behavior. see below for examples
    1. There appears to be some implicit depending between symbols NET46 and FEATURE_VISUALSTUDIOSETUP but it's not clear what that dependency is or how it can be "constructed" in a real life OS / framework installation

https://github.com/microsoft/MSBuildLocator/blob/master/src/MSBuildLocator/MSBuildLocator.cs#L354-L371

https://github.com/microsoft/MSBuildLocator/blob/master/src/MSBuildLocator/VisualStudioInstanceQueryOptions.cs#L18-L24

Forgind commented 1 year ago

Some thoughts:

  1. What does "latest" mean?

    1. In the nuget world, there is latest including prereleases, and there is latest stable. I dont think MSBuildLocator has that meaning.
    2. Sam Harwell and Jason Malinowski likely prefer latest prerelease for Roslyn, for example;
    3. Whereas most users of docfx.console actually probably prefer latest stable;
    4. Yet others probably just prefer a pinned version (which would be very hard to support due to packages continually shifting from out-of-box on nuget.org to part of framework core libs and the general lack of process isolation between MSBuild and MSBuild Tasks)

I think this really gets to the heart of our problem with this. Most people assume the "default" thing should do something sensible, and only advanced users would need something more complicated, but what the default should be (and even what "latest" means) are both up for debate with valid arguments on both sides. There are even more complicated ways we could try to assume default. Perhaps it should be the most recently updated VS, since that's the one the user most likely cares about. Perhaps it should be the "latest" VS enterprise, only falling back to one of the others if no enterprise is present. Perhaps it should be the be the VS most recently opened. There are valid arguments in favor of any of these—admittedly some stronger than others—and "default" just doesn't disambiguate. I would argue that a change like the one Jason recommended would be more rational, but it would also break some people, and we try to avoid that. Ultimately, if we can remove RegisterDefault entirely and force people to choose the MSBuild they want, we can avoid unfortunately scenarios with MSBuildLocator finding an unexpected MSBuild.

I, at least, am not currently brave enough to pull the trigger on that, even with intermediate steps like deprecating it first. But I do see other changes here as potentially wasted work if we end up throwing out RegisterDefault entirely.

jzabroski commented 1 year ago

There are valid arguments in favor of any of these—admittedly some stronger than others—and "default" just doesn't disambiguate.

Can we start by simply documenting how we think the logic works today? You can glean from Jason and Sam's comments that it was too complex for them to waste time trying to understand, and they literally claimed they did not understand it.

For example, I don't even understand what "Dev Console" is supposed to mean. Someone who is a long tenured MS emlpoyee like @rainersigwald may know for sure.

For example, the %VSINSTALLDIR% environment variable apparently is no longer used in Visual Studio 2022. https://github.com/MicrosoftDocs/visualstudio-docs/issues/7774

Do you guys get the point I am drawing about what a giant rat's nest of assumptions this is? The way I would start to untangle this is figure out which versions of Visual Studio even support some of these CMD variables:

  1. %VSINSTALLDIR%
  2. %VSCMD_VER%
  3. %VisualStudioVersion%

And then explain this comment, because I don't understand why this PR even exists if this comment accurately describes what should happen:

https://github.com/microsoft/MSBuildLocator/blob/850d27b4ff8253d017ae4968154febc33c1db529/src/MSBuildLocator/DotNetSdkLocationHelper.cs#L145-L148

rainersigwald commented 1 year ago

For example, I don't even understand what "Dev Console" is supposed to mean. Someone who is a long tenured MS emlpoyee like @rainersigwald may know for sure.

"Dev Console" is a synonym for "Developer Command Prompt"

For example, the %VSINSTALLDIR% environment variable apparently is no longer used in Visual Studio 2022. MicrosoftDocs/visualstudio-docs#7774

This isn't true:

**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.3.6
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************

C:\Program Files\Microsoft Visual Studio\2022\Enterprise>set vs
VS170COMNTOOLS=C:\Program Files\Microsoft Visual Studio\2022\Enterprise\Common7\Tools\
VSCMD_ARG_app_plat=Desktop
VSCMD_ARG_HOST_ARCH=x86
VSCMD_ARG_TGT_ARCH=x86
VSCMD_VER=17.3.6
VSINSTALLDIR=C:\Program Files\Microsoft Visual Studio\2022\Enterprise\
rainersigwald commented 1 year ago

And then explain this comment, because I don't understand why this PR even exists if this comment accurately describes what should happen:

That code only applies when resolving MSBuild from a .NET SDK, not from a Visual Studio installation.

jzabroski commented 1 year ago

That code only applies when resolving MSBuild from a .NET SDK, not from a Visual Studio installation.

My point was that the behavior should be consistent across MSBuildLocator.

nuget.exe does not resolve packages differently whether I use ProGet, MyGet or nuget.org.

rainersigwald commented 1 year ago

My point was that the behavior should be consistent across MSBuildLocator.

This is not obvious to me. Please elaborate: why should we break detection of Visual Studio within a specific VS installation's command prompt for consistency with what dotnet build would have done?

jzabroski commented 1 year ago

Please elaborate: why should we break detection of Visual Studio within a specific VS installation's command prompt for consistency with what dotnet build would have done?

Good point. So, in other words, in the use cases I laid out above, I ignored the fact that the primary users of RegisterDefaults is actually Visual Studio itself as a use case.

Visual Studio itself as a (primary) use case was not on my mind when I was thinking through when this problem hits me, since it normally hits me from a build pipeline (like docfx.console or some other tool) or command line shell (where I am prototyping what to put into a build pipeline).

My main thought process was to guide towards a more deterministic version of MSBuildLocator, with a clear specification for how it should behave. I thought having a nuget.exe-style approach to pulling in references by lowest version specified would be optimal. Lowest known version avoids zero day exploits and possible unintended regressions, in addition to promoting deterministic behavior. However, in building my analogy, I overlooked that in the case of Visual Studio calling RegisterDefaults() itself, it is acting like a user customizing their nuget package versions with a particular valid range.

yaakov-h commented 1 year ago

This is probably not the right thread to have that discussion, but MSBuildLocator's internal behaviour varies wildly depending on which assembly you use.

The .NET Framework build looks for Visual Studio or environment variables set by the Developer Console in order to load a .NET Framework version of MSBuild out of Visual Studio.

The .NET Core build looks for the .NET SDK in order to load a .NET Core version of MSBuild out of the SDK.

This is indicated here and a handful of other places throughout: https://github.com/microsoft/MSBuildLocator/blob/d83904bff187ce8245f430b93e8b5fbfefb6beef/src/MSBuildLocator/VisualStudioInstanceQueryOptions.cs#L19-L24

It's possible that the .NET Core code is more deterministic as you've shown above, but the Visual Studio locator code is as described in the OP.

Potentially this issue could also be fixed by making VisualStudioLocationHelper.GetInstances perform a Sort() before returning the compiled list of Visual Studio instances.

jzabroski commented 1 year ago

It's possible that the .NET Core code is more deterministic as you've shown above, but the Visual Studio locator code is as described in the OP. -- @yaakov-h

It's a good point. I think you actually hit the nail on the head in your first post:

Unless there is some determinism to the instance ID that I am not aware of, it appears to be randomly generated and thus entirely down to luck. -- @yaakov-h

  1. RegisterDefaults should be removed ASAP. It is random.
  2. We need a spec for:
    1. How to find Visual Studio; as of Visual Studio 2017 or later, vswhere.exe provides a nice json format and addresses part of the stable vs. prerelease issue I describe; see https://learn.microsoft.com/en-us/visualstudio/install/tools-for-managing-visual-studio-instances?view=vs-2022 for inspiration
    2. Selecting a Visual Studio flavor to determine which MSBuild to use if multiple installs of Visual Studio are present - this should be fairly straightforward, although @Forgind did a nice job explaining that Enterprise may be preferred to Community when both are installed, etc.

BTW, @rainersigwald @Forgind - this type of issue would likely have been caught eventually via docker headless vs installer integration tests, as each CI run would/should generate different Instance IDs in the Windows Registry.

jzabroski commented 1 year ago

Potentially this issue could also be fixed by making VisualStudioLocationHelper.GetInstances perform a Sort() before returning the compiled list of Visual Studio instances.

You would still need to explain how the Sort should behave.

yaakov-h commented 1 year ago

How to find Visual Studio; as of Visual Studio 2017 or later, vswhere.exe provides a nice json format and addresses part of the stable vs. prerelease issue I describe...

vswhere is simply a tool that the caller can pass a version range or prerelease options or a required components list to or so on. However, on the very page you've listed, it also lists:

The Setup Configuration API provides interfaces for developers who want to build their own utilities for interrogating Visual Studio instances.

This is the API that MSBuildLocator uses, and vswhere uses this internally as well. Chances are vswhere is also susceptible to the same alphabetical-by-instance-ID problem.

However, it is still up to the thing calling the API (i.e. MSBuildLocator, or we can hand that responsibility off to MSBuildLocator's consumer) to filter and sort for the instance you want, and that goes whether an application goes through the COM API or whether it uses vswhere, the same problem still applies.

You would still need to explain how the Sort should behave.

Same thing I've done in this PR and mentioned in the OP - take the latest by version number. If you use a Preview then you get to uncover any bugs or compatibility issues with the Preview, if you use Stable then hopefully it will Just Work. If you want to make your own decisions, don't use RegisterDefaults or RegisterLatest, and instead pick one yourself, as I've had to do for the last 5 years.

Or, as @rainersigwald suggested, remove both APIs and force consumers to make their own decision each and every time. In which case I'd be tempted to publish an opinionated SimpleMSBuildLocator package to wrap it 😄

jzabroski commented 1 year ago

The Setup Configuration API provides interfaces for developers who want to build their own utilities for interrogating Visual Studio instances.

This is the API that MSBuildLocator uses, and vswhere uses this internally as well. Chances are vswhere is also susceptible to the same alphabetical-by-instance-ID problem.

I did not know that; I did not realize the API provides isPrerelease, which vswhere.exe provides. How do I get that flag?

Or, as @rainersigwald suggested, remove both APIs and force consumers to make their own decision each and every time. In which case I'd be tempted to publish an opinionated SimpleMSBuildLocator package to wrap it 😄

Cheeky, but does not address my observation about IsPrerelease. Ideally SimpleMSBuildLocator would expose such flags.

yaakov-h commented 1 year ago

@jzabroski The flag is exposed by ISetupInstanceCatalog.IsPrerelease(), which I use as:

ISetupInstance2 i2 = /* I assume you already have your instance from the API */;

if (i2 is ISetupInstanceCatalog catalog && catalog.IsPrerelease())
{
    // pre-release build
}
jzabroski commented 1 year ago

MSBuildLocator exposes public class VisualStudioInstance; IsPrerelease should be a property on that MSBuildLocator class. FYI - The MSBuildLocator logic itself notes that some of this logic is copy-pasted from MSBuild, and it links to a Microsoft code sample URL that no longer exists. It also says the logic is "non-obvious" (although, to be honest, I don't think it is that bad - it's nasty, but stepping through it, you can figure out what it does).

YuliiaKovalova commented 4 months ago

This PR seems to be outdated. Feel free to reopen it if you still need the suggested changes. Thank you!