dotnet / sdk

Core functionality needed to create .NET Core projects, that is shared between Visual Studio and CLI
https://dot.net/core
MIT License
2.7k stars 1.06k forks source link

Need better logging for the native SDK resolver #15149

Open KirillOsenkov opened 3 years ago

KirillOsenkov commented 3 years ago

I don't see a way to debug/investigate the decision process of the SDK resolver native .dll.

I've gotten as far as https://github.com/dotnet/sdk/blob/917b98c1388c3fd7cc8fa5d2ee7f22ba01575747/src/Resolvers/Microsoft.DotNet.SdkResolver/NETCoreSdkResolverNativeWrapper.cs#L21-L23

but from then on it's a black box and I don't see anything that would explain the resulting SDK that comes out of it. I want to know what factors influenced the decision:

The resolver should explain its choice and clearly communicate which factors made a difference and which didn't. Otherwise I have to resort to cargo culting it and so far unsuccessfully.

KirillOsenkov commented 3 years ago

As an example, https://github.com/dotnet/msbuild/issues/5999 is an issue that took me a whole day to figure out, and as a result of that investigation I have emerged with the knowledge that I should have set these three environment variables to influence the SDK resolution process:

PATH=C:\msbuild\.dotnet;%PATH%
DOTNET_INSTALL_DIR=C:\msbuild\.dotnet
DOTNET_MULTILEVEL_LOOKUP=0

Had there been a log entry like "choosing 3.1.100 because the DOTNET_INSTALL_DIR was set explicitly and pointed to a 3.1.100 SDK at C:\msbuild.dotnet" I could have binged an entire season of something wholesome today instead of debugging deep into the SDK resolution.

Currently the only logging I see is image

KirillOsenkov commented 3 years ago

Correction, I also had to set PATH=C:\msbuild\.dotnet;%PATH%

rainersigwald commented 3 years ago

I don't know if the logging interface currently exposed by MSBuild makes this possible--it's fairly minimal. We might need to augment that first before improving the dotnet resolver.

dsplaisted commented 3 years ago

@KirillOsenkov Are you saying you need more visibility into what the native hostfxr_resolve_sdk2 method is doing, or what the managed code calling it does with the result? (Or are you not sure?)

KirillOsenkov commented 3 years ago

The former, I want to know what decisions the native hostfxr_resolve_sdk2 did and why. Because it's impossible to debug, all I get is a decision, but why it chose that SDK is completely unclear. It should say "used this environment variable" or something.

KirillOsenkov commented 3 years ago

And then I guess it's a separate story to thread that output through the managed code so it ends up as an Evaluation message in the binlog

dsplaisted commented 3 years ago

I think we can try to add some better logging in the managed SDK resolvers first. Then if we can get that piped through, we can try to add additional logging to the native methods to include.

KirillOsenkov commented 3 years ago

It’s a good plan

KirillOsenkov commented 3 years ago

Folks, this issue is supremely painful. Here's the error that I get:

Error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found.

I'm literally left stranded and helpless. Every time I investigate this, I end up spending a few hours debugging deep into the SDK, but if I'm a regular user, I'm just absolutely helpless. We really need to invest into improving this experience. Please @marcpopMSFT @KathleenDollard @dsplaisted

marcpopMSFT commented 3 years ago

FYI, it's not visible in GH but we've put this in our .net 6 backlog pipeline. That puts it behind our .net 6 committed work but ahead of the other 1600 issues we have. There is no commitment when we'll do this but we'll likely start looking at that list closer to the end of .net 6 timeframe and maybe slot in some of the items listed before .net 6 ships.

KirillOsenkov commented 3 years ago

As long as it’s on the radar and being tracked, I’m happy :)

benvillalobos commented 2 years ago

This bit me while figuring out https://github.com/dotnet/sdk/pull/24512. Logging the path it's trying to check would be a good next step.

KirillOsenkov commented 1 year ago

Wasted a couple hours of my life on this again today.

dsplaisted commented 1 year ago

26904 should help with this. It doesn't directly surface messages from the native resolver, but you can see a lot of what is going into it and coming back out.

It's also possible to set environment variables to get logs from the native resolver: https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/default-probing#how-do-i-debug-the-probing-properties-construction

KirillOsenkov commented 1 year ago

It's great to have some developments here.

Having to set environment variables to obtain logs is a huge UX hurdle. Nobody will even know about these. Any reason these can't be on by default?

Also the last time I was looking, SdkLogger was not used for anything: https://github.com/dotnet/msbuild/issues/7988#issuecomment-1251782434

dsplaisted commented 1 year ago

I'd like to be able to include the messages from the native logger in the resolver failure logs. We'd need to update the native hostfxr code for that, and there might be a perf impact to always generating those messages.

For now we're hoping that the #26904 is a big step forward, and some time after that's gone in we can assess what further improvements we should make.

KirillOsenkov commented 1 year ago

Should I file a new issue to improve the error message for MSB4236?

Currently it says: error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found.

It doesn't say:

I can imagine a better text: error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found. global.json specifies '7.0.100' with rollForward: disable. The nearest installed SDK was 6.0.306. You can install the SDK from https://dot.net.

Also if possible consider grouping these and only show once instead of per-project (if building a solution). Chances are you will have one per project, so a hundred of these same errors for a hundred project solution.

Long term ideally there should be dotnet sdk restore that will automatically download and install the missing SDK. The error message could prompt then, or even have a mode where it does it automatically.

marcpopMSFT commented 1 year ago

@KirillOsenkov that error is for resolving the sdk that's at the top of a project file, not the .NET SDK. <Project Sdk="Microsoft.NET.Sdk">

Reading through the history here, I'm not sure if you're original issue was about finding the .NET SDK or resolving an sdk as later comments were more about the latter but your original description was about the former. The PR that Daniel linked will help with the latter.

KirillOsenkov commented 1 year ago

I didn’t realize they’re different! If I’m confused, imagine how our users will be confused. The error message needs to improve either way.

KirillOsenkov commented 7 months ago

@Forgind

KirillOsenkov commented 5 months ago

long sad story

KirillOsenkov commented 5 months ago

Got another difficult to diagnose issue today with this build failure:

myproject.csproj : error : Could not resolve SDK "Microsoft.NET.Sdk". Exactly one of the probing messages below indicates why we could not resolve the SDK. Investigate and resolve that message to correctly specify the SDK.
myproject.csproj : error :   The NuGetSdkResolver did not resolve this SDK because there was no version specified in the project or global.json.
myproject.csproj : error :   MSB4276: The default SDK resolver failed to resolve SDK "Microsoft.NET.Sdk" because directory "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Sdks\Microsoft.NET.Sdk\Sdk" did not exist.
myproject.csproj : error MSB4236: The SDK 'Microsoft.NET.Sdk' specified could not be found.

The error message incorrectly implies that the directory C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Sdks\Microsoft.NET.Sdk\Sdk should exist, whereas in reality the missing directory was likely C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Current\Bin\SdkResolvers\Microsoft.DotNet.MSBuildSdkResolver

KirillOsenkov commented 5 months ago

These should help:

set COREHOST_TRACE=1
set COREHOST_TRACE_VERBOSITY=4
set DOTNET_MSBUILD_SDK_RESOLVER_ENABLE_LOG=true