dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License
5.21k stars 1.35k forks source link

GeneratedInternalTypeHelper.g.cs Causes Rebuild Of Solution When Opening Visual Studio But Not in MSBuild #4217

Open aolszowka opened 5 years ago

aolszowka commented 5 years ago

Hi,

I am currently investigating why we get full builds on some of our larger solution files (500+ CSPROJ).

We are currently using Visual Studio 2017 15.9.6

Goals

Here is what we are trying to do:

  1. We want to be able to invoke MSBuild from the Command Line to build our solutions (this way we can script this to happen in the morning before a developer comes in).
  2. With NO changes we expect to be able to open the Solution File in Visual Studio and hit Build and have nothing build (since everything should be up to date).

In this way our Developers can be productive right when they get in instead of waiting on a build.

Unfortunately as it stands today Visual Studio feels like it needs to rebuild a significant amount of the code base.

Background

We have already encountered and corrected the following issues:

  1. The State Files attempt to write to paths that were too long. We would constantly get MSB3101: Could not write state file errors. We had these as relative paths but unbeknown to us apparently MSBuild does not "expand" this path prior to passing it off; this actually resulted in paths that were even LONGER than we expected. We worked around that by setting our IntermediateOutputPath to be explicitly expanded by MSBuild prior to passing it off. Like so:
<IntermediateOutputPath>$([System.IO.Path]::GetFullPath('$(MSBuildProjectDirectory)\..\..\..\..\..\..\bin\obj\$(ProjectGuid)'))</IntermediateOutputPath>

This seems to have resolved that issue and we no longer encounter the warnings and incremental builds seem to work. I have not seen any issues reported for this; but I did find this StackOverflow post which led me in the right direction: https://stackoverflow.com/questions/139964/msbuild-directory-structure-limit-workarounds/9635709#9635709 Its unclear if this is a known issue, an issue that will go away once we get Long Path Support, or an issue that should be reported as a new bug.

  1. We then encountered Issue #1648 wherein when Visual Studio would launch it would then create the 3 Temporary generated files in the \bin\obj\ folder even when we were not utilizing Workflow; this of course triggered a rebuild. We have worked around that by inserting into every single project a no-op like so:

    <Target Name="GenerateCompiledExpressionsTempFile">
    <!--This is a no-op to overwrite the existing target that ships with MSBuild. When we upgrade to Visual Studio 2019 (MSBuild 16.0) we can remove this. See https://github.com/Microsoft/msbuild/issues/1648-->
    </Target>
  2. We identified several locations in our code base where we had <CopyToOutputDirectory>Always</CopyToOutputDirectory> this seemed to trigger builds; we have converted all of these to <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory> which again seems to have reduced the copying. This seems to have been documented in several places on the internet; one such place is https://blogs.msdn.microsoft.com/kirillosenkov/2014/08/04/how-to-investigate-rebuilding-in-visual-studio-when-nothing-has-changed/.

Current Roadblock (This Report)

With these changes we have been able to drive down our solution rebuilds from 900 projects down to 140 Projects; all of which are Xaml Projects or have an N-Order Dependency on a shared library which contains Xaml code.

Using the diagnostic verbosity level we can see the following from within Visual Studio:

Project 'CU.TestClient.Core' is not up to date. Input file 's:\timssvn\6x\trunk\bin\obj\{5b44a2e5-81f9-4d9d-9857-006b76d62f8c}\generatedinternaltypehelper.g.cs' is modified after output file ''.

When we build via the Command Line in MSBuild in MSBuild we see that the file is indeed created; however the file is empty. When we run this build in diagnostic verbosity we see the following:

InternalTypeHelper class is not required for this project, make file 'S:\TimsSVN\6x\Trunk\bin\obj\{5B44A2E5-81F9-4D9D-9857-006B76D62F8C}\GeneratedInternalTypeHelper.g.cs' empty.

This traces back to this section of logic in MarkupCompilePass2.cs: https://referencesource.microsoft.com/#PresentationBuildTasks/BuildTasks/Microsoft/Build/Tasks/Windows/MarkupCompilePass2.cs,684

//
// If no any xaml file with local-types wants to reference an internal type from
// current assembly and friend assembly, and InternalTypeHelperFile is set in the
// cache file, now it is the time to remove the content of InternalTypeHelper File.
//
// We still keep the empty file to make other parts of the build system happy.
//
if (!String.IsNullOrEmpty(_internalTypeHelperFile) && !compilerWrapper.HasInternals)
{
    if (TaskFileService.Exists(_internalTypeHelperFile))
    {
        // Make empty content for this file.

        MemoryStream memStream = new MemoryStream();

        using (StreamWriter writer = new StreamWriter(memStream, new UTF8Encoding(false)))
        {
            writer.WriteLine(String.Empty);
            writer.Flush();
            TaskFileService.WriteFile(memStream.ToArray(), _internalTypeHelperFile);
        }

        Log.LogMessageFromResources(MessageImportance.Low, SRID.InternalTypeHelperNotRequired, _internalTypeHelperFile);
    }
}

Therefore the file is indeed empty. HOWEVER when the solution is loaded up in Visual Studio (on launch; not even hitting Build or in any way interacting with the IDE beyond opening) we see:

The Code within these files is:

//------------------------------------------------------------------------------
// <auto-generated>
//     This code was generated by a tool.
//     Runtime Version:4.0.30319.42000
//
//     Changes to this file may cause incorrect behavior and will be lost if
//     the code is regenerated.
// </auto-generated>
//------------------------------------------------------------------------------

namespace XamlGeneratedNamespace {

    /// <summary>
    /// GeneratedInternalTypeHelper
    /// </summary>
    [System.Diagnostics.DebuggerNonUserCodeAttribute()]
    [System.CodeDom.Compiler.GeneratedCodeAttribute("PresentationBuildTasks", "4.0.0.0")]
    [System.ComponentModel.EditorBrowsableAttribute(System.ComponentModel.EditorBrowsableState.Never)]
    public sealed class GeneratedInternalTypeHelper : System.Windows.Markup.InternalTypeHelper {

        /// <summary>
        /// CreateInstance
        /// </summary>
        protected override object CreateInstance(System.Type type, System.Globalization.CultureInfo culture) {
            return System.Activator.CreateInstance(type, ((System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic) 
                            | (System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.CreateInstance)), null, null, culture);
        }

        /// <summary>
        /// GetPropertyValue
        /// </summary>
        protected override object GetPropertyValue(System.Reflection.PropertyInfo propertyInfo, object target, System.Globalization.CultureInfo culture) {
            return propertyInfo.GetValue(target, System.Reflection.BindingFlags.Default, null, null, culture);
        }

        /// <summary>
        /// SetPropertyValue
        /// </summary>
        protected override void SetPropertyValue(System.Reflection.PropertyInfo propertyInfo, object target, object value, System.Globalization.CultureInfo culture) {
            propertyInfo.SetValue(target, value, System.Reflection.BindingFlags.Default, null, null, culture);
        }

        /// <summary>
        /// CreateDelegate
        /// </summary>
        protected override System.Delegate CreateDelegate(System.Type delegateType, object target, string handler) {
            return ((System.Delegate)(target.GetType().InvokeMember("_CreateDelegate", (System.Reflection.BindingFlags.InvokeMethod 
                            | (System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance)), null, target, new object[] {
                        delegateType,
                        handler}, null)));
        }

        /// <summary>
        /// AddEventHandler
        /// </summary>
        protected override void AddEventHandler(System.Reflection.EventInfo eventInfo, object target, System.Delegate handler) {
            eventInfo.AddEventHandler(target, handler);
        }
    }
}

Reading the source to MarkupCompilePass2 it appears that the g.i.cs file is generated for use by Intellisense and is not put into play with the build (as is expected) but for some reason Visual Studio felt that it needed to regenerate the g.cs file.

This is where I am stuck; I am not smart enough to understand the scenarios in which the MarkupCompilePass2 will require itself to build the g.cs file. I have tried to narrow down if it is a project or solution file causing the issue (to try and get a smaller repo) but I have been unsuccessful.

When I have attempted to reduce it down to just this project I am unable to get the same behavior; with smaller solutions I will often get this behavior:

In this cases the incremental build obviously works because g.cs was not touched.

Can anyone provide pointers as to where I can go to continue to debug this? I am currently stuck and unsure how to debug further.

Questions

I am dedicated to identifying this issue and am pretty resourceful if pointed in the right direction.

Thank you

rainersigwald commented 5 years ago

I've forwarded this internally to a couple of folks I hope can help.

  • What in the Visual Studio Startup Process is muxing the generatedinternaltypehelper files?

That's a Design-Time build, which uses MSBuild but can set various properties, call different targets, and set "host objects" that cause tasks to behave differently.

  • How can I debug that above process?

There's a document on debugging design-time builds in the dotnet/project-system repo. For these projects, I think you'll want to use the TRACEDESIGNTIME option, rather than the newer options.

aolszowka commented 5 years ago

@rainersigwald this log is very useful (using the TRACEDESIGNTIME option); however it appears that it is being invoked in Parallel; is it possible to have this invoked in sequence? Also is there a way to control the verbosity? I see it mentioned for the other options that it is respected from that dialog (and honestly it looks like it might be? but at very least the number of processors is not). I do not see the Project '{EXT}' is not up to date messages in this log; but searching though I do see CoreCompile get called several times with and output directory that would modify the remaining projects I believe.

Is there something I am missing?

rainersigwald commented 5 years ago

I expected it to follow the Tools -> Options -> Build and Run -> maximum number of parallel builds setting, but it sounds like you've tried that and it's not working. Maybe try running VS under start /AFFINITY restricted to one processor?

I do not see the Project '{EXT}' is not up to date messages in this log

Yes, this hooks up an MSBuild logger to the MSBuild invocations, but those messages are generated by the project system's "Fast up-to-date check" which is outside MSBuild and thus doesn't get logged to the same place (it should still be in the build output window).

aolszowka commented 5 years ago

@rainersigwald that did not seem to work either (neither the AFFINITY nor Maximum number of parallel builds).

I have been able to make progress though; I am able to get it to reproduce using a smaller project (but still ~400 projects). Using the MSBuild Structured Log Viewer I can tell that DesignTimeMarkupCompilation is getting ran on launch:

image

I think there is something slightly wrong in the logic for MarkupCompilePass1.cs; it would be extremely helpful if I could debug into this code and evalute its Execute Method (https://referencesource.microsoft.com/#PresentationBuildTasks/BuildTasks/Microsoft/Build/Tasks/Windows/MarkupCompilePass1.cs,108)

Is there a way I can hook into this and step through?

aolszowka commented 5 years ago

I have made further progress for anyone following along at home.

I feel there has to be an easier way to debug this because this is extremely complex; however I am starting to dig into this.

I am getting burned by devenv throwing a "Stack cookie instrumentation code detected a stack-based buffer overrun." we usually see these when bad things are about to happen. However due to the fact that I am debugging I am leery to waste engineering effort chasing ghosts. I do have a dump I am willing to share (we're setup to capture dumps via ProcDump) if someone cares enough to have me track on this. Here's a Gist with the stack trace: https://gist.github.com/aolszowka/5331073234dc13fe4374272a74b4f54b

aolszowka commented 5 years ago

Progress

In Visual Studio on launch in the Design Time Build MarkupCompilePass1 is called. Within it AnalyzeInputsAndSettings() is invoked (https://referencesource.microsoft.com/#PresentationBuildTasks/BuildTasks/Microsoft/Build/Tasks/Windows/MarkupCompilePass1.cs,1103) in this method _referencesCache is recalculated using CompilerState.GenerateCacheForFileList(ITaskItem[]) (https://referencesource.microsoft.com/#PresentationBuildTasks/BuildTasks/MS/Internal/Tasks/CompilerState.cs,211) this is where the first divergence occurs.

This is important because this information is calculated and compared to the cache file within the same AnalyzeInputsAndSettings() method by calling the IncrementialCompilerAnalyzer.AnalyzerInputFiles() https://referencesource.microsoft.com/#PresentationBuildTasks/BuildTasks/MS/Internal/Tasks/IncrementalCompileAnalyzer.cs,102

Here is the code in question:

        //
        // Generate cache string for item lists such as PageMarkup, References, 
        // ContentFiles and CodeFiles etc.
        //
        internal static string GenerateCacheForFileList(ITaskItem[] fileItemList)
        {
            string cacheString = String.Empty;

            if (fileItemList != null && fileItemList.Length > 0)
            {
                int iHashCode = 0;

                int iCount = fileItemList.Length;

                for (int i = 0; i < iCount; i++)
                {
                    iHashCode += fileItemList[i].ItemSpec.GetHashCode();
                }

                StringBuilder sb = new StringBuilder();

                sb.Append(iCount);
                sb.Append(iHashCode);

                cacheString = sb.ToString();
            }

            return cacheString;

        }

A quick reading of the code shows that you will overflow that int very quickly when suming the HashCodes; however because we are in an unchecked context this will simply truncate.

In MSBuild the result of this call (for our project inputs) is: 18275237762 In Visual Studio the result of this call (for our project inputs) is: 181871276592

So we know that at least MSBuild and Visual Studio are getting the same number of files (18; as this is the first part of the cache string) however they are generating different HashCodes for the Input Items.

Unfortunately I am unable to see what ItemSpec is as this is optimized:

image

I am at least able to see what the FIRST hash is reliably (since it is assigned to iHashCode which is available to me). It appears that the Visual Studio version is simply taking the Hash of the file path; however MSBuild gets a completely different Hash.

File VS ItemSpec.GetHashCode() MSB ItemSpec.GetHashCode()
S:\TimsSVN\6x\Trunk\Bin\Client\ComputersUnlimited.Fusion.PlatformProxies.dll 450276713 -1712272455

Question

Is there any other way to get the HashCode of this ItemSpec item from within MSBuild? It will help point out why Visual Studio's DesignTime Compile and MSBuild's compile differ.

aolszowka commented 5 years ago

So the Hashcode Mismatches are a BUST.

ItemSpec is indeed a string; the difference in GetHashCode() was because in running from the command line we use MSBuild64 in order to address more than 4gb of RAM per MSBuild Process. While the result is consistent from run to run on the same platform; changing platforms has no such guarantee https://stackoverflow.com/questions/4466132/string-gethashcode-returns-different-values this is a good lesson to remember when implementing a hashing mechanism to remember not to rely on implementation details for your own caching logic.

Getting past that we still end up building but for a different reason; this time it is a difference in the source code files https://referencesource.microsoft.com/#PresentationBuildTasks/BuildTasks/Microsoft/Build/Tasks/Windows/MarkupCompilePass1.cs,1109

When ran via MSBuild this list (for our inputs) produces 75 items; however when ran via Visual Studio this list produces 76 items.

The difference in Visual Studio is that this item is getting added:

C:\Users\aceo\AppData\Local\Temp\2\.NETFramework,Version=v4.7.2.AssemblyAttributes.cs

I am continuing to track.

aolszowka commented 5 years ago

The AssemblyAttributes.cs is being generated here https://github.com/Microsoft/msbuild/blob/a972ec96c3920705e4e8d03d7ac8b6c3328450bd/src/Tasks/Microsoft.Common.CurrentVersion.targets#L3321-L3326

When this project is built via MSBuild the call to MarkupCompilePass1 happens long before GenerateTargetFrameworkMonikerAttribute is called:

image

When this project is invoked in Visual Studio the GenerateTargetFrameworkMonikerAttribute is invoked which adds this file to the list of items to compile. Because this list no longer matches with the cache created by MarkupCompilePass1 within MSBuild a rebuild occurs.

image

@rainersigwald Who should own this? Is this MSBuild or is this Roslyn?

I am trying to find a way around this short of modifying the Microsoft.Common.CurrentVersion.targets for all our developers.

aolszowka commented 5 years ago

For anyone following along at home; this work around works; but is required to be inserted into all WPF Projects:

  <PropertyGroup>
    <!--This is a hack to try and fix https://github.com/Microsoft/msbuild/issues/4217-->
    <MarkupCompilePass1DependsOn>$(MarkupCompilePass1DependsOn);GenerateTargetFrameworkMonikerAttribute</MarkupCompilePass1DependsOn>
  </PropertyGroup>
danmoseley commented 5 years ago

Kudos for that investigation @aolszowka!

KirillOsenkov commented 4 years ago

I am investigating incremental builds of WPF projects and hitting this issue too. In my case the GeneratedInternalTypeHelper.g.cs is sometimes cleared by MarkupCompilePass2: InternalTypeHelper class is not required for this project, make file 'C:\XXX\obj\Debug\YYY\net472\GeneratedInternalTypeHelper.g.cs' empty.

And sometimes it isn't cleared and has contents (when coming from MarkupCompilePass1).

To me the logic that clears the file is a bug - we should be always generating the type even if it isn't used, to avoid flip-flopping and breaking incrementality. I still don't understand why it decides that the file is not needed sometimes, but not always.

@aolszowka amazing and deep investigation, great job! To debug tasks in the MSBuild Structured Log Viewer, right-click a task (such as MarkupCompilePass1) and click Debug. It will start a TaskRunner.exe process and run the task inside that process with the same arguments from the binlog. It will display the JIT Debugger dialog to give you a chance to attach before the task runs.

@vatsan-madhavan as people are tuning their builds and WPF is resurging in popularity, it would be nice to get to the bottom of this at some point. Who would be the best person most knowledgeable about MarkupCompilePass1 and 2?

KirillOsenkov commented 4 years ago

@aolszowka also set the COMPLUS_ZapDisable=1 environment variable before you debug, to disable optimizations. Note that you need to start the debuggee under debugger and with this variable set, attaching to a running process that wasn't started under debugger won't disable the optimizations.

aolszowka commented 4 years ago

@KirillOsenkov Thank you for the tips, and thank you for MSBuild Structured Log Viewer, that tool is invaluable to us. We've noticed that on our larger builds (~3500 projects) the tooling seems to choke (launching is difficult, filtering just about impossible), but I have not taken the time to get a good bug report written up yet. Keep up the good work!

Make sure you're not getting burned by the other WPF bugs I reported as well; this is our current work arounds:

  <Target Name="GenerateCompiledExpressionsTempFile">
    <!--This is a no-op to overwrite the existing target that ships with MSBuild. When we upgrade to Visual Studio 2019 (MSBuild 16.0) we can remove this. See https://github.com/Microsoft/msbuild/issues/1648-->
  </Target>
  <PropertyGroup>
    <!--This is a hack to try and fix https://github.com/Microsoft/msbuild/issues/4217-->
    <MarkupCompilePass1DependsOn>$(MarkupCompilePass1DependsOn);GenerateTargetFrameworkMonikerAttribute</MarkupCompilePass1DependsOn>
  </PropertyGroup>

I believe I have the minimal test case here https://github.com/microsoft/msbuild/issues/4228 that was closed in lieu of a VS Feedback issue that has received zero visible traction; we are still willing to assist as required.

KirillOsenkov commented 4 years ago

Cool, I've filed https://github.com/dotnet/wpf/issues/2281 and also asked Livar to reactivate the other bug where you have the repro.

Also for incrementality, have you played with ProduceReferenceAssembly property? Setting it to true will generate a reference assembly into bin\Debug\ref for each project, and other projects would reference the binary from the ref folder. This is super useful when you just change a method body or fix whitespace or make any other edit that doesn't change the public API of the assembly. For such non-invasive edits the dependent projects will not rebuild.

The bug we're having here breaks that, since it keeps adding and removing the GeneratedInternalTypeHelper type to the compilation, thus invalidating the ref assemblies.

Binlog viewer does unfortunately have scaling issues and they are not easy to address. MSBuild just logs too much. There are bugs that we could fix to alleviate the problem, such as https://github.com/microsoft/msbuild/issues/4956. Or I could invest in some sophisticated indexing algorithms to speed up search and filtering. Unfortunately I doubt we can do anything about the initial speed of reading a large binlog - there's just too much data there, and I can't see a way to make some of it lazy/on-demand/deferred.