dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License
5.24k stars 1.35k forks source link

Use Span-based path manipulation #6977

Open ladipro opened 3 years ago

ladipro commented 3 years ago

MSBuild still uses the old string-based path manipulation almost everywhere, which leads to avoidable allocations in cases where the result of e.g. Path.GetFileName() is not stored on the heap.

Here's one example use of Path.GetFileName() which could be easily converted to Span:

https://github.com/dotnet/msbuild/blob/356825cf62bb36ebd215572c8b7e7eabc88ca7fc/src/Build/BackEnd/Shared/BuildRequestConfiguration.cs#L293

but there are many more, for other Path APIs as well as for helpers implemented in the MSBuild codebase. Some of them are trivial to fix, some will require deeper changes.

Note that the new Span-based public surface is available in the Microsoft.IO.Redist package on .NET Framework under the namespace Microsoft.IO instead of System.IO.

stanciuadrian commented 3 years ago

Just to make sure I understand the change:

#if FEATURE_MSIOREDIST
   if (MemoryExtensions.Equals(Microsoft.IO.Path.GetFileName(ProjectFullPath.AsSpan()), "dirs.proj".AsSpan(), StringComparison.OrdinalIgnoreCase))
#else
   if (MemoryExtensions.Equals(Path.GetFileName(ProjectFullPath.AsSpan()), "dirs.proj", StringComparison.OrdinalIgnoreCase))
#endif
ladipro commented 3 years ago

Yes, this is the desired change. It would be interesting to see if we can switch entire source files to Microsoft.IO to reduce the number of ifdefs:

#if FEATURE_MSIOREDIST
using Path = Microsoft.IO.Path;
#else
using Path = System.IO.Path;
#endif
stanciuadrian commented 3 years ago

And I guess there is no AsSpan() for NETSTANDARD2_0.

It looks to me that a single PR would be way too large. What do you think about splitting the work into chunks of ~10 fixes that would be easier to review/approve/merge?

ladipro commented 3 years ago

Eventually we'd like to upgrade to 2.1, so there is hope (#6148). Until then NETSTANDARD2_0 has to be handled specially.

Splitting the work to multiple PRs is perfectly fine and, as you wrote, would make it easier to handle. Thank you!

Forgind commented 3 years ago

I'm wondering if it would be reasonable to make some helper method that takes, say, two strings and two functions and returns if they're equal after calling the functions on each of them. The idea then would be that if we later come up with something more efficient than Spanifying everything (hard to imagine in this case), it would be trivial to fix it everywhere at once.

stanciuadrian commented 3 years ago

Unfortunately there are a lot of scenarios.

I agree, it's a better choice in the long term. Thanks for the early feedback.

    static class IoUtil
    {
#if NETSTANDARD2_0
        public static bool Equals(string s1, string s2, Func<string, string> f1, Func<string, string> f2, StringComparison sc)
            => string.Equals(f1(s1), f2(s2), sc);
#else
        public delegate ReadOnlySpan<char> IoSpanFunc(ReadOnlySpan<char> input);

        public static bool Equals(string s1, string s2, IoSpanFunc f1, IoSpanFunc f2, StringComparison sc)
            => MemoryExtensions.Equals(f1(s1.AsSpan()), f2(s2.AsSpan()), sc);
#endif
    }

// called below
if (IoUtil.Equals(
    NodeProviderOutOfProcTaskHost.TaskHostNameForClr2TaskHost,
    msbuildLocation,
    Path.GetFileNameWithoutExtension,
    Path.GetFileNameWithoutExtension,
    StringComparison.OrdinalIgnoreCase
    ))

Do you think StringTools is a good host for this class?

Forgind commented 3 years ago

With that structure, you would either need to be able to switch between them at the call site as well or ensure that f1 and f2 work as either Func<string, string> or IOSpanFunc. (The latter is probably a good assumption but makes it a little less flexible.) I was thinking something like:

public static bool Equals(string s1, string s2, Func f1, Func f2, StringComparison sc)
{
#if NETSTANDARD2_0
    // Verify function parameter then Equals call with string
#else
    // Verify function parameter then Equals call with span
#endif
}

Maximum flexibility, though you can't demand compile-time precision with f1 and f2.

I believe StringTools is for things that we might want to share with others, so that sounds good to me. I'd let ladipro comment on the whole idea first, though; he normally finds any holes in my ideas 🙂

ladipro commented 3 years ago

Below I've attempted to enumerate a few aspects to help assess the proposal:

Based on this I would personally vote for not doing this and make the changes without introducing new helpers. Or at least without helpers taking callbacks. To handle the NETSTANDARD2_0 case, another options is implementing a simple version of AsSpan, MemoryExtensions.Equals, etc. for this target to make the code compile. Basically roll out a trivial implementation of Span.

stanciuadrian commented 3 years ago

This is from a commit I reverted due to failing tests (long story, we'll get back to it). It was my initial proposal to deal with NETSTANDARD2_0.

I could find the string.AsSpan() extension method but Path.GetFileName is missing the System.ReadOnlySpan<char> overload. Path is not partial and I couldn't get it to compile. Do you have an idea on how to fix it?

Also, is comparing System.ReadOnlySpan<char> lengths the same as comparing string lengths?

ladipro commented 3 years ago

I see, so if your intent is to implement the missing Path.GetFileName overload I would do something like this. Let me know if it's reasonable:

#if NETSTANDARD2_0
using Path = Microsoft.Build.Shared.CompatPath;
#elif FEATURE_MSIOREDIST
using Path = Microsoft.IO.Path;
#else
using Path = System.IO.Path;
#endif

Where CompatPath would have the missing implementation. It would also have to declare all other Path methods used by the code and forward them to Path.

Also, is comparing System.ReadOnlySpan<char> lengths the same as comparing string lengths?

Yes, pathToTool.AsSpan().Length is the same as pathToTool.Length.

Forgind commented 3 years ago

Path isn't sealed, right? So to keep it a little simpler, CompatPath could directly extend Path.

ladipro commented 3 years ago

It is declared as static in C# which translates to abstract sealed in IL so extending is not allowed unfortunately.