Add String.Format overloads to avoid unnecessary allocations

justinvp commented 9 years ago

It's pretty common to pass value types to String.Format, unfortunately this results in unnecessary boxing allocations.

Rationale

Consider the following:

long id = ...;
int index = ...;
int length = ...;

// Traditional call to String.Format
string foo = string.Format("{0}: Index = {1}, Length = {2}", id, index, length);

// Use of the string interpolation language feature
string bar = $"{id}: Index = {index}, Length = {length}";

Both the traditional call to String.Format and the use of the string interpolation language feature (which is just syntactic sugar for String.Format) requires 3 boxing allocations.

Proposed API

namespace System
{
    public sealed class String : ...
    {
        // Proposed methods
        public static string Format<T0>(string format, T0 arg0);
        public static string Format<T0, T1>(string format, T0 arg0, T1 arg1);
        public static string Format<T0, T1, T2>(string format, T0 arg0, T1 arg1, T2 arg2);
        public static string Format<T0>(IFormatProvider provider, string format, T0 arg0);
        public static string Format<T0, T1>(IFormatProvider provider, string format, T0 arg0, T1 arg1);
        public static string Format<T0, T1, T2>(IFormatProvider provider, string format, T0 arg0, T1 arg1, T2 arg2);

        // Existing methods
        public static string Format(string format, object arg0);
        public static string Format(string format, object arg0, object arg1);
        public static string Format(string format, object arg0, object arg1, object arg2);
        public static string Format(string format, params object[] args);
        public static string Format(IFormatProvider provider, string format, object arg0);
        public static string Format(IFormatProvider provider, string format, object arg0, object arg1);
        public static string Format(IFormatProvider provider, string format, object arg0, object arg1, object arg2);
        public static string Format(IFormatProvider provider, string format, params object[] args);

        ...
    }
}

namespace System.Text
{
    public sealed class StringBuilder
    {
        // Proposed methods
        public StringBuilder AppendFormat<T0>(string format, T0 arg0);
        public StringBuilder AppendFormat<T0, T1>(string format, T0 arg0, T1 arg1);
        public StringBuilder AppendFormat<T0, T1, T2>(string format, T0 arg0, T1 arg1, T2 arg2);
        public StringBuilder AppendFormat<T0>(IFormatProvider provider, string format, T0 arg0);
        public StringBuilder AppendFormat<T0, T1>(IFormatProvider provider, string format, T0 arg0, T1 arg1);
        public StringBuilder AppendFormat<T0, T1, T2>(IFormatProvider provider, string format, T0 arg0, T1 arg1, T2 arg2);

        // Existing methods
        public StringBuilder AppendFormat(string format, object arg0);
        public StringBuilder AppendFormat(string format, object arg0, object arg1);
        public StringBuilder AppendFormat(string format, object arg0, object arg1, object arg2);
        public StringBuilder AppendFormat(string format, params object[] args);
        public StringBuilder AppendFormat(IFormatProvider provider, string format, object arg0);
        public StringBuilder AppendFormat(IFormatProvider provider, string format, object arg0, object arg1);
        public StringBuilder AppendFormat(IFormatProvider provider, string format, object arg0, object arg1, object arg2);
        public StringBuilder AppendFormat(IFormatProvider provider, string format, params object[] args);

        ...
    }
}

namespace System.Runtime.CompilerServices
{
    public static class FormattableStringFactory
    {
        // Proposed methods
        public static FormattableString Create<T0>(string format, T0 argument0);
        public static FormattableString Create<T0, T1>(string format, T0 argument0, T1 argument1);
        public static FormattableString Create<T0, T1, T2>(string format, T0 argument0, T1 argument1, T2 argument2);

        // Existing method
        public static FormattableString Create(string format, params object[] arguments);
    }
}

I'd be happy to contribute an implementation and tests.

ellismg commented 9 years ago

Do the additional FormattableStringFactory members require Roslyn changes to be consumed with the nice syntax?

svick commented 9 years ago

I think the parameters in the new overloads should have exactly the same names as in the old overloads, so that if someone called string.Format() with named parameters, their code also starts using the new overloads just by recompiling.

justinvp commented 9 years ago

Do the additional FormattableStringFactory members require Roslyn changes to be consumed with the nice syntax?

I need to double-check, but I'm pretty sure Roslyn is already designed to make use of such overloads to String.Format and FormattableStringFactory.Create, if they exist. So, no Roslyn changes would be required.

justinvp commented 9 years ago

I think the parameters in the new overloads should have exactly the same names as in the old overloads

The 0-based parameter names seem a little strange, but now that you mention keeping the naming the same as the old overloads, the format items in the format string are 0-based, so it makes sense that the parameter names match that. In that case, I'm guessing we'd want the naming of the generic type parameters to match the naming of the arguments:

So instead of this:

public static string Format<T>(string format, T arg);
public static string Format<T1, T2>(string format, T1 arg1, T2 arg2);
public static string Format<T1, T2, T3>(string format, T1 arg1, T2 arg2, T3 arg3);

This:

public static string Format<T0>(string format, T0 arg0);
public static string Format<T0, T1>(string format, T0 arg0, T1 arg1);
public static string Format<T0, T1, T2>(string format, T0 arg0, T1 arg1, T2 arg2);

Or maybe even TArgument0, TArgument1, and TArgument2 vs. T0, T1, and T2.

weshaggard commented 9 years ago

We did talk about the generic overloads on FormattableStringFactory as part of the review but we didn't believe we had enough runway to get the compiler to actually use those in this product release. (cc @gafter).

As for the string.Format overloads we did end up doing some work this release to make the arg0/arg1/arg2 overloads actually avoid the params array allocation internally (yeah until recently we didn't actually even optimize that even though we had the overloads). While we did do that one thing we realized is that string.Format is extremely bad about allocations internally and we don't believe that even eliminating the params array allocation ended up improving performance as much as we would have liked it to. I'm primarily saying that because I have to wonder if a few boxing allocations is really going to have much of an impact when what is really needed is a much larger redoing of the underlying formatting code (which I'm not sure is even possible in a compatible way). Do you have any data about whether or not eliminating these boxing allocations will end up helping much? By the way this doesn't mean these aren't worth adding I'm just trying to collect some more data.

gafter commented 9 years ago

The C# and VB compilers will use appropriate overloads on FormattableStringFactory if those exist. What we didn't have enough runway for was writing those overloads.

pharring commented 9 years ago

My experience is that the inner workings of string.Format (actually StringBuilder.AppendFormat) is quite efficient. The allocations are what you'd expect. Most of the time we get a hit in StringBuilderCache, so there's no char[] allocation. Allocations in this area usually show up in serialization and logging routines. Serialization usually arrives via TextWriter.Write (which calls string.Format). Removing the params array in 4.6 was nice. Roslyn analyzers can help locate and eliminate boxing.

It would be interesting to see if one could implement this, avoiding boxing but at the same time avoiding stamping out 3 new generic versions of StringBuilder.AppendFormat.

mattwarren commented 9 years ago

Roslyn analyzers can help locate and eliminate boxing.

There's already an analyser that does this. The tricky part is keeping it up to date with the changes, for instance it currently doesn't handle the recent changes that got rid of unnecessary boxing in String.Concat. Probably the same applies for the change that removed params array allocations, that has been mentioned elsewhere in this thread.

The robust way to do this would be for it to look at the IL produced, but that's going to be expensive for an Analyser (although I did give it a go and Roslyn combined with Cecil makes it nice and easy).

justinvp commented 9 years ago

Do you have any data about whether or not eliminating these boxing allocations will end up helping much?

@weshaggard, I have some data in the conclusion below, but first some background...

Consider the following example:

DateTime now = DateTime.UtcNow;
Guid id = Guid.NewGuid();
decimal value = 3.50m;

string result = string.Format("{0:s} - {1:B}: The value is: {2:C2}", now, id, value);

Let's track of the allocations with the current implementation:

The now parameter is boxed.
The id parameter is boxed.
The value parameter is boxed.
A new StringBuilder is allocated to build-up the "s" (in "{0:s}") format string.
The actual "s" string is allocated.
The resulting string for now.ToString("s", null) is allocated.
A new StringBuilder is allocated to build-up the "B" (in "{1:B}") format string.
The actual "B" string is allocated.
The resulting string for id.ToString("B", null) is allocated.
A new StringBuilder is allocated to build-up the "C2" (in "{2:C2}") format string.
The actual "C2" string is allocated.
The resulting string for value.ToString("C2", null) is allocated.
The overall resulting string is allocated.

13 allocations!

Note that there's also a StringBuilder instance used for the overall resulting string that I left out of the allocation count as this allocation is mitigated by the use of the internal StringBuilderCache.

These allocations can be reduced significantly!

A) Generic overloads would reduce this by 3, down to 10 allocations.
B) There is some low-hanging fruit inside the String.Format (StringBuilder.AppendFormat) implementation that would avoid the StringBuilder allocations for the argument format strings, using string.Substring for the common case and falling back to a single StringBuilder instance only when necessary (more on this below), which would reduce the allocations by another 3, down to 7 allocations.
C) Using an alternative design to IFormattable (even if just used internally for the built-in BCL types) could reduce this by another 6 allocations, down to just 1 allocation!

Down to just 1 allocation for the resulting string!

A) Generic overloads

Avoids the boxing allocations. Pretty straightforward.

B) Low-hanging fruit (avoiding/mitigating the internal `StringBuilder` allocations)

Right now with the current implementation, a new StringBuilder is allocated for each argument that has a format string associated with it (e.g. the "C2" in "{0:C2}"). This can be trivially improved by simply reusing the StringBuilder instance for multiple args. It involves just moving the StringBuilder variable to the top of the loop, still lazily allocating it, clearing it when finished, and reusing it for the next argument that needs it. However, even better, in the common case when there aren't any escaped braces in the argument's format string (e.g. "{{" and "}}") then a StringBuilder isn't needed at all; the more light-weight String.Substring can be used.

C) `IFormattable` alternative

Optimizing String.Format (StringBuilder.AppendFormat) is ultimately limited by the design of IFormattable, and to a lesser extent the less commonly used ICustomFormatter.

public interface IFormattable
{
    string ToString(string format, IFormatProvider formatProvider);
}

This design is perfectly reasonable, but there are a couple issues that prevent optimizations inside String.Format (StringBuilder.AppendFormat):

The format parameter is typed as string, which means an instance of string must be allocated for format, as opposed to accepting a StringSpan.
The result is a string, which means an instance of string must be allocated for the result, as opposed to appending the resulting characters to a destination StringBuilder.

An alternative design might look something like the following:

public interface IFormattableTo // (no pun intended, heh)
{
    void FormatTo(StringSpan format, IFormatProvider formatProvider, StringBuilder destination);
}

It may not be worth exposing such an interface publicly (at least for now) as it would basically be a replacement for IFormattable and it requires the StringSpan or Span<char> type, which hasn't landed yet. But it could be included as internal for now and implemented on all the built-in BCL types that currently implement IFormattable for improved String.Format perf for these very commonly used types.

MikePopoloski commented 9 years ago

Shameless plug of my own work: https://github.com/MikePopoloski/StringFormatter Might be useful to mine for ideas.

I too needed string formatting without allocations. It'd be nice to take advantage of some of that dedicated code built into the CLR to make it faster, but it does alright on its own as pure C#.

justinvp commented 9 years ago

@terrajobst, Immo, this is an API suggestion. Could this be added to the agenda for an API review? Even if there's some opposition to the String and StringBuilder overloads, I think the FormattableStringFactory overloads are a no-brainer (would have been added for 4.6, but ran out of time).

chintsan commented 9 years ago

Hi Justin, thanks for the explanation ans benchmarks.

how does one include the above overloads. i mean defining a new class "System" throws error - no body implementation for the overload methods or they arent defined as abstract, partial, extern etc.

NickCraver commented 8 years ago

I know it's been a while, but we'd love to see this over at Stack Overflow - @terrajobst can you please ping here when the API review is up next so we can listen in?

terrajobst commented 8 years ago

@justinvp

Could this be added to the agenda for an API review?

Generally yes, I believe with have everything we need. However, we're currently focusing on getting RTM ready, which means we're focusing our reviews on exposing APIs that already exist in .NET Framework to simplify porting as well as adding new APIs that are required to unblock other components and tools. That's why it's currently in the Future milestone. I hope that we'll be able to move some of these to a post-RTM milestone.

AdamSpeight2008 commented 8 years ago

We could get the maximum .ToString() of most of core types. Eg .MaxValue.ToString.Length + 1

ApproxMaxCharLength<T>( obj : T ) : Int
{ 
  return match( typeof( T ) ) with
  {
    | string -> ((string)obj).Length; 
    |   bool ->  5; //  false
    |   byte ->  3; //  256
    |  sbyte ->  4; // -128
    |   char ->  1; // 
    | uint16 ->  6; //  65535
    |  int16 ->  6; // -32768
    | uint32 -> 11; // -2147483648 
    |  int32 -> 10; //  4294967295
    | uint64 -> 20; //  18446744073709551615
    |  int64 -> 20; // –9223372036854775808
    |    ... ->  0;
    // not done the floating point ones.
  }
}

An initial first approximation of the string length, could be calculated thus.


var intialSize = args.Sum( ApproxMaxCharLength ) + TextOfFormatString.Length;

I think this would be an over allocation. Note this is excluding any variation caused by the culture and additional alignment and format specifiers on the arg hole eg { x , -16 : X4 }

At which point (not done any deep analysis) would the extra processing to get a good estimate of the size of string to allocate, outweighs the processing cost of just blindly doing the building of the string?

gafter commented 8 years ago

@AdamSpeight2008 This issue is mainly about the allocations due to boxing.

terrajobst commented 8 years ago

The APIs look fine, however, the only way we could benefit from this work is if our implementation avoids the boxing all the way down, which would be non-trivial amount of work. Also, it's questionable without a benchmark how big the savings are, because formatting will probably cause a bunch of allocations anyway.

It might be better to leave these APIs as-is and instead use the new formatting APIs that @KrzysztofCwalina is working on.

@KrzysztofCwalina, what are your thoughts on this?

NickCraver commented 8 years ago

Remember that the benefits here are extreme because just by recompiling, you get them. That huge, free adoption should be factored into the effort costing. This helps all code, rather than just new code. I don't see string.Format() or $"" going anywhere for a long time.

Perhaps I'm missing something though - doesn't $"" continue to map to string.Format in future plans as well, or is that being re-pointed to some new API? That'd be a pretty extreme breaking change, but worth asking...

gafter commented 8 years ago

$"..." always maps to an invocation of string.Format (if the target type is string).

KrzysztofCwalina commented 8 years ago

I like the improvement in-place for the reson @NickCraver spells out. I think we should add these APIs. The only concern I would have is that it might cause Generics instantiation explosion that runtime people are really concerned about lately. @jkotas?

BTW, the new formatting APIs @terrajobst mentioned are described at https://github.com/dotnet/corefxlab/wiki/System.Text.Formatting. But I think an in-place addition will be complementary.

jkotas commented 8 years ago

This should be data-driven decision. It is unclear to me whether saving extra allocations is worth it to justify the extra code in this case. String.Format will always allocate and it will always be relatively slow API. It is unclear to me whether changing it to allocate a bit less garbage at the cost of pile of extra code is going to measurable improve anything important.

benaadams commented 8 years ago

The trouble is string interpolation is too good a feature not to use, even for those of use that normally try to avoid allocations at all costs... (though normally used in lesser taken paths like exception handling)

jkotas commented 8 years ago

Agree. In the above @justinvp example, there 13 allocations. 9 of them are internal. It should be possible to eliminate all internal ones without changing the shape of the API, so that should be done first.

karelz commented 8 years ago

Next step: Improve current memory allocations of existing APIs as much as possible -- the result will tell us if we still need new APIs.

NickCraver commented 7 years ago

How would we ever eliminate the boxing here without the API additions? I agree with internal eliminations being the first step, but if we're after absolute performance, I don't see a way around the second step.

jkotas commented 7 years ago

we're after absolute performance, I don't see a way around the second step.

The no-allocation formatters being developed in corefxlab are going after absolute performance.

String.Format will always allocate the string. The string is likely short-lived garbage - it is unlikely that it is stored for long. I do not think that the difference between allocating a string vs. allocating a string and a box for each argument is going to be that significant.

JonHanna commented 7 years ago

(Aside regarding the original proposed API rather than the current discussion, would public static string Format<T>(string format, params T[] args); be useful or overkill? It's not rare to have to output a bunch of integers or a bunch of dates or otherwise format several items of the same type).

NickCraver commented 7 years ago

@jkotas I agree on String.Format(), but our much more common use case (localization in views, etc.) is on the StringBuilder.AppendFormat() side of the fence, which has more to win here overall. An example of this is in ASP.NET views, especially when we're doing localization.

Either way, we need to measure. The trouble is measuring this accurately means really doing it for comparison...to see if it's worth doing. Vicious circle we have here :)

jkotas commented 7 years ago

The trouble is measuring this accurately means really doing it for comparison

Right, and doing it is non-trivial amount of work because of the current formatting implementation has heavy dependency on boxing.

george-polevoy commented 5 years ago

The new Format(Span<char> span, out int length) family of functions is what makes the fastest implementation possible, using zero alloc techniques.

Rewrite $"prefix{x}postfix"

to

var resultBuffer = stackalloc new char[<just enough for the value types in question> plus the string constant];
var buffer = resultBuffer;
int length;
"prefix".CopyTo(buffer);
buffer = buffer.Slice("prefix".Length);
arg1.Format(buffer, out length);
buffer = buffer.Slice(length);
"postfix".CopyTo(buffer);
resultBuffer.Overlaps(buffer, out var elementOffset);
return new string(resultBuffer.Slice(0, elementOffset));

I understand that this would totally break compatibility, so this could be an extension to a future C# version, like:

$ <interceptor expression> "{x}"

The idea is to specify a storage for the resulting output. Anything that can be possibly used to store a string result would be a good fit for the interceptor.

$ stackalloc char[] "{x}"
$ stackalloc byte[] "{x}" // for utf8

// ...
$ new StringBuilder(100) "{x}"
$ existingStringBuilder "{x}"

Another possibility to avoid boxing would be to call ToString() on each of the arguments in-place, and also provide an overload which takes a span of strings instead of array of objects.

joperezr commented 4 years ago

Not sure why is this one flagged for 5.0, @jkotas @terrajobst I suppose that this one we probably want to push to future instead right?

davidfowl commented 3 years ago

Did we ever make progress on this? Did we remove the 9 internal allocations here? I was looking around at issues for improving string.Format and this one keeps popping up in my searches. I'm also wondering why there isn't a method that "compiles" the template (sorta like a compiled regex):

FormattableString<string> formatter =  FormattableString.Create<string>("Hello {name}");

formatter.Format("Joe");
formatter.Format("John");
formatter.Format("Paul");

terrajobst commented 3 years ago

@stephentoub can correct me, but AFAIK we solved a related problem in the BCL a while ago to take advantage of specialized params overloads for a fixed number of arguments to avoid allocating the array. Not sure we made any progress for avoiding allocations due to boxing of arguments. If we do this, we likely want to support FormattableString. Depending on how Roslyn binds to the factory, this might or not might not need compiler changes.

I'm also wondering why there isn't a method that "compiles" the template (sorta like a compiled regex)

I don't think the use case ever came up. For 99% of cases, callers have the format as a string and the arguments. Not saying it's wrong, but it's certainly not very convenient.

stephentoub commented 3 years ago

What I proposed (https://github.com/dotnet/runtime/issues/28945), and what @jaredpar's been running with as a potential C# 10 feature (https://github.com/dotnet/csharplang/issues/2302), is the idea of a target-typable custom builder, with the C# compiler generating Append calls for each string segment and each hole. For example, if you had a method:

static void Foo(MyCustomBuilder builder);

and someone called that with:

ReadOnlySpan<char> name = ...;
int age = ...;
Foo($"Hi.  My name is {name}, and my age is {age}.");

the compiler would translate that into something along the lines of:

var b = MyCustomBuilder.Create();
b.Append("Hi. My name is ");
b.Append(name);
b.Append(", and my age is ");
b.Append(age);
b.Append(".");
Foo(b);

MyCustomBuilder would at a minimum expose an Append overload for object, but it could also expose overloads for for string, for int, for ReadOnlySpan<char>, for T, and for whatever other types it wants to optimize for.

geeknoid commented 3 years ago

I have a proposed for a considerably faster solution to string formatting, which avoids boxing of arguments in the common case and only performs a single allocation for the final string. It's coming at around 2x faster than String.Format or string interpolation in the common case, plus since it doesn't create temporary garbage it's nicer on the GC in the long run.

See https://github.com/geeknoid/FastFormatting.

If folks are interested, I'd be happy to submit a PR and get this integrated in the system. The code that's there now is mostly feature complete, but it lacks full testing and docs. I'll address these shortcomings in the next few days.

PS: @stephentoub, you might recognize the concept from CompositeFormatString in that previous project we worked on years ago :-)

stephentoub commented 3 years ago

@geeknoid, what are the advantages of that over my cited proposal? From a usability perspective, I'd prefer being able to write Foo($"Hi. My name is {name}, and my age is {age}.") rather than private static readonly StringFormatter s_nameFormat = new StringFormatter("Hi. My name is {0}, and my age is {1}."); ...; Foo(s_nameFormat.Format(name, age));, and I don't believe the proposed StringFormatter works well with spans, which can't be used as generic arguments. The cited proposal also similarly avoids all boxing (when the builder exposes an Append<T> or an Append method specific to the type), only allocates for the final string (if a string is even needed... a consumer could just access a span from the builder), etc., plus it not only precomputes the holes, but does so at compile time.

davidfowl commented 3 years ago

How do format specifiers work in your proposal @stephentoub ?

stephentoub commented 3 years ago

How do format specifiers work in your proposal @stephentoub ?

We'd define a pattern for Append methods, where the first argument is the thing to be formatted, if there's a second it's a format string, etc.

geeknoid commented 3 years ago

I've now added support for a Span-based TryFormat version which results in no allocations. This version is over 4x as fast as String.Format and induces no garbage. After 20 years, there's finally a formatting function in .NET that can compare in perf with C's sprintf :-)

Here are the benchmark results

|                      Method |     Mean |    Error |   StdDev |     Gen 0 |     Gen 1 |    Gen 2 |  Allocated |
|---------------------------- |---------:|---------:|---------:|----------:|----------:|---------:|-----------:|
|     TestClassicStringFormat | 46.02 ms | 0.865 ms | 1.515 ms | 2909.0909 | 1181.8182 | 363.6364 | 16800057 B |
|           TestInterpolation | 47.91 ms | 0.943 ms | 1.523 ms | 3000.0000 | 1250.0000 | 416.6667 | 16800523 B |
|         TestStringFormatter | 25.59 ms | 0.508 ms | 0.903 ms | 2187.5000 |  906.2500 | 281.2500 | 12000025 B |
| TestStringFormatterWithSpan | 10.86 ms | 0.182 ms | 0.224 ms |         - |         - |        - |        4 B |

geeknoid commented 3 years ago

@stephentoub The benefits to my proposal is that:

It doesn't require any significant new patterns across types, it leverages IFormattable and ISpanFormattable
I believe it is both smaller IL and faster than what you propose.
The implementation exposes a span-based version which completely eliminates all allocations.
It's fully implemented and can be played around with.
It works with IFormatProvider like String.Format

I think the C# compiler can be taught about this type and magically generate the static StringFormatter instances for any use of string interpolation. Thus, existing code would just suddenly get faster by recompiling. Additionally, I think there could be a Roslyn code fixer which offers to convert any use of String.Format with a static format string into an interpolated string. Those cases would also then magically benefit from improved perf.

Also I think my implementation for a non-boxing version of the ParamsArray type could be used to eliminate boxing in existing APIs like String.Format, making those a touch faster too.

En3Tho commented 3 years ago

@geeknoid can you share benchmark code or link it please? While this is kinda off topic but I feel that there might be a way to make faster F# printf with things you're proposing.

danmoseley commented 3 years ago

@En3Tho looks like it's at https://github.com/geeknoid/FastFormatting

stephentoub commented 3 years ago

The benefits to my proposal is that

Thanks.

I believe it is both smaller IL and faster than what you propose.

I expect it would be smaller IL. I'm skeptical it'd be meaningfully faster for creating strings. And it would almost certainly result in more asm, in particular as every unique combination of arguments involving a value type would result in generic specialization for all of the generic pieces, and that's likely to be non-trivial; in contrast, with the builder approach, you only end up worst case with a generic-specialization per type formatted rather than per combination of types formatted.

The implementation exposes a span-based version which completely eliminates all allocations.

I agree the TryFormat aspect of it is nice. With the builder approach, you can get that as well, but it involves a copy.

In both approaches, it's also hard to claim all allocations are completely eliminated, as the moment you need to utilize IFormattable, you're likely incurring string allocations for each argument being formatted, only not when the instance is able to utilize a cached string.

I also don't see how the proposed approach supports arbitrary numbers of arguments without allocation. At some point you need one more argument than can be supported by the explicit strongly-typed overloads provided, and you end up with the params object[] overload. One of the nice things about the builder approach is it supports any number of arguments strongly-typed, as each argument is formatted independently with a single Append call.

It doesn't require any significant new patterns across types

This also hints at one of its biggest flaws IMO, but maybe you have a good solution for this. I think anything new we do in this space must support formatting ReadOnlySpan<char> as arguments. One of the nice things about the builder approach is this is easily supported just by having an Append(ReadOnlySpan<char>) overload on the builder. I don't see how this is easily supported in your proposal, as spans can't currently be used as generic arguments. How would you propose to support span being used as one or more of the arguments?

Also I think my implementation for a non-boxing version of the ParamsArray type could be used to eliminate boxing in existing APIs like String.Format, making those a touch faster too.

With the existing overloads? I don't see how, since the boxing is inherent to the existing APIs. It's also inherent to the existing implementation, which relies heavily on being able to pass the arguments around as object. If you're suggesting new generic string.Format overloads like those proposed in the very first post, that would also require plumbing that through the whole formatting implementation; I'd be interested in understanding what that implementation would look like, and similar to my previous comments, what impact that would have on bloating asm size in a typical apps (especially for AOT compilations, Blazor, Xamarin apps, etc., where presumably existing usage would start binding to the new more-strongly-typed overloads).

geeknoid commented 3 years ago

Yes, you've found all the flaws of the current implementation :-). Some more thoughts:

It's only allocation-free up to 3 arguments. This could be extended to a larger number, but there's clearly an upper bound. With the way the API is designed, the first 3 arguments are captured in a strongly-typed fashion and the rest are in a params object?[]
I made a specific point of keeping the code that has multiple generics as simple as possible. There is actually very little code in that regime, the generic code calls out ASAP to non-generic functions when possible. I learned my lesson about the impact of multi-generic types when we tried to implement a 4-parameter finger tree years ago. It was enormous! Anyway, I think the amount of code generated is within reason.
Of course, if the implementation calls out to an IFormattable instance, an allocation will happen as a result. There's no getting around that given how IFormattable works. The code deals with all primitive types without boxing however. It would be good to make ISpanFormattable public so more types could implement it and improve formatting efficiency in general.
My design works, today, in any version of .NET that supports Span without requiring any changes in the system. I think that makes it pretty compelling to adopt immediately given the substantial perf improvement it delivers.
Don't underestimate the value of the 2x perf boost you get with TryFormat. This is huge. We have a lot of services that do a lot of string formatting and enabling them to be 0 alloc in the common case would likely lead to measurable COGs savings.

I can't think of a way to support ReadOnlySpan in my current design, just like it can't be supported by any of the existing formatting code in .NET. That doesn't make me too sad though. The solution there is to fix the GC to allow Span in the heap :-)

geeknoid commented 3 years ago

@stephentoub So what I've done in my stuff is three things:

Extend ValueStringBuilder to support Append methods for the various primitive types. These overloads support argument formatting.
Create the machinery to preparse a composite format strings to a dense form that can be interpreted faster during formatting and provide a formatting loop that leverages this.
Implement a hack to prevent the params array allocation and boxing for the first 3 formatting arguments.

Exposing the extended ValueStringBuilder as a public type would be sufficient to implement the builder pattern you describe. String interpolation would immediately benefit. This could support all the normal formatting features (width, justification, etc)

Exposing ValueStringBuilder would make it possible for developers to implement 0-alloc string concatenation by supplying a span to the constructor of ValueStringBuilder. It's a bit clumsy, but it'll be quite fast.

To let developers do 0 alloc string formatting, then you can add the StringFormatter type with its TryFormat method. Or, you could get fancy and enhance string interpolation with new syntax such as:

Span<char> target = new Span<char>(new char[256]);
target <= $"Hello {name}"

This would let the C# compiler pass the span to the ValueStringBuilder constructor and voila, slick no-alloc string formatting.

En3Tho commented 3 years ago

That syntax is too vague in this case. Should it return bool and chars written, should it only return chars written and -1 in bad case, or should it completely hide details and just suddenly blow up with our of memory if anything goes wrong? I think that sticking to Method calls is better here. Also, I think the problem here (as whole) is that you need a compiler support to make code as efficient as possible and the way of the builder seems better as it can be easily integrated to support any number of parameters and so on. But at the same time I can't really understand how builder will solve explicit Format calls. Like for example when you programmatically build format and then call the function. @geeknoid while I think that making a specialized builder calls is probably better in case of known constant format and known arguments, maybe your approach could help the second case?

geeknoid commented 3 years ago

@stephentoub I've updated the design at https://github.com/geeknoid/FastFormatting. From the README:

The StringMaker type is a supercharged version of the classic StringBuilder type. It is designed for efficiently building up a string or span by appending together bits and pieces. The implementation tries to avoid any memory allocations when it can, and if you supply your own span to the constructor, it can operate entirely without allocating memory. When used in this memory, totally performance is around 4x faster than using StringBuilder.
The StringFormatter type is built on top of StringMaker and is designed to support the normal .NET composite formatting model, such as you use with String.Format. The type splits the normal formatting step in two in order to boost performance of the formatting process. The first step is parsing the composite format string into an efficient form. This step is done by create a StringFormatter instance. Once you have this instance, you can use it to format a set of arguments by calling the Format method, which is about 2x as fast as String.Format. You can also use the TryFormat method to format directly into your own span, which is 4x faster than String.Format.

This new code maintains the same perf characteristics as I had previously. The code has nearly 100% code coverage and compatibility with classic String.Format semantics is explicitly tested for.

I think StringMaker can be used directly by the C# compiler to implement efficient interpolation with a single allocation for the final string. Some magic C# syntax can also enable 0 allocation by outputting to an existing span. The StringFormatter type on its side is useful as a faster replacement of String.Format for when interpolation is not an option (for localized text for example).

How's that sound?

geeknoid commented 3 years ago

Latest perf tests. These are all trying to format the same set of inputs and produce the same final string:

|                      Method |      Mean |     Error |    StdDev |     Gen 0 |     Gen 1 |    Gen 2 |  Allocated |
|---------------------------- |----------:|----------:|----------:|----------:|----------:|---------:|-----------:|
|     TestClassicStringFormat | 44.866 ms | 0.8916 ms | 1.4897 ms | 3000.0000 | 1250.0000 | 416.6667 | 16800539 B |
|           TestInterpolation | 45.272 ms | 0.8490 ms | 1.5524 ms | 3000.0000 | 1250.0000 | 416.6667 | 16800515 B |
|           TestStringBuilder | 49.877 ms | 0.9750 ms | 1.3016 ms | 2888.8889 | 1111.1111 | 333.3333 | 16800055 B |
|         TestStringFormatter | 25.101 ms | 0.5015 ms | 0.9170 ms | 2187.5000 |  906.2500 | 281.2500 | 12000022 B |
| TestStringFormatterWithSpan |  9.574 ms | 0.1785 ms | 0.1490 ms |         - |         - |        - |        4 B |
|             TestStringMaker |  7.308 ms | 0.1203 ms | 0.1564 ms | 2859.3750 |         - |        - | 11999924 B |
|     TestStringMakerWithSpan |  5.043 ms | 0.0988 ms | 0.1832 ms |         - |         - |        - |        2 B |

stephentoub commented 3 years ago

I think StringMaker can be used directly by the C# compiler to implement efficient interpolation with a single allocation for the final string

Isn't StringMaker essentially just ValueStringBuilder, and basically the same builder approach I was proposing? The only thing that differs is I was suggesting a pattern in addition, but that's just to enable additional scenarios, namely an API that wants to take a particular builder so as to a) have raw access to its underlying buffer and b) have control over the Append methods exposed so as to have more control over the formatting. But we'd have a built-in implementation of it that the compiler could use when targeting string, such that existing string interpolation just go more efficient in its implementation without semantic changes. So, I'm not clear on fundamentally what's different here? (Note that if we do go with such an approach, such a type would likely live in the compiler services namespace, and would be discouraged for general direct use: in general we're wary of exposing structs that wrap pooled resources, even if ref structs, that would make it easy to accidentally copy them and dispose multiple times, double-freeing a buffer.)

The StringMaker type is a supercharged version of the classic StringBuilder type.

As noted above, this seems like it's just ValueStringBuilder. If from an implementation perspective there are things you've done in your implementation that you've found to be more efficient, please feel free to submit a PR to improve ValueStringBuilder's implementation / reduce its overheads.

How's that sound?

I think you're now proposing two separate things:

Basically what I proposed (waving hands at minor API-level differences here and there)
In addition a separate StringFormatter class, with the primary goals of a) supporting the existing .NET formatting model pre-interpolated strings but with as much precomputed as possible and with generic arguments and b) exposing a TryFormat.

Am I understanding correctly?

stephentoub commented 3 years ago

ps I'm on a stay-cation now and will be slow to respond until the new year. Thanks, and happy holidays.

dotnet / runtime