dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.07k stars 4.69k forks source link

[System.Linq] Consider supporting `Enumerable.TryGetNonEnumeratedCount` in `Enumerable.Select(..., Func<T, int, TResult>)` and `Enumerable.Index` #102252

Open Emik03 opened 4 months ago

Emik03 commented 4 months ago

Consider the following code:

var buffer = new int[4096];
var index = 0;

Console.WriteLine(buffer.Index().TryGetNonEnumeratedCount(out _)); // False
Console.WriteLine(buffer.Select((x, i) => (i, x)).TryGetNonEnumeratedCount(out _)); // False
Console.WriteLine(buffer.Select(x => (index++, x)).TryGetNonEnumeratedCount(out _)); // True

All three enumerables produce identical output, yet it is only the last one that is responsible for printing True, the rest print False.

Looking at the source code, this is to be expected, since the implementations of Select(..., Func<T, int, TResult>) and Index use iterator methods.

My proposal is to create dedicated Iterator classes internally just as Select(..., Func<T, TResult>) does. I'm very willing to write that myself and contribute a PR for that, if this suggestion is approved.

The main advantage with this change comes from the fact that Select(..., Func<T, int, TResult>) is a very common operation — hence why Index has gotten a dedicated method in the first place — and allowing both to be countable without enumeration would result in a good amount of optimizations both within System.Linq (subsequent calls to Concat, ElementAt, Take,ToDictionary, and Reverse) and from any other library that makes use of TryGetNonEnumeratedCount.

Potential drawbacks include:

dotnet-policy-service[bot] commented 4 months ago

Tagging subscribers to this area: @dotnet/area-system-linq See info in area-owners.md if you want to be subscribed.

eiriktsarpalis commented 4 months ago

This would require converting the cited implementations to use iterators. It seems doable to me, but out of size concerns we've been reserving these for methods we deem to be high impact.

Emik03 commented 4 months ago

Would a compromise of just a single iterator class be acceptable then?

eiriktsarpalis commented 4 months ago

Presumably that would involve encoding Index on top of Select? That might improve Index().TryGetNonEnumeratedCount() but the hidden delegate invocation feels like it might regress more common use cases. We'd be happy to consider a PR, provided that it's accompanied with comprehensive benchmarking.

Emik03 commented 4 months ago

I was thinking about that, yeah. I'll get to it some time later today then, and post a PR with a link back to this issue once I find the best balance. Thanks!

colejohnson66 commented 4 months ago

Is size really a concern now that trimming and NativeAOT exist (or, will with .NET 9)?