dotnet / csharplang

The official repo for the design of the C# programming language
11.3k stars 1.02k forks source link

[Proposal] Allow foreach for IEnumerator/IAsyncEnumerator #3867

Closed Kir-Antipov closed 3 years ago

Kir-Antipov commented 3 years ago

For the sake of simplicity, I'll only talk about IEnumerator, but this issue also applies to the IEnumerator<T> and IAsyncEnumerator<T>

The problem

Let's imagine that we need to treat the first element of the enumeration differently than the rest. How can we do this today?

1. Condition inside the foreach-loop:

bool isFirst = true;
foreach (var x in enumeration)
{
    if (isFirst)
    {
        isFirst = false;
        DoSomeWork(x);
    }
    else
    {
        DoAnotherWork(x);
    }
}

Pros:

Cons:

2. Reevaluating an enumeration:

DoSomeWork(enumeration.First());
foreach (var x in enumeration.Skip(1))
    DoAnotherWork(x);

Pros:

Cons:

3. Extension method that wraps IEnumerator in new IEnumerable

public static IEnumerable<T> ToEnumerable<T>(this IEnumerator<T> enumerator)
{
    _ = enumerator ?? throw new ArgumentNullException(nameof(enumerator));

    while (enumerator.MoveNext())
        yield return enumerator.Current;
}

...

using IEnumerator<object> e = enumeration.GetEnumerator();
DoSomeWork(e.ToEnumerable().First());
foreach (var x in e.ToEnumerable())
   DoAnotherWork(x);

Pros:

Cons:

confused-jackie-chan

4. Hello darkness boilerplate, my old friend:

IEnumerator e = enumeration.GetEnumerator();
try
{
    if (!e.MoveNext())
        throw new InvalidOperationException()
    DoSomeWork(e.Current);
    while (e.MoveNext())
        DoAnotherWork(e.Current);
}
finally
{
    if (e is IDisposable disposable)
        disposable.Dispose();
}

Pros:

Cons:

Possible solution

foreach needs an IEnumerator to work, right? So why not let him work with it directly and without intermediaries? This would allow us to do things like this:

using IEnumerator<int> e = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }.Cast<int>().GetEnumerator();

// .TakeNext - LINQ-like extension method, that returns new enumerator,
// which can be implemented as a structure and be completely allocation-free 
foreach (int x in e.TakeNext(1))
    Console.WriteLine("First block:  {0}", x);

foreach (int x in e.TakeNext(2))
    Console.WriteLine("Second block: {0}", x);

foreach (int x in e.TakeNext(3))
    Console.WriteLine("Third block:  {0}", x);

foreach (int x in e)
    Console.WriteLine("Last block:   {0}", x);

// First block:  1
// Second block: 2
// Second block: 3
// Third block:  4
// Third block:  5
// Third block:  6
// Last block:   7
// Last block:   8
// Last block:   9
// Last block:   10
HaloFour commented 3 years ago

foreach needs an IEnumerator to work, right?

No, foreach requires GetEnumerator that returns a type that fits the enumerator pattern. You aren't required to implement the interfaces, just surface compatible methods. You can already write an extension method for IEnumerator<T> today that does exactly what you're requesting.

Kir-Antipov commented 3 years ago

You can already write an extension method for IEnumerator today that does exactly what you're requesting

@HaloFour, yes indeed. And it'is completely different from what I'm talking about in 3

333fred commented 3 years ago

No, that is not what you talk about in 3. HaloFour means that, with C# 9, you can write this extension method:

public IEnumerator<T> GetEnumerator<T>(this IEnumerator<T> t) => t;

and it will just work :tm:

HaloFour commented 3 years ago

@Kir-Antipov

Unnecessary allocation

No allocations necessary, enumerables/enumerators can be structs and do not need to implement the interface.

It's not built-in feature, so we must reinvent this from project to project

This is an uncommon use case and can be solved with a library method with little/no overhead and very little code. There is very little reason to consider this as a language change. You're not even proposing a language change, you're proposing a new API, one that you can write today in a dozen or so lines of code. If it's that useful, publish it to NuGet. I wouldn't be shocked if there is an implementation up there already.

Really strange logic: turn an enumerator into an enumerable so that foreach can get an enumerator from it...

If it's strange then why are you asking to do it? Turning an enumerator back into an enumerable is exactly what you're asking for here.

SharpLab

Kir-Antipov commented 3 years ago

No, that is not what you talk about in 3. HaloFour means that, with C# 9, you can write this extension method

@333fred, yes it is, it's not that different. We still have these cons:

CyrusNajmabadi commented 3 years ago

It's not built-in feature, so we must reinvent this from project to project

Extremely uncommon things work that way. You can just create a simple nuget lib for this one and reference it from an the projects you need.

Kir-Antipov commented 3 years ago

No allocations necessary

@HaloFour, yep, that's true, but the simplest way to achieve this will cause allocation. How often do developers you know (including you) choose the best way instead of the simplest and most obvious?

This is an uncommon use

I disagree. It's very common use: do thing0 with that part of enumeration, do thing1 with this part of enumeration etc

Turning an enumerator back into an enumerable is exactly what you're asking for here

Meh. Yes, but no. Into an enumerable, but not into IEnumerable (or any other object that supports GetEnumerator)

CyrusNajmabadi commented 3 years ago

Really strange logic:

It doesn't seem strange to me... It sounds like exactly what you're asking for. The ability to foreach over it. That's how the language foreachs. You provide this pattern and it works. That's why we designed y it this was tbh.

333fred commented 3 years ago

Let's imagine that we need to treat the first element of the enumeration differently than the rest.

I also think you're missing probably the simplest version of this (and with Span<T>, it's even no copy!):

DoWork(enumeration[0]);
foreach (var element in enumeration[1..])
{
    DoAnotherWork(element);
}
Kir-Antipov commented 3 years ago

@333fred, good luck to compile it:

IEnumerable<int> enumeration = Enumerable.Range(0, 10);
DoWork(enumeration[0]);
foreach (var element in enumeration[1..])
{
    DoAnotherWork(element);
}

:)

Kir-Antipov commented 3 years ago

Extremely uncommon things work that way

@CyrusNajmabadi, I don't think so. I need this in literally every real-world project...

You can just create a simple nuget lib for this one and reference it from an the projects you need

I'm already upset enough that I've to include System.Linq.Async as a separate package in each project. We're moving in some strange direction: we have cool mechanisms, but in order to work with them into that cool way we need to connect 20 different packages from NuGet ¯\(ツ)/¯ Will I live to see the day that we have to reference System.DateTime from NuGet?

alrz commented 3 years ago

Will I live to see the day that we have to reference System.DateTime from NuGet?

You already do . If DateTime doesn't meet your needs, you should probably use NodaTime.

Kir-Antipov commented 3 years ago

@alrz, hah, true :)

Kir-Antipov commented 3 years ago

Okay, will change my example to System.Decimal. You need it, but it's not in common use. So let's move it to separate NuGet

HaloFour commented 3 years ago

Okay, will change my example to System.Decimal.

I take it you don't work in the financial sector. That type is very commonly used.

Kir-Antipov commented 3 years ago

@HaloFour, note that I've italicized the word "common" :)

I'm just talking about the fact that, ok, maybe you bypass the need for this functionality, but this doesn't mean that it ain't important for others. I use decimal more often than double or float, but I know a lot of people that never used it at all. Does it mean we need to move decimal to the separate package for "guys from the financial sector"?

HaloFour commented 3 years ago

@Kir-Antipov

Nothing will be removed. That will break existing projects.

There was a bit of cleaning house in the move from .NET Framework to .NET Core as .NET Framework accreted a lot of types over the years. But even that had to be measured against how many projects it would block from migrating. Going forward the runtime teams are trying to avoid building everything into a single image and are less likely to accept API proposals that can be easily provided by an outside/NuGet library.

CyrusNajmabadi commented 3 years ago

we need to connect 20 different packages from NuGet

And...? :)

That seems like it's working exactly as intended. Not all functionality is core functionality. When something is trivially solvable by a one line library function that is trivial to distribute, there's no need for a language change. :)

Kir-Antipov commented 3 years ago

Nothing will be removed. That will break existing projects.

@HaloFour, yep, I know that perfectly.C# team agreed to apply the breaking change only once in my memory. It was a question from the category: "Let's imagine, that it won't. Would you?"

Kir-Antipov commented 3 years ago

something is trivially solvable by a one line library function

@CyrusNajmabadi, one word, 3 TiB, int.MaxValue references and ulong.MaxValue vulnerabilities: node_modules

CyrusNajmabadi commented 3 years ago

So let's move it to separate NuGet

If we were on 1.0 i would likely support that. But we're not :) The dsicussion is not about moving exisiting apis to nuget packages, but instead how new language features aren't really needed if they are trivially provided by a nuget package.

CyrusNajmabadi commented 3 years ago

@CyrusNajmabadi, one word, 3 TiB, int.MaxValue references and ulong.MaxValue vulnerabilities: node_modules

I don't know what any of that means :)

What is 3TiB in reference to? What do you mean by "int.MaxValue references?" What would having this API be in a nuget package have to do with vulnerabilities? I really don't understand what you're trying to say there. Can you clarify? Thanks!

CyrusNajmabadi commented 3 years ago

I need this in literally every real-world project...

Common for you is not the litmus. And, again, even if this were common for everyone, it's trivially solved for everyone with a simple function or nuget package. So why update the language when the appropriate mechanism for this sort of thing is alreayd available and would work well here?

Kir-Antipov commented 3 years ago

If we were on 1.0 i would likely support that. But we're not :)

@CyrusNajmabadi

Understandable

Kir-Antipov commented 3 years ago

@CyrusNajmabadi, ok, I've got your views about this kind of subjects :)

Kir-Antipov commented 3 years ago

I don't know what any of that means :)

@CyrusNajmabadi

I never liked node_modules and these are the reasons:

And it's kinda sad, that our ecosystem slowly turns into something node_modules like.

HaloFour commented 3 years ago

@Kir-Antipov

And it's kinda sad, that our ecosystem slowly turns into something node_modules like.

A healthy developer ecosystem is one where contributions are available from a lot of different sources. Microsoft shouldn't be expected to ship every bit of functionality out of the default libraries. Apart from it being a completely untenable strategy that will result in a mono-framework that is gigs in size it also precludes the useful contributions from others. In my opinion MS already ships too many frameworks and components. I honestly think that Java has a larger and more vibrant ecosystem because a lot of larger organizations came together to fill in the many gaps left by Sun and Oracle.

Kir-Antipov commented 3 years ago

@HaloFour

Let's start with the basics. There're two polar stages of ecosystems, both of which are completely unhealthy:

I am not expecting .NET be "mono-framework" or "no-framework" either, cause they're equally terrible.

For a long time .NET was perfectly balanced.

.NET was the healthiest ecosystem I've ever know, and it's one of many things I love about it. For the first time I felt a great disturbance in the force was when we got IAsyncEnumerable, but the Linq methods for it - did not. And since then, every project I have has a reference to System.Linq.Async, and every output build comes out 1 MB heavier.

As I understood from CyrusNajmabadi's words, we're moving to that side where our projects will be filled with hundreds of the references. And it disappoints me

HaloFour commented 3 years ago

@Kir-Antipov

and every output build comes out 1 MB heavier.

It would have always been 1 MB heavier. The question is whether or not you've be able to choose to be 1 MB lighter in those projects where you don't need System.Linq.Async. Being a separate package makes that possible. Being embedded in the runtime does not.

CyrusNajmabadi commented 3 years ago

it's about node_modules size. A lot of developers implement and reimplement similar auxiliary functionality for their libraries,

So contribute PRs that reduce that and help the ecosystem solidify on canonical impls.

there're a lot of "one one-line function" libraries

Ok?

I don't get the arugment? Should all one liners then get packaged into the core framework? You're just trading off one thing for another. Effectively an enormous class lib that attempts to pack in everything everyone needs for every situation.

Who creates these packages?

People who feel like there is value in the ecosystem having this function.

Who maintains them?

Those same people.

When functionality is excessively fragmented into different packages, this only leads to an increase in the number of vulnerabilities among them

When functionality is consolidated, now the team has to spend all their time just maintaining millions of single line functions.

You are stating we should take on this cost along with everything else, despite it being niche, and despite it being trivial for you to write and maintain yourself. Now consider we have millions of customers. all of them would love it if we did this for their helpers that they find value in. Us taking on thousands or millions of those cases is a burden we cannot handle. However, the ecosystem and community is vast and capable. By spreading this cost out to all of you, it actually can work.

CyrusNajmabadi commented 3 years ago

@Kir-Antipov Please keep the memes out. They're not helpful for this discussion.

CyrusNajmabadi commented 3 years ago

For a long time .NET was

It really wasn't. It was a split you found good for your needs. But that doesn't apply to everyone. There is always some group that wants their pet functionality in the core framework. And every group will always find that some of what they want doesn't apply to the ecosystem as a whole. the only maintainable approach is to have the core lib actually be subtantively valuable to the larger ecosystem as a whole, and have niche functionality be provided through third party libs. What you're finding now is that this is a case of such niche functionality, and you are now on hte side that wants that.

CyrusNajmabadi commented 3 years ago

As I understood from CyrusNajmabadi's words, we're moving to that side where our projects will be filled with hundreds of the references. And it disappoints me

If your projects have hundreds of niche needs, then yes, that is where you would end up. However, such projects sound like they would be exceedingly niche themselves. The vast majority of projects are not like that. They generally get most of their functionality from teh core lib, and then get the rest from ecosystem libs.

Kir-Antipov commented 3 years ago

Please keep the memes out. They're not helpful for this discussion.

@CyrusNajmabadi, I've already closed issue and discussion about the feature. And I just like speaking in memes, don't see anything criminal about it ¯\(ツ)/¯ But I'll keep memes out while talking directly to you, if you wish so

Kir-Antipov commented 3 years ago

It would have always been 1 MB heavier. The question is whether or not you've be able to choose to be 1 MB lighter in those projects where you don't need System.Linq.Async. Being a separate package makes that possible. Being embedded in the runtime does not.

@HaloFour, erm, not. Well, I'm user. I downloaded 100 apps that use System.Linq.Async. It gives me +100 MB of their total weight. If System.Linq.Async was embedded in the runtime it would be hundred times less.

CyrusNajmabadi commented 3 years ago

And I just like speaking in memes, don't see anything criminal about it ¯(ツ)/¯

Understood. But it's really not helpful or contributing to this discussion. :)

But I'll keep memes out while talking directly to you, if you wish so

A good way to think aobut this is that this is not a general fun-forum for random chats. It's an actual professional environment where people (including team members like myself) are engaging as part of our jobs and the work we do . Just as these memes wouldn't be appropriate as part of general emails on work topics, they're not appropriate here. So if you can please keep them to a minimum (or just elide entirely), that would be appreciated. Thanks! :)

CyrusNajmabadi commented 3 years ago

It gives me +100 MB of their total weight.

I'm not seeing an actual problem here. You're describing this as if this is actually something which needs a solution. But from my perspective there's nothing that needs to be done there. If many different projects want to reference their own particular sets of different package references... what is the issue? Why is that a bad thing?

Furthermore, again this is all happening in the context of a request to change the language. The nuget ecosystem working as intended and as expected isn't in any way a motivating case for me to now say "ok... everyone who has a one-off api need now can get that change into the language" :)

Kir-Antipov commented 3 years ago

So contribute PRs that reduce that and help the ecosystem solidify on canonical impls

You're talking about some kind of utopia where everyone does everything right, or help those who did it wrong though. As a resident of a country that has spent almost a century figuring out whether utopias work, I can responsibly declare that they do not

Ok?

It's not okay. If so many people have a need for a similar one-line functionality for built-in types that is needed in everyday use, then it must of course be represented by the built-in framework.

Should all one liners then get packaged into the core framework?

Of course they do not. Some of them should, some of them should not. E.g., packages like MoreLinq should be presented as separate libraries, but it would be a real disaster if Linq should be referenced through NuGet.

When functionality is consolidated, now the team has to spend all their time just maintaining millions of single line functions.

As I said before, there's no need to include everything into the core framework. It's just wrong. And yes, maintaining core framework is difficult task, but we trust core framework developers and there's really not so many chances for them to leave a vulnerability in the code due their skills and professionalism.

You are stating we should take on this cost along with everything else, despite it being niche, and despite it being trivial for you to write and maintain yourself. Now consider we have millions of customers. all of them would love it if we did this for their helpers that they find value in. Us taking on thousands or millions of those cases is a burden we cannot handle. However, the ecosystem and community is vast and capable. By spreading this cost out to all of you, it actually can work.

No, I'm not stating it xD It's great that developers can create their own libraries, put the helpers they use into packages and share them. But some things need to be built into the core framework. Somewhere there is a line that separates one from the other, and I'm honestly not ready to somehow formalize where exactly it is :) Cause of that I may seem to you (a fan of the idea of ​​splitting everything into pluggable packages, as I understood) as someone who wants to cram everything into one place. But this isn't the case. I'm somewhere in between these two ideas :)

CyrusNajmabadi commented 3 years ago

It's not okay. If so many people have a need for a similar one-line functionality for built-in types that is needed in everyday use,

But not so many people have this need.

then it must of course be represented by the built-in framework.

It really does not. Again, things need to be substantially valuable to make that bar. Just like they need to be subtantially valuable to make it into a language feature. A niche lang feature that is trivially solved with a one line niche API helper is absolutely not that. :)

CyrusNajmabadi commented 3 years ago

but it would be a real disaster if Linq should be referenced through NuGet.

I honestly don't see why. There are tons of useful, hugely used bits of functionality that come through nuget. You seem to have a presupposed position that that is not ok.

If i were doing Linq again today, I would likely fully support it, and all the Linq variants (like XLinq, etc.) come through nuget. It would make a ton of sense.

CyrusNajmabadi commented 3 years ago

Somewhere there is a line that separates one from the other, and I'm honestly not ready to somehow formalize where exactly it is :)

It's pretty simple: does this substantively improve things for the larger community out there. The runtime team makes this determination all the time and it doesn't seem particularly hard to gauge :)

I'm somewhere in between these two ideas :)

Sure. IN this case though, the decision seems trivially clear. This is def so niche that it should be external. It def should not be in the language. I can say that part at least for sure :)

Kir-Antipov commented 3 years ago

pet functionality in the core framework

As I said before, this kind of things is bad. But System.Linq.Async is not "pet functionality", cause we have built-in IAsyncEnumerable, but we don't have built-in tools for it. And it's strange. I'd even be ok with the idea of the whole IAsyncEnumerable thing as a NuGet package, but this separation looks kinda strange: it's built-in, but it's not at the same time

Kir-Antipov commented 3 years ago

A good way to think aobut this is that this is not a general fun-forum for random chats. It's an actual professional environment where people (including team members like myself) are engaging as part of our jobs and the work we do . Just as these memes wouldn't be appropriate as part of general emails on work topics, they're not appropriate here. So if you can please keep them to a minimum (or just elide entirely), that would be appreciated

I understood that, but personally, I don’t think that the workplace should always be something serious. Sometimes it can be "defused" with a joke or a meme. This is my opinion, but I respect those who don't think so and don't bother them with such things :)

Kir-Antipov commented 3 years ago

I'm not seeing an actual problem here. You're describing this as if this is actually something which needs a solution. But from my perspective there's nothing that needs to be done there. If many different projects want to reference their own particular sets of different package references... what is the issue? Why is that a bad thing?

And I didn't get why it's not a bad thing xD Not everyone has few TB on their laptops, e.g. :) One day I had 128 GB in total, and something like this would be a great problem for me as a user.

CyrusNajmabadi commented 3 years ago

One day I had 128 GB in total, and something like this would be a great problem for me as a user.

Sorry. That's just not a use case we optimize the language for. If your development environment cannot handle package sources, then that's not going to lead us to just put these in the language.

To be as clear as possible: it's not htat this argument carries little weight. It's explicitly that catering the language to this sort of case is a non-goal. :)

Kir-Antipov commented 3 years ago

If i were doing Linq again today, I would likely fully support it, and all the Linq variants (like XLinq, etc.) come through nuget. It would make a ton of sense

@CyrusNajmabadi, just what I'm talking about. It's great for developers but it's not so great for users. Imagine that everyone ships their apps with the whole .NET (referenced part by part through NuGets) onboard. I really want to understand your point of view and why it's not bad

If your development environment cannot handle package sources, then that's not going to lead us to just put these in the language

I'm talking not about development, I'm talking about using that comes out of the development :)

CyrusNajmabadi commented 3 years ago

Imagine that everyone ships their apps with the whole .NET (referenced part by part through NuGets) onboard.

Yes... i can imagine that. It makes a whole lot of sense to me. After all, i would want any particular app to be updatable independently from all other apps. So having the runtime and deps be versionable and isolated in this fashion seems like a high positive for me :)

I really want to understand your point of view and why it's not bad

Well, what would be bad about this approach?

I'm talking about using that comes out of the development :)

I would expect this approahc to, in general, decrease the size of any deployment due to not including lots of stuff you don't need, and only including the stuff you do.

CyrusNajmabadi commented 3 years ago

Note: at this point i think the discusison is better served in a venue like gitter.im/csharplang, or discord.gg/csharp .

spydacarnage commented 3 years ago

I remember a time when every application installed in Windows had a collection of shared DLLs - each one tried to install it and if they had a more up-to-date copy, it would overwrite the previous one, potentially causing other applications to break. And woe betide any time you tried to uninstall something that was shared...

These days, these same applications have their own set of DLLs, not relying on this shared space. It involved a hit on installation sizes, on users' hard drives, but I would never want to go back to how it was.

Each application having its own set of NuGet packages is just an extension to that, and it allows multiple applications to use multiple versions of the same NuGet package without any overriding or breaking of other applications.

Kir-Antipov commented 3 years ago

Well, what would be bad about this approach?

@CyrusNajmabadi, application's size. If everyone ships there apps with the same libraries, users will spent in total much more space than if there was a place where all these binaries are gathered together. No? Or am I missing something?

Note: at this point i think the discusison is better served in a venue like gitter.im/csharplang, or discord.gg/csharp .

Will keep in mind, but it's really last thing I want to know, so I don't think there's need to switch for the different platform for one message xD