dotnet / csharplang

The official repo for the design of the C# programming language

Next big C# language should focus on supporting building distributed system and concurrency programming #502

Closed asydneylover closed 4 years ago

asydneylover commented 7 years ago

The next version of the C# language should focus on making it easier to build distributed systems and write concurrent programs instead of introducing more syntactic sugar. That's the only way to make C# a truly major programming language compared with others such as Java. One feature I have in mind is native support for Go-style coroutines (goroutines).

CyrusNajmabadi commented 6 years ago

> @CyrusNajmabadi That's not "super trivial" at all in Go.

Seems super trivial to me. We use it all the time throughout our codebase. It's a core part of our design to be able to simply have channels of events that we then have code that processes whenever those events fire.

It is simply something i'm not sure how you'd write effectively in the first place on top of IEnumerables...

(I mean... maybe it's possible... but i'm just not understanding how...)
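
For readers who haven't used the Go pattern being described, here is a minimal sketch of "channels of events" consumed by one loop. The helper names `drain` and `produce` and the event strings are made up for illustration; this is not code from the codebase mentioned above.

```go
package main

import "fmt"

// drain reads from two event channels until both are closed and returns
// every event it saw. A nil channel is never ready in a select, so setting
// a closed channel to nil removes it from consideration.
func drain(clicks, keys <-chan string) []string {
	var seen []string
	for clicks != nil || keys != nil {
		select {
		case ev, ok := <-clicks:
			if !ok {
				clicks = nil
				continue
			}
			seen = append(seen, "click:"+ev)
		case ev, ok := <-keys:
			if !ok {
				keys = nil
				continue
			}
			seen = append(seen, "key:"+ev)
		}
	}
	return seen
}

// produce sends each event on a fresh channel from its own goroutine.
func produce(events ...string) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, ev := range events {
			ch <- ev
		}
	}()
	return ch
}

func main() {
	events := drain(produce("left", "right"), produce("a", "b", "c"))
	fmt.Println(len(events), "events:", events) // interleaving order varies
}
```

The `select` handles whichever source has data first, without dedicating a blocked OS thread to each source.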

masonwheeler commented 6 years ago

@CyrusNajmabadi If I had multiple concurrent input sources, I'd model them as IEnumerable<Task<T>> and implement something based on the principles found in this tutorial.

CyrusNajmabadi commented 6 years ago

But I will block my thread just trying to pull a single item off that IEnumerable. What if I don't have tasks to yield?

masonwheeler commented 6 years ago

It always immediately yields a Task, which waits for data to be available. If you reach the end of the data stream, the Task gets cancelled.

CyrusNajmabadi commented 6 years ago

I'm not really understanding the argument anyway. Channels behave fundamentally differently from these other platform options. What if I'm in a domain that maps more closely to that approach than to the other approaches? Why is it bad that I now have a platform-provided type to help me out? If I don't need it, I won't use it. If it's helpful though... now I have something I can use.

CyrusNajmabadi commented 6 years ago

> It always immediately yields a Task, which waits for data to be available.

Sorry, i'm not getting how this works. Again, i have multiple streams of data coming in. How are you modeling those streams? How does this task work in your example? How do you model being able to get the data off of any of them in a non-blocking fashion?

masonwheeler commented 6 years ago

[Summing up a long discussion on Gitter as to the details of this system]

> Again, i have multiple streams of data coming in. How are you modeling those streams?

I'm deliberately leaving the implementation details of building the tasks abstract, because I don't know the specifics of the data model you're using.

> How does this task work in your example?

In the appropriate manner for the data involved. Again, I'm deliberately keeping this example abstract because the implementation details vary and it doesn't particularly matter; for any data source in .NET which will eventually produce a result, it's trivial to wrap it in a Task one way or another. If nothing more convenient is available, you can always just fall back on TaskCompletionSource.

> How do you model being able to get the data off of any of them in a non-blocking fashion?

How are you currently doing it? Basically like that, but wrapped in a Task.
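
As a rough illustration of the "wrap it in a Task" idea, here is a sketch in Go rather than C#, using a one-shot buffered channel in the role that TaskCompletionSource plays in .NET. Everything named here (`fetchWithCallback`, `asFuture`, `startAll`) is hypothetical and stands in for whatever callback-style source is actually in use.

```go
package main

import (
	"fmt"
	"time"
)

// fetchWithCallback stands in for any callback-style data source
// (hypothetical; it just delivers a value after a short delay).
func fetchWithCallback(id int, done func(string)) {
	go func() {
		time.Sleep(10 * time.Millisecond)
		done(fmt.Sprintf("result-%d", id))
	}()
}

// asFuture wraps the callback source in a one-shot channel. The buffer of 1
// lets the callback complete even if nobody is receiving yet.
func asFuture(id int) <-chan string {
	out := make(chan string, 1)
	fetchWithCallback(id, func(v string) { out <- v })
	return out
}

// startAll returns immediately with one pending future per source.
func startAll(ids ...int) []<-chan string {
	futures := make([]<-chan string, 0, len(ids))
	for _, id := range ids {
		futures = append(futures, asFuture(id))
	}
	return futures
}

func main() {
	for _, f := range startAll(1, 2, 3) {
		fmt.Println(<-f) // each receive waits only for that one result
	}
}
```

Note that receiving in list order still waits on each item in turn; reacting to whichever source is ready first needs something like `select` (or `Task.WhenAny` in .NET).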

omidkrad commented 6 years ago

C# can do everything that Go does and gives more control over the code, but that greater control comes at the cost of having to know more constructs and writing more verbose code. I think what we really need is some good syntactic sugar for coroutines and channels to make them first-class citizens in the language. For example, we could have the <- operator as a shortcut for await c.Writer.WriteAsync(x), or some keyword such as co (or even go) so we don't have to manually "async all the way".

omidkrad commented 6 years ago

If async operator overloading were supported, I think we could simply overload some operator like << to emulate <- of Go. And for the co/go keyword, maybe we could overload some unary operator like ~ to wrap Task.Run(() => ...) for us. I know I'm just throwing ideas! :)

svick commented 6 years ago

@omidkrad

Wouldn't await (c << x); be good enough? That's something you could do today with normal operator overloading. EDIT: Turns out it's not possible to do this.

C# intentionally makes every await clearly visible, so I don't think async operator overloading is going to be added to the language.

omidkrad commented 6 years ago

Sure await (c << x) will do it 👍

Using operator overloading we should also be able to make a short syntax for running tasks in parallel (like the go keyword). For example, it could be something like these:

~ Say("hello");
go | Say("Hello");
go >> Say("Hello");

masonwheeler commented 6 years ago

> For example, we could have the <- operator

No, we really couldn't, because that already has a well-defined meaning that it's quite realistic to assume will be used in production code: "less than negative".

CyrusNajmabadi commented 6 years ago

Most of my 'go's are around 'funcs'. So the equivalent in C# (which seems fine with me) is just:

using static System.Threading.Tasks.Task;
//...
Run(() => {
    // all the work
});

Which is basically the same as:

go func() {
    // all the work
}()

Note: these have identical character counts, and I'm not sure I see any real need to make this much better.
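
To make the comparison concrete, here is a small runnable Go sketch of the `go func() { ... }()` form with a way to wait for it. The function name `sumInBackground` is invented for the example; waiting on the WaitGroup is only a rough counterpart to awaiting the Task returned by Task.Run.

```go
package main

import (
	"fmt"
	"sync"
)

// sumInBackground starts the work with `go func() { ... }()` and uses a
// WaitGroup to wait for it, roughly what awaiting the Task returned by
// Task.Run does in C#.
func sumInBackground(nums []int) int {
	var wg sync.WaitGroup
	total := 0
	wg.Add(1)
	go func() {
		defer wg.Done()
		for _, n := range nums {
			total += n
		}
	}()
	wg.Wait() // Wait returns only after Done, so reading total is safe
	return total
}

func main() {
	fmt.Println(sumInBackground([]int{1, 2, 3, 4})) // prints 10
}
```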

omidkrad commented 6 years ago

Yes, I have to agree it's not much improvement. This one is even more characters:

go >> () => {
    // all the work
};

This one is a little better:

~() => {
    // all the work
};

but I really like this:

await (c << x);

It would be great to have this come with the Channels API.

HaloFour commented 6 years ago

@omidkrad

Per the C# spec the second operand of an overloaded bitshift operator must be an int.

masonwheeler commented 6 years ago

@HaloFour Probably to prevent exactly this sort of abuse of operator overloads. (See: C++ streams)

CyrusNajmabadi commented 6 years ago

> This one is a little better:

Saying "Run" seems fine to me. I don't see any real value in trying to condense that down any further to a specific character. I mean, it's not like 'go' avoids saying the word 'go' itself. When there's already a very clean and easy way to do things, i don't think there's tremendous value in going overboard on syntactic brevity.

omidkrad commented 6 years ago

I agree. I'm withdrawing my suggestion! :)

masonwheeler commented 6 years ago

> When there's already a very clean and easy way to do things, i don't think there's tremendous value in going overboard on syntactic brevity.

Agreed. 2 mch brvt mks thgs hrd 2 nrstnd!

MI3Guy commented 6 years ago

@HaloFour Challenge accepted.

https://gist.github.com/MI3Guy/aa8491634410beabbdb6bd042a2ca647

CyrusNajmabadi commented 6 years ago

That's hilarious.

masonwheeler commented 6 years ago

Wow, that's kind of horrifying!

yaakov-h commented 6 years ago

“kind of”??

scalablecory commented 6 years ago

I love language wars as much as the next guy, but at what point do we agree that a topic has become an exercise in trolling and bikeshedding and close it...

CyrusNajmabadi commented 6 years ago

@scalablecory Sometimes it is good to have a honeypot.

omidkrad commented 6 years ago

> Next big C# language should focus on supporting building distributed system and concurrency programming

So the conclusion is: "No, C# already has async/await, use relevant libraries to do complex scenarios"?

Please vote up/down.

scalablecory commented 6 years ago

@omidkrad that's not how this works. we vote on specific proposals, not vague unsubstantiated musings.

omidkrad commented 6 years ago

Microsoft once had the CCR/DSS toolkit for programming concurrent/distributed software services, but it was not easy for the average developer to use. With async/await, C# has come a long way from there, but I still believe it should be a lot easier to create concurrent and distributed software. If distributed software were more natively supported in the language/framework, that would set the standard for developers to follow, just as async/await does for asynchronous code.

FrankSzendzielarz commented 5 years ago

What actually is the suggestion in this thread? If there is one, is it any different from what Rx.NET already offers?

gafter commented 5 years ago

@FrankSzendzielarz There isn't a specific suggestion, just a suggestion of what kind of thing should be suggested.

masonwheeler commented 5 years ago

> a suggestion of what kind of thing should be suggested.

Getting a little meta, are we? 😛

HaloFour commented 4 years ago

@ZiadUber

> That's exactly what Java/JVM is doing now with Project Loom. Seems they have it figured out.

Indeed they are, and it'll be an interesting experiment. I think the Java world may be a little more insulated from this given the vast majority of the ecosystem does go through the JRE, so theoretically the majority of places where a thread might block can be updated to properly support virtual threading. All third-party native code will have to be updated. Blocking the underlying OS thread will always be possible and could severely reduce the concurrency of the virtual threads, especially since by default they all share the same common fork/join pool. I'll also be really curious whether they'll have a mechanism to detect deadlocks like Go has; probably not. What will be particularly interesting is that Java mixes OS threads and virtual threads, and use of the latter must be deliberate.

I did a LabWeek project with the early access bits of Loom last month and it was a lot of fun. I particularly enjoyed the continuation primitives, which allow you to write generators/async without language modifications. And I was pretty impressed with how far virtual threads have been implemented so far. I even used a SynchronousQueue<T> to pass data between multiple virtual threads backed by a single-threaded executor, which was kind of mind-blowing (well, not if you're a Go programmer, I guess). Virtual threads seem to cost about 2.5 KB each in heap space vs. 1 MB of stack space by default for an OS thread, and I was able to kick off almost 200,000 of them before triggering a SIGSEGV and crashing the runtime.

Some Loom experiments
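
For comparison, the Go counterpart of that SynchronousQueue hand-off is an unbuffered channel. The sketch below is an illustration only, not a port of the Loom experiment; `relay` is a made-up name, and `runtime.GOMAXPROCS(1)` is used as a loose stand-in for a single-threaded executor.

```go
package main

import (
	"fmt"
	"runtime"
)

// relay passes n values from a producer goroutine to the caller over an
// unbuffered channel, where every send waits for a matching receive (the
// same hand-off behaviour as Java's SynchronousQueue).
func relay(n int) []int {
	ch := make(chan int) // unbuffered: no capacity at all
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- i * i
		}
	}()
	var got []int
	for v := range ch {
		got = append(got, v)
	}
	return got
}

func main() {
	// Restrict Go code to one OS thread at a time, loosely mirroring the
	// single-threaded executor in the experiment described above.
	runtime.GOMAXPROCS(1)
	fmt.Println(relay(5)) // prints [0 1 4 9 16]
}
```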

ZiadUber commented 4 years ago

@HaloFour

Thank you for the reply. Those experiments are quite interesting! I haven't experimented with a Loom JVM branch yet, so I was not aware that you could write generators without the overhead of spawning a separate fiber, as you would in Go.

Would you mind sharing the rest of the code in the sandbox package (e.g. sandbox.Generators.createGenerator and sandbox.Async.await)? Cheers.

HaloFour commented 4 years ago

@ZiadUber

Here are those two files. Being a LabWeek project they're a little sloppy, the goal being just to get them to work enough to demonstrate the concepts. Enjoy!

Generators.java Async.java

Thaina commented 4 years ago

What's the difference between Loom and Reactive.Linq, though?

HaloFour commented 4 years ago

@Thaina

> What's the difference between Loom and Reactive.Linq, though?

Loom is an implementation of delimited continuations in the Java runtime. That allows you to capture the execution context and stack of a thread into a variable (the functional interface Runnable specifically) and then to resume it at some arbitrary point in the future on any thread. From those building blocks you can construct coroutines and green threads (what Loom calls virtual threads). From the former you can add C#-like features like iterators or async/await without requiring special compiler support. The latter combined with a compatible ecosystem lets you write entirely blocking imperative code without actually blocking any underlying threads.
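
For contrast with the continuation-based generators described here, this is roughly how a generator tends to be written in Go, where it does cost a separate goroutine. It is a sketch with invented names (`fibGenerator`, `take`), not a translation of the Loom code.

```go
package main

import "fmt"

// fibGenerator returns a channel that yields Fibonacci numbers on demand.
// The producer goroutine parks at each send until the consumer asks for the
// next value; closing done lets it exit instead of leaking.
func fibGenerator(done <-chan struct{}) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		a, b := 0, 1
		for {
			select {
			case out <- a:
				a, b = b, a+b
			case <-done:
				return
			}
		}
	}()
	return out
}

// take pulls the first n values from a fresh generator and then stops it.
func take(n int) []int {
	done := make(chan struct{})
	defer close(done)
	gen := fibGenerator(done)
	vals := make([]int, 0, n)
	for i := 0; i < n; i++ {
		vals = append(vals, <-gen)
	}
	return vals
}

func main() {
	fmt.Println(take(8)) // prints [0 1 1 2 3 5 8 13]
}
```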

The argument for a model like Loom or Go is that the only real reason for async APIs is to avoid blocking threads, and the only reason to avoid blocking threads is that threads are expensive. If threads were very cheap then there would be much less reason to have async APIs or async-specific language features. Java put out another early access build in July and they've stabilized the implementation quite a bit. I was able to spin up 2 million virtual threads and "block" them all on a shared lock with a single backing thread, all within 2 GB of memory. 2 million real threads in either Java or C# would require about 2 TB of memory at the default 1 MB of stack each, even if they were all blocked and not doing anything.
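
The same experiment is easy to approximate in Go, so here is a hedged sketch of it at a much smaller scale, together with the back-of-envelope stack arithmetic. The names `parkAndRelease` and `stackBytesForThreads` are invented, the goroutine count is kept low so the example runs anywhere, and no per-goroutine memory figure is measured or claimed.

```go
package main

import (
	"fmt"
	"sync"
)

// parkAndRelease starts n goroutines that all contend for one shared mutex,
// then releases the mutex and reports how many ran to completion.
func parkAndRelease(n int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	finished := 0

	mu.Lock() // goroutines that start before the unlock below park here
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // blocks the goroutine, not a dedicated OS thread
			finished++
			mu.Unlock()
		}()
	}
	mu.Unlock() // let them through one at a time
	wg.Wait()
	return finished
}

// stackBytesForThreads is the back-of-envelope figure from the comment:
// thread count times a per-thread stack reservation.
func stackBytesForThreads(threads, stackBytes int64) int64 {
	return threads * stackBytes
}

func main() {
	fmt.Println(parkAndRelease(10000)) // prints 10000
	// 2,000,000 threads at 1 MB (taken as 10^6 bytes) each.
	fmt.Println(stackBytesForThreads(2000000, 1000000)) // prints 2000000000000
}
```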

The tricky bit is that "compatible ecosystem": anything under the hood that blocks needs to be modified to understand how to yield the virtual thread and queue up a completion that resumes that virtual thread on some other backing thread. Accidentally blocking real threads can starve all of the virtual threads from being scheduled. Java is taking a bit of a gamble that they can manage all of the points where blocking can occur from within the runtime, and that third-party native implementations of I/O libraries that could block can be expected to be updated.

333fred commented 4 years ago

There's been a lot of good conversations on this thread. However, as there's no real language proposal here, I'm going to close this out. Feel free to continue using it for discussion if you want to.