dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Proposal: Support Async Pattern Methods (Cω Chords) #1165

Closed HaloFour closed 7 years ago

HaloFour commented 9 years ago

Background

In my opinion one of the more interesting ideas in the Polyphonic C# and Cω languages out of Microsoft Research was "asynchronous method chords". Combined with asynchronous methods this simple syntax provided one of the most elegant ways to express complex synchronization between two or more asynchronous operations.

Proposal

An asynchronous pattern consists of two parts.

Mailbox Methods

The first part is an asynchronous method which has no body and must return void. This method serves as a notification primitive. Invoking the method is the same as posting a message where the arguments are additional pieces of data transmitted with the message.

public async void Add(T value);
public async void Done();

When invoked, a mailbox method effectively returns immediately to the caller and can never throw exceptions.

Pattern Methods

The second part is an asynchronous pattern method which may return void, Task or Task<T> and has a normal method body. This method is decorated with one or more & join operators with the name and signature of a mailbox method between the parameter list and the method body. The parameter names in the joined mailbox signature may be renamed. The values of those parameters are available in the body of the pattern method.

public async Task<T> Take()
    & Add(T value)
{
    return value;
}

After the pattern method body more join operations may appear using different permutations of the available mailbox methods. These permutations must be unique per each method body regardless of order. When the pattern method is invoked which body is executed depends on which mailbox methods are invoked.

public async Task<T> Take()
    & Add(T value)
{
    return value;
}
    & Done()
{
   throw new TaskCanceledException("No more items!");
}

What happens when the pattern method is invoked depends on whether there have already been corresponding calls to the mailbox methods. For a pattern method body to be executed there must be corresponding calls to every mailbox method on which it joins.

If there are no corresponding calls to the mailbox methods for any of the pattern method bodies then the method returns immediately. If the method returns Task or Task<T> the returned task is incomplete. Once a mailbox method is called which satisfies one of the bodies, that body will be executed on the same thread that invoked the mailbox method.

If there already are corresponding calls to the mailbox methods for a pattern method body then that body is executed immediately on the current thread.

Once the pattern method body is executing it behaves like a normal asynchronous method. It may use the await keyword to await other asynchronous operations, including other pattern methods. It may return a value which sets the result on the returned Task or Task<T> instance, or it may throw an exception which sets the exception on the returned Task or Task<T> instance.

public class PatternQueue<T>
{
    public async void Add(T value);
    public async void Done();

    public async Task<T> Get()
        & Add(T value)
    {
        return value;
    }
        & Done()
    {
        throw new TaskCanceledException();
    }
}

static class Program
{
    static void Main()
    {
        var queue = new PatternQueue<int>();

        var task1 = queue.Get();
        Debug.Assert(task1.IsCompleted == false);

        queue.Add(5);
        Debug.Assert(task1.IsCompleted == true);
        Debug.Assert(task1.Result == 5);

        queue.Add(6);
        queue.Add(7);
        var task2 = queue.Get();
        Debug.Assert(task2.IsCompleted == true);
        Debug.Assert(task2.Result == 6);

        queue.Done();
        var task3 = queue.Get();
        Debug.Assert(task3.IsCompleted == true);
        Debug.Assert(task3.IsCanceled == false);
        Debug.Assert(task3.Result == 7);

        var task4 = queue.Get();
        Debug.Assert(task4.IsCanceled == true);
    }
}
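
For comparison, the queue above can be approximated in today's C# with TaskCompletionSource<T>. This is only a sketch of the semantics described in this proposal, not how a compiler would lower the proposed syntax. The class name ManualPatternQueue<T> and its fields are made up for illustration, and it assumes that pending Add messages are matched before pending Done messages, as the asserts above imply.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical hand-written stand-in for the proposed PatternQueue<T>.
public class ManualPatternQueue<T>
{
    private readonly object _gate = new object();
    private readonly Queue<T> _added = new Queue<T>();   // Add messages nobody has consumed yet
    private int _done;                                    // Done messages nobody has consumed yet
    private readonly Queue<TaskCompletionSource<T>> _waiters =
        new Queue<TaskCompletionSource<T>>();             // Get calls still waiting for a message

    // Mailbox method: never blocks; completes the oldest waiting Get, if any.
    public void Add(T value)
    {
        TaskCompletionSource<T> waiter;
        lock (_gate)
        {
            if (_waiters.Count == 0) { _added.Enqueue(value); return; }
            waiter = _waiters.Dequeue();
        }
        waiter.SetResult(value);   // runs on the thread that called Add
    }

    // Mailbox method: cancels the oldest waiting Get, if any.
    public void Done()
    {
        TaskCompletionSource<T> waiter;
        lock (_gate)
        {
            if (_waiters.Count == 0) { _done++; return; }
            waiter = _waiters.Dequeue();
        }
        waiter.SetCanceled();
    }

    // Pattern method: completes at once if a message is pending,
    // otherwise returns an incomplete task.
    public Task<T> Get()
    {
        var waiter = new TaskCompletionSource<T>();
        lock (_gate)
        {
            if (_added.Count > 0) waiter.SetResult(_added.Dequeue());
            else if (_done > 0) { _done--; waiter.SetCanceled(); }
            else _waiters.Enqueue(waiter);
        }
        return waiter.Task;
    }
}
```

Replacing PatternQueue<int> with ManualPatternQueue<int> in the Main method above should satisfy the same asserts. The bookkeeping for two mailboxes is already noticeable here, which is the boilerplate the proposed syntax is meant to hide.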

HaloFour commented 9 years ago

Updated syntax based on the Cω syntax.

Here are the relevant docs:

Cω Overview Cω Concurrency Examples Tutorial Cω Concurrency Specification

Note that I am only using Cω as a reference. The syntax that I proposed is different in that it builds on async/await from C# 5.0. From a functionality point of view the biggest difference is that the pattern method is not synchronous, as it is in Cω. If the pattern method has no corresponding signal method invocations it will not block but rather return an incomplete Task.

svick commented 9 years ago

This certainly looks like an interesting feature, but considering existing libraries (TPL Dataflow, Rx, AsyncEx) and that C# 7.0 is likely going to have pattern matching, I don't see a compelling benefit from this.

Do you have any use cases that show what makes this feature useful? I don't think your queue example is that (it's basically BufferBlock from TPL Dataflow).

HaloFour commented 9 years ago

I know that TPL Dataflow and Rx both include functionality around asynchronous patterns, but they can be quite cumbersome to consume, especially when the same asynchronous result could be used in more than one flow. I don't see join primitives in AsyncEx, although I'm not familiar with that library.

I don't believe that the pattern matching proposals for C# 7.0 touch on this form of functionality at all.

The example I did provide is overly simplistic. It doesn't touch on the more complicated situations that this feature would solve very nicely, where the same "signal" could be used by multiple pattern methods but only triggering one of them based on the confluence of other "signal" methods. I'll see if I can work up something that isn't purely esoteric, such as Santa's Workshop.

svick commented 9 years ago

I don't believe that the pattern matching proposals for C# 7.0 touch on this form of functionality at all.

What I meant is that you can have a library type similar to F# MailboxProcessor<T> (though not exactly, MailboxProcessor<T> runs continuously, not one message at a time) and then use a discriminated union as T and use switch to implement parts of your pattern methods.

HaloFour commented 9 years ago

That seems like a very different model in that you're waiting for specific messages to be posted to the mailbox and then processing them immediately. In the case of Cω it doesn't respond at all until all of the messages (signals, whatever) are received.

public async void Message1(int x);
public async void Message2(int y);
public async void Message3(int z);

public async Task<int> Wait1()
    & Message1(int x)
{
    return x;
}

public async Task<int> Wait2()
    & Message1(int x)
    & Message2(int y)
{
    return x + y;
}
    & Message3(int z)
{
    return z;
}

static void Main()
{
    Task<int> task1 = Wait1();
    Task<int> task2 = Wait2();
    Task<int> task3 = Wait2(); // second call to Wait2

    // all three joined tasks still waiting
    Debug.Assert(task1.IsCompleted == false);
    Debug.Assert(task2.IsCompleted == false);
    Debug.Assert(task3.IsCompleted == false);

    // call to Message1 triggers join pattern from Wait1
    Message1(123);
    Debug.Assert(task1.IsCompleted == true);
    Debug.Assert(task1.Result == 123);
    Debug.Assert(task2.IsCompleted == false);
    Debug.Assert(task3.IsCompleted == false);

    // call to Message2 does not trigger join pattern from Wait2 as the previous call to Message1 was already handled
    Message2(456);
    Debug.Assert(task2.IsCompleted == false);
    Debug.Assert(task3.IsCompleted == false);

    // another call to Message1 triggers join pattern from first call to Wait2
    Message1(789);
    Debug.Assert(task2.IsCompleted == true);
    Debug.Assert(task2.Result == 1245);
    Debug.Assert(task3.IsCompleted == false);

    // another call to Message2 does not trigger join pattern from second call to Wait2, waiting on a corresponding call to Message1
    Message2(999);
    Debug.Assert(task3.IsCompleted == false);

    // call to Message3 triggers join pattern from second call to Wait2
    Message3(12345);
    Debug.Assert(task3.IsCompleted == true);
    Debug.Assert(task3.Result == 12345);
}
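
To make the matching rules in that walkthrough concrete, here is a rough emulation in current C#. It is a sketch under assumptions the thread does not spell out: pending calls are served oldest first, and the bodies of Wait2 are tried in declaration order. JoinEmulation, Register, Post and the field names are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical emulation of Message1..3, Wait1 and Wait2 from the example above.
public class JoinEmulation
{
    private readonly object _gate = new object();
    private readonly Queue<int> _m1 = new Queue<int>();
    private readonly Queue<int> _m2 = new Queue<int>();
    private readonly Queue<int> _m3 = new Queue<int>();

    // Pending pattern calls: a matcher that consumes messages on success, plus the task to complete.
    private readonly List<(Func<int?> TryMatch, TaskCompletionSource<int> Source)> _pending =
        new List<(Func<int?> TryMatch, TaskCompletionSource<int> Source)>();

    public void Message1(int x) => Post(_m1, x);
    public void Message2(int y) => Post(_m2, y);
    public void Message3(int z) => Post(_m3, z);

    // Wait1 joins on Message1 only.
    public Task<int> Wait1() =>
        Register(() => _m1.Count > 0 ? _m1.Dequeue() : (int?)null);

    // Wait2 has two bodies: (Message1 & Message2) or Message3.
    public Task<int> Wait2() => Register(() =>
    {
        if (_m1.Count > 0 && _m2.Count > 0) return _m1.Dequeue() + _m2.Dequeue();
        if (_m3.Count > 0) return _m3.Dequeue();
        return null;
    });

    private Task<int> Register(Func<int?> tryMatch)
    {
        var source = new TaskCompletionSource<int>();
        lock (_gate)
        {
            int? result = tryMatch();
            if (result.HasValue) source.SetResult(result.Value);  // messages were already waiting
            else _pending.Add((tryMatch, source));                // return an incomplete task
        }
        return source.Task;
    }

    private void Post(Queue<int> mailbox, int value)
    {
        Action complete = () => { };
        lock (_gate)
        {
            mailbox.Enqueue(value);
            // One new message can satisfy at most one pending call.
            for (int i = 0; i < _pending.Count; i++)
            {
                int? result = _pending[i].TryMatch();
                if (!result.HasValue) continue;
                var source = _pending[i].Source;
                _pending.RemoveAt(i);
                complete = () => source.SetResult(result.Value);
                break;
            }
        }
        complete();   // complete outside the lock, on the posting thread
    }
}
```

Running the same sequence of calls as in Main against a JoinEmulation instance gives the results the asserts expect, including 789 + 456 = 1245 for the first call to Wait2.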

HaloFour commented 9 years ago

The TPL Dataflow non-greedy JoinBlock is probably what best corresponds to this proposal.

How to: Use JoinBlock to Read Data From Multiple Sources

Here's how that sample could be implemented:

public async void FileResourceAvailable(FileResource resource);
public async void NetworkResourceAvailable(NetworkResource resource);

public async void MemoryResourceAvailable(MemoryResource memory)
    & FileResourceAvailable(FileResource file)
{
    // Perform some action on the resources. 

    // Print a message.
    Console.WriteLine("File worker: using resources...");

    // Simulate a lengthy operation that uses the resources.
    Thread.Sleep(new Random().Next(500, 2000));

    // Print a message.
    Console.WriteLine("File worker: finished using resources...");

    // Release the resources back to their respective pools.
    FileResourceAvailable(file);
    MemoryResourceAvailable(memory);
}
    & NetworkResourceAvailable(NetworkResource network)
{
    // Perform some action on the resources. 

    // Print a message.
    Console.WriteLine("Network worker: using resources...");

    // Simulate a lengthy operation that uses the resources.
    Thread.Sleep(new Random().Next(500, 2000));

    // Print a message.
    Console.WriteLine("Network worker: finished using resources...");

    // Release the resources back to their respective pools.
    NetworkResourceAvailable(network);
    MemoryResourceAvailable(memory);
}

static void Main()
{
    NetworkResourceAvailable(new NetworkResource());
    NetworkResourceAvailable(new NetworkResource());
    NetworkResourceAvailable(new NetworkResource());

    MemoryResourceAvailable(new MemoryResource());

    FileResourceAvailable(new FileResource());
    FileResourceAvailable(new FileResource());
    FileResourceAvailable(new FileResource());

    Thread.Sleep(10000);
}

In my opinion the language semantics eliminate a lot of complicated boilerplate and make it much easier to follow.

paulomorgado commented 9 years ago

I'm just not too keen on the syntax. I don't like those "multibodied methods".

Instead of declaring "unbodied methods" and "multibodied methods", have you considered declaring just the trigger conditions for each body?

public whatever MemoryResourceAvailable(MemoryResource memory)
    & FileResourceAvailable(FileResource file)
{
    // Perform some action on the resources. 

    // Print a message.
    Console.WriteLine("File worker: using resources...");

    // Simulate a lengthy operation that uses the resources.
    Thread.Sleep(new Random().Next(500, 2000));

    // Print a message.
    Console.WriteLine("File worker: finished using resources...");

    // Release the resources back to their respective pools.
    FileResourceAvailable(file);
    MemoryResourceAvailable(memory);
}

public whatever MemoryResourceAvailable(MemoryResource memory)
    & NetworkResourceAvailable(NetworkResource network)
{
    // Perform some action on the resources. 

    // Print a message.
    Console.WriteLine("Network worker: using resources...");

    // Simulate a lengthy operation that uses the resources.
    Thread.Sleep(new Random().Next(500, 2000));

    // Print a message.
    Console.WriteLine("Network worker: finished using resources...");

    // Release the resources back to their respective pools.
    NetworkResourceAvailable(network);
    MemoryResourceAvailable(memory);
}

HaloFour commented 9 years ago

@paulomorgado

That's possible. For the most part I just took the Cω syntax and adapted it to be asynchronous (in Cω they were blocking calls). While I agree that the two bodies for one method may seem a little confusing at first, I think that it does better convey that they are one method. It is also a little more DRY in that you don't need to repeat that signature. Per that specification you did still have to repeat the other async method signatures. Six of one, half dozen of the other.

svick commented 9 years ago

Yeah, I was focusing on the case where you're waiting only on one message.

You're right that Dataflow is more verbose but it's also mostly more general (even though the arity of its JoinBlocks is only 2 and 3, not more). If you wanted, you could write a simple wrapper for this specific joining functionality, bringing the verbosity on par with your proposal (violating the "don't make it a language feature if you can make it a library" principle).
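
As a rough illustration of such a wrapper, the sketch below hides a non-greedy JoinBlock<T1, T2> behind a single call. JoinWrapper and When are hypothetical names, not part of TPL Dataflow, and the code assumes the System.Threading.Tasks.Dataflow library is referenced.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper; only the block types and LinkTo come from TPL Dataflow.
public static class JoinWrapper
{
    // Runs body once for every pair of items that can be taken from both sources together.
    public static void When<T1, T2>(
        ISourceBlock<T1> first, ISourceBlock<T2> second, Action<T1, T2> body)
    {
        // Non-greedy: the join takes an item only when every target has one on offer,
        // so a source shared with another join is not drained prematurely.
        var join = new JoinBlock<T1, T2>(
            new GroupingDataflowBlockOptions { Greedy = false });
        var worker = new ActionBlock<Tuple<T1, T2>>(
            pair => body(pair.Item1, pair.Item2));

        first.LinkTo(join.Target1);
        second.LinkTo(join.Target2);
        join.LinkTo(worker);
    }
}
```

With BufferBlock<T> pools for the file, network and memory resources, the sample would then reduce to two calls such as JoinWrapper.When(files, memory, (file, mem) => { ... }), with each body posting its resources back to the pools when it is done.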

Also, I believe that within Dataflow, JoinBlocks are not used that often (9 results for "joinblock" on SO, compared with 192 for Dataflow), possibly indicating that joining/waiting on multiple messages is not useful that often.

paulomorgado commented 9 years ago

@HaloFour, take a step back and look at it again.

They are really two methods.

And it's irrelevant if they are:

public whatever MemoryResourceAvailable(MemoryResource memory)
    & FileResourceAvailable(FileResource file)
{
    ...
}

public whatever MemoryResourceAvailable(MemoryResource memory)
    & NetworkResourceAvailable(NetworkResource network)
{
    ...
}

or:

public whatever FileResourceAvailable(FileResource file)
    & MemoryResourceAvailable(MemoryResource memory)
{
    ...
}

public whatever NetworkResourceAvailable(NetworkResource network)
    & MemoryResourceAvailable(MemoryResource memory)
{
    ...
}

Unless you are also proposing |, which would be cool.

HaloFour commented 9 years ago

@svick

It's quite possible that it's not a common problem. I think it's useful but even I have trouble finding use cases that aren't esoteric. Part of me thinks that if we had a tool such as this readily available and understood that we'd find it useful in refactoring existing code or in solving new kinds of problems, like we do with LINQ and async/await now.

I'd love to pick Erik Meijer's brain on this subject. He was involved in both Polyphonic C# and Cω, and also with Rx, so he'd be significantly more of an expert on this subject matter.

@paulomorgado

Well technically it's three methods. To the outside world the class looks like this:

void FileResourceAvailable(FileResource resource);
void MemoryResourceAvailable(MemoryResource resource);
void NetworkResourceAvailable(NetworkResource resource);

The fact that two of them are effectively dumb mailboxes (probably a better term than signal methods, I should poke at F# and update the terminology of this proposal) is a hidden implementation detail.

You are correct that in this case, because MemoryResourceAvailable returns void, the implementations are interchangeable. If it had instead returned Task or Task<T> then the semantics would be quite different.

paulomorgado commented 9 years ago

In fact, they are 5 methods. The two that wait for the signaling and the 3 signaling.

If they returned something, I don't think these constructs would be the best ones. It would need to be more like a semaphore with additional context.

Or a messaging pattern would be better for everything. You send messages to some code that is waiting for messages.

HaloFour commented 9 years ago

@paulomorgado

No, it's 3 methods. Two mailbox methods and one synchronizing method. The fact that internally it switches between two different permutations of which mailbox methods are called is an implementation detail, one that is completely hidden to the outside world.

The entire point of this is a messaging pattern, but one that can join patterns between those messages and respond accordingly. A return value is necessary for it to work as a synchronizing construct. In Cω it would block, but I don't think that fits in as well with asynchrony which is why I proposed that it fit in the guidelines with async methods.

As for a | operator, I'm curious as to what behavior you're thinking would apply there.

paulomorgado commented 9 years ago

Why did you use &?

HaloFour commented 9 years ago

That syntax is straight out of Polyphonic C# and Cω, the specs of which are in one of the first few comments on this thread:

Cω Overview Cω Concurrency Examples Tutorial Cω Concurrency Specification

I made the two following changes to that spec:

  1. I made the mailbox methods require the void keyword so that they are syntactically identical to C# 5.0 async methods except without bodies. In Cω (which predated .NET 2.0) the signature would have been simply public async Foo(int bar);.
  2. I made the synchronizing method async rather than a synchronous blocking method. It's my opinion that this fits better with C# 5.0.

Everything else syntactically is straight out of Cω.

If you're not familiar with Cω it is a very interesting read, especially considering that it was designed before C# 2.0.

paulomorgado commented 9 years ago

I vaguely remember Cω from those days. Now, those links take longer to open than my web browser cares to wait.

gafter commented 7 years ago

We are now taking language feature discussion on https://github.com/dotnet/csharplang for C# specific issues, https://github.com/dotnet/vblang for VB-specific features, and https://github.com/dotnet/csharplang for features that affect both languages.

See also https://github.com/dotnet/csharplang/issues/88