Closed TysonMN closed 6 years ago
Match
We can simply the problem slightly by defining Fold<>
in terms of Match<>
.
[Pure]
public Func<Unit, S> Fold<S>(Task<A> ma, S state, Func<S, A, S> f) => _ =>
default(MTask<A>).Match(ma, a => f(state, a), () => state);
This version of Fold
seems bug free to me while Match<>
has the same issues as the current version of Fold
. Maybe both of these methods behave as intended, and I should be using FoldAsync
instead.
FoldAsync
I can imagine a potential reply saying to use FoldAsync
instead of Fold
. It is true that the test passes when using FoldAsync
. However, I created the above test while trying to simplify this one.
[Fact]
public void SequenceOnTaskSeq_InAsyncContexAndWithSingletonSeq_Terminates() =>
AsyncContext.Run(() => SequenceOnTaskSeq_WithSingletonSeq_Terminates());
private static void SequenceOnTaskSeq_WithSingletonSeq_Terminates() {
var source = 0
.AsTask()
.Apply(SeqOne)
.Sequence();
source.Sequence();
// execution terminates by reaching here
}
This test eventually calls Fold
in a manner similar to the test that I originally gave. I don't see any method like SequenceAsync
that would eventually call FoldAsync
instead of Fold
. Maybe this is the real bug?...That the code in this test should eventually call FoldAsync
instead of Fold
?
On a related note, this test passes by throwing a TaskCanceledException
. Is that the expected behavior?
[Fact]
public async Task MTaskFoldAsync_CancelledTask_ThrowTaskCanceledException() {
var cts = new CancellationTokenSource();
cts.Cancel();
var intTask = Task.Run(() => 0, cts.Token);
var f = default(MTask<int>)
.FoldAsync(intTask, 0, (x, y) => 0);
await Assert.ThrowsAsync<TaskCanceledException>(() => f(Unit.Default));
}
I don't think I would call it a bug so much as just bad practice to call that overload of Fold. You're asynchronously getting values in a synchronous method, so a deadlock seems pretty natural to occur there, unless I'm missing something.
The fact that calling Sequence twice on a Seq<Task<int>>
deadlocks does seem like a bug however.
I went ahead and made a repo that demonstrates this issue: https://github.com/faeriedust/languageExtTests
You're both right for different reasons. Yes, the quick fix would be to asyncify the code. But what I've actually started doing is looking into breaking Monad
apart into Monad
and MonadAsync
. The attempt to unify the two has lead to a slightly messy interface and compromises like this implementation of Fold
for MTask
; it should be FoldAsync
that returns a Task
.
I have a busy few days coming up, but hopefully I should have something by the end of the week.
Yeah, I tried to fix the bug in the repo and immediately ran into many issues...
I think breaking Monad into Monad and MonadAsync does seem likely the much better approach.
I'm looking forward to this. :)
@faeriedust @bender2k14 Just a heads up on the progress of this. The work is being done on the async-refactor
branch, it's becoming quite a large refactor that will almost certainly lead to a significant improvement for the async story.
Here's the first commit to give you an idea of the scope of the changes: https://github.com/louthy/language-ext/commit/96e83efbbd398575b5b5cab1f763c7b6b1edbe88
The unfortunate knock-on effect is with the transformer types (nested monads). Synchronous and asynchronous monads are not directly compatible with each other now, and so I've split the HKT t4 templates to generate the extension methods for async-async pairs and sync-sync pairs.
The main issue with that is something like Lst<TryAsync<A>>
won't get the transformer extension methods. This will definitely cause problems for users of the library, so I'm just looking into ways to still pair up sync and async types where it makes sense. It should be a solvable problem, just needs a bit more thinking time.
@bender2k14 @faeriedust The latest release deals with the issue above in that default(MTask<int>).Fold
does not exist any more and won't exist for any asynchronous type. Calling default(MTask<int>).FoldAsync
exhibits the behaviour you'd expect.
I see that you have removed all the calls to properties of Task<>
, such as Result
and IsFaulted
. That is an extremely good improvement just by itself.
I have two questions about he new design.
On the interface FunctorAsync<FA, FB, A, B>
, there are two methods with signatures
FB MapAsync(FA ma, Func<A, B> f)
andFB MapAsync(FA ma, Func<A, Task<B>> f)
.Can you explain the reasoning for calling them both MapAsync
? In your alpha release notes, you say
To reduce the problems of method resolution a large number of methods have been renamed to have
Async
suffixes...
Is this reasoning used on these method names as well? Instead of calling them both MapAsync
, I was wondering if there would be any problems calling them both Map
or calling them Map
and MapAsync
respectively. Do you know if either of these other choices allows for method resolution problems?
If there is no technical limitation to calling the first method above just Map
, then the possibility exists to having the interface FunctorAsync<FA, FB, A, B>
extend the interface Functor<FA, FB, A, B>
.
Would this inheritance be good or bad?
@bender2k14 Actually that's an oversight. I started by making all async interfaces have ...Async
methods. But then after hitting some issues where the C# compiler just can't work out the difference between an A
and a Task<A>
I started using ...Async
just for higher-order functions that have a asynchronous Funcs
or return a Task<A>
.
I've been so knee deep in the transformer system that I will probably need to review all the naming before pushing it to beta status. Initially I wanted to avoid issues with a type implementing both the sync and async interfaces for say Monad or Functor.... the requirements slightly changed as I was going along.
I think the final policy will be:
Task
then append Async
. That allows for say a synchronous int Sum()
and an asynchronous Task<int> SumAsync()
for example.Task
then Async
is appended.Possible policy:
Sync
or Async
). i.e. value.Map(Sync: x => x + 1);
. This would help for methods like BiMap
where there are four possible variants. Something like: value.BiMap(LeftSync: x => x + 1, RightAsync: y => y + 2);
This will be a be consistency issue throughout the code-base I expect (from pre-existing async functionality). So, it will need some time to iron all of that out. It's a good time to get this stuff done and out of the way (because of the other breaking changes).
On your second point - you could be right, I don't have the code in front of me at the moment (about to hit the hay after looking at a screen for too long). If they unify then that would certainly be a benefit.
If the method returns a
Task
then appendAsync
. That allows for say a synchronousint Sum()
and an asynchronousTask<into> SumAsync()
for example.
The example is good. Specifically, if there are two methods that do the same thing except one returns A
and the other returns Task<A>
, then appending Async
is good.
However, I don't like the idea that the name of any method that returns a Task
should end in Async
. For example, I don't think the extension method AsTask
that lifts an A
to an Task<A>
should be called AsTaskAsync
.
I like the idea of appending Async
only if there is also a synchronous variant. This idea may well have exceptions though.
This would help for methods like
BiMap
where there are four possible variants.
I would typically think of handing that by lifting one delegate to return a Task<>
to match the other delegate and then calling the overload that takes two delegates that returns Task<>
. Or await
when able before calling a synchronous function. Do we have to be worried about the performance tradeoff from a additional method call like that?
If they unify then that would certainly be a benefit.
The only difference is the name of the method (Map
vs MapAsync
). My current understanding is that they should be unified.
I would typically think of handing that by lifting one delegate to return a Task<> to match the other delegate and then calling the overload that takes two delegates that returns Task<>.
There's an additional allocation cost which would be saved by using named arguments. Where possible I try to reduce unnecessary allocations - not that this library is particularly light on the allocations front, but I try to minimise them where I can.
I think also there's the real-world use-case to consider. Often with choice types where they're bi-mappable, or bi-foldable, the 'alternative' value is used for the error case (not always, obviously, but mostly). I suspect that the most common usage is for choice 1 to be asynchronous where choice 2 would be synchronous (perhaps just giving context to an error value for example). Choice 1 is the 'happy path' where all the heavy lifting is happening.
There is also already precedent for using named arguments for Match
and the like, so I'm comfortable with using them to aid the selection of the correct overload.
However, having said all of that, I wanted to clear some ideas through my head first, so I'm not just arguing from a preference point-of-view, but from a 'good for the library' point-of-view.
Think about this situation:
static async Task<int> AddValue(int x) =>
x + await GetValue();
static Error AddContextToError(Error err) =>
err.AddSomeContext();
ma = ma.BiMapAsync(RightAsync: AddValue, Left: AddContextToError);
That has a certain amount of elegance to it because it's not cluttered with lambda grammar. It also has no additional allocation costs over what the AddValue
and AddContextToError
functions use in their implementations.
If BiMapAsync
only took Task<R>
and Task<L>
then the Left
side would have to be lifted:
ma = ma.BiMapAsync(Right: AddValue, Left: err => AddContextToError(err).AsTask());
That instantly becomes much less elegant. There is also an allocation for the lambda and for the Task
. Finally, there's the cost of constructing a Task
- probably negligible, but but if the Left
type is a value-type then we're essentially boxing something that doesn't need boxing.
I took the thought experiment a bit further to see if I could provide some improvements on the elegance front, by providing some common lifting functions:
static Func<Task<A>> liftTask<A>(Func<A> f) =>
() => f().AsTask();
static Func<A, Task<B>> liftTask<A, B>(Func<A, B> f) =>
a => f(a).AsTask();
// ... etc ...
These would work like fun
or act
but instead they would lift the synchronous function so it could be used in the various async methods. However, it suffers from the exact same problems as fun
and act
and that is C#'s terrible handling of method-groups. The type system refuses to infer the correct method, even when there is only one possible candidate.
And so, what should be this:
ma = ma.BiMapAsync(AddValue, liftTask(AddContextToError));
Which has the elegance. Instead ends up like this:
ma = ma.BiMapAsync(AddValue, liftTask<Error, Error>(AddContextToError));
Which is ugly. It also suffers from the same allocation problem as providing a lambda.
Hopefully one day C# will treat methods in method-groups as first-class citizens and provide a decent inference story, but I'm not holding my breath. In the meantime I feel the named arguments approach is likely to make life easier for devs.
However, I don't like the idea that the name of any method that returns a Task should end in Async. For example, I don't think the extension method AsTask that lifts an A to an Task should be called AsTaskAsync.
I think I mostly agree here. Context is important. For example, Match
in EitherAsync
returns a Task<R>
but has no delegates that return a Task
. The type EitherAsync
by default makes all operations on it asynchronous. So, it makes sense for Match
to not have Async
appended.
However, that general rule falls apart slightly for the ToString
and GetHashCode
overrides on EitherAsync
. They can't have their names or signatures changed (because they're inherited from Object
) and they will force synchronous evaluation. I have provided ToStringAsync
and GetHashCodeAsync
, but that's immediately breaks the rules set above.
Ideally the rules wouldn't have too many caveats so they can be easily followed and maintained. I'm working through the interfaces this evening, so I'll try and come up with a concrete set that works for all types.
I like your change from last night (hash d3c2ccb). Specifically,
Map
-like method the accepts a Func<A, B>
is called Map
Map
-like method the accepts a Func<A, Task<B>>
is called MapAsync
.I think it is consistent with the monad design pattern to call the methods as you have because the method name varies with the argument types but not with the monad type.
Recall, for a monad M<T>
, that
Func<A, B>
is called Map
andFunc<A, M<B>>
is called Bind
.So this method MapAsync
is lies on the interval between Map
and Bind
. I suppose any combined monad would face a similar naming issue. Looking at TryOption<>
though, I don't see any method that accepts just one argument that is either Func<A, Option<B>>
or Func<A, Try<B>>
.
I like your change from last night
Yeah, it's starting to feel right. I still need to go through the other async types (TryOptionAsync
, TryAsync
) and update them as well as the async extensions for Option
and TryOption
. This is one of those 'by convention' things that will just need a bit of time and care to refine.
So this method MapAsync is lies on the interval between Map and Bind.
I know why you say that, but I see the Async variants more as artefacts of C#'s strict evaluation and less of a 'sliding scale of Map and Bind'.
If values were lazy like Haskell then I wouldn't need to litter the code with Task<...>
and the implementations of Map
and Bind
could deal with the asynchronous behaviour transparently. Unfortunately C# doesn't work that way and so we end up with async polluting the function signatures too. So, although MapAsync
and BindAsync
isn't a purist definition of a functor map
and monad bind
operation, the end result is still a functor map
operation and monad bind
operation, just with asynchronous behaviour.
Looking at
TryOption<>
though, I don't see any method that accepts just one argument that is eitherFunc<A, Option<B>>
orFunc<A, Try<B>>
.
I wouldn't expect that though. They are not the definition of Map
or Bind
. TryOption
has:
TryOption<B> Map<A, B>(this TryOption<A> self, Func<A, B> f)
TryOption<B> Bind<A, B>(this TryOption<A> self, Func<A, TryOption<B>> f)
I would expect the dev to lift any Option
or Try
into a TryOption
to use with the standard Bind
operation.
var x = TryOption(Some(123));
var y = Try(123).ToTryOption();
I know why you say that, but I see the Async variants more as artefacts of C#'s strict evaluation and less of a 'sliding scale of Map and Bind'.
If values were lazy like Haskell then I wouldn't need to litter the code with Task<...> and the implementations of Map and Bind could deal with the asynchronous behaviour transparently. Unfortunately C# doesn't work that way and so we end up with async polluting the function signatures too. So, although MapAsync and BindAsync isn't a purist definition of a functor map and monad bind operation, the end result is still a functor map operation and monad bind operation, just with asynchronous behaviour.
This makes a lot of sense to me. In particular, synchronous computation is a special case of asynchronous computation. This is clear from C#. Asynchronous computation can easily call synchronous computation but there is no completely correct way for synchronous computation to call asynchronous computation.
However, that general rule falls apart slightly for the ToString and GetHashCode overrides on EitherAsync. They can't have their names or signatures changed (because they're inherited from Object) and they will force synchronous evaluation. I have provided ToStringAsync and GetHashCodeAsync, but that's immediately breaks the rules set above.
I think this exception to the rule is acceptable. Another method with this problem is IEnumerable<T>.GetEnumerator
, which, in your design, is on every monad.
I see that you have removed all the calls to properties of
Task<>
, such asResult
andIsFaulted
. That is an extremely good improvement just by itself.
I said that 4 days ago, presumably in reference to state of your branch at commit 39b7400. Somehow I made a mistake. There are still many calls to Result
and IsFaulted
in the current commit.
Do you intend to remove them?
I don't see how your asynchronous monad types can correctly implement synchronous methods like ToString
, Equals
, GetHashCode
, and GetEnumerator
.
However, that general rule falls apart slightly for the ToString and GetHashCode overrides on EitherAsync. They can't have their names or signatures changed (because they're inherited from Object) and they will force synchronous evaluation. I have provided ToStringAsync and GetHashCodeAsync, but that's immediately breaks the rules set above.
Why do you have to evaluate for these methods? Just use Object.ToString() and Object.GetHashcode(), no?
I said that 4 days ago, presumably in reference to state of your branch at commit 39b7400. Somehow I made a mistake. There are still many calls to Result and IsFaulted in the current commit.
Do you intend to remove them?
I just reviewed the current version (which is commit 966d9bfecaeb84668338564c3a841ea0cd611ace) of your async-refactor
branch. I think you (@louthy) have fixed (almost) all the possible deadlocks that could happen by calling Task.Result
. The only methods in production code (I didn't check tests or sample code) that could deadlock are EitherAsync.ToString
and EitherAsync.GetHashCode
because they both call Task.Result
on the Task
returned from their async
counterparts. I think you did that intentionally, and I think it is reasonable to do so.
Good work!
Oops. Didn't mean to do that.
I disagree that it is reasonable.
Task
Suppose my task is going to take 10 minutes, why should I wait for 10 minutes, or even deadlock my program, just to see the text in the debugger or similar?
If it's asynchronous, it's specifically because I chose to delay evaluation, so don't evaluate it for me, unless I need the value, which .ToString() semantically does not require in other CLR types, such as Task
Previously in this issue we were discussing the naming of methods like Map
and Bind
depending on whether or not their argument types include Task<>
.
@louthy, I really like the idea you shared in your v3.0 release notes. There, when talking about this naming, you gave this example.
Match(SomeAsync, None)
Match(Some, NoneAsync)
Match(SomeAsync, NoneAsync)
said that sometimes the method would keep its normal name (in this case, Match
) and one can explicitly disambiguate among the overloads by explicitly naming the arguments.
I just wanted to say that I think this is a great idea. This approach should allow the name to be simple (i.e. just Match
instead of something like MatchAsync
) in most cases without any problems (i.e. the compiler will select the indented overload). Then in the rare cases where things are ambiguous to the compiler, we have a simple way to clarify the ambiguity (by explicitly naming one or more arguments). Furthermore, we don't have to come with an exponential number (relative to the number of parameters) of different method names in order to avoid ambiguous situations for the compiler.
P.S. Since the original purpose of this issue was a bug that is now fixed, I think this issue can now be close.
Great work @louthy! :)
Example Code
Here is a test that exhibits a deadlock.
The type
AsyncContext
is from the NuGet package Nito.AsyncEx.Expected Behavior
As stated in the test, the expected behavior is that the test terminates.
Actual Behavior
The test does not terminate. Here is the source code under test.
The test does not terminate because of a deadlock that is caused when calling
ma.Result
.Possible Fix
Stephen Cleary, an expert on this topic and author of the aforementioned NuGet package Nito.AsyncEx, explains why this deadlock occurs. To summarize, my understanding is that it is a best practice to not call
Task.Result
unlessTask.Status == TaskStatus.RanToCompletion
. This is not true in the example that I have provided. Instead it is in the stateTaskStatus.WaitingForActivation
.Otherwise, the best practice is to
await
the task after callingTask.ConfigureAwait(false)
. I don't see how you can do this without changing the return type ofFold
. If I may adjust the return type slightly fromFunc<Unit, S>
toFunc<Unit, Task<S>>
, then I would implement Stephen's best practice suggestion like this.This new return type seems more natural to me. My understanding of monads is that you are not allowed to just extract the underlying value of the monad. Unfortunately,
Fold
is doing just that when it callsTask.Result
. However, I have only recently started using monads and your library, so I know that I still have a great deal to learn and that it is very possible that I am misunderstanding something here.Thank you for considering my bug report. I am interested to hear your thoughts about it.