Doesn't work with Tasks that execute synchronously?

mbenoni7 commented 7 years ago

This library looks useful, but it seems like it doesn't properly handle Tasks that execute synchronously in the thread of the caller, which leads to stack overflows.

The easiest way to show you what I mean is to make a small change to one of your test cases (readFile function, starting on line 22 of Program.fs in BenchmarkFS):

let readFile path =
      task {
            let buffer = Array.zeroCreate bufferSize
            use file = File.OpenRead(path)
            let mutable reading = true
            let mutable n = 0
            while reading && n < 10000 do
                let! countRead = Task.FromResult(1) // file.ReadAsync(buffer, 0, buffer.Length)
                reading <- countRead > 0
                n <- n + 1
        }

All I did was comment out your truly asynchronous task (file.ReadAsync(...)) and replace it with a dummy task (Task.FromResult(1)) that will execute synchronously. Then I added the n counter to ensure that the while loop will iterate enough times to generate the stack overflow (10,000 is enough on my machine).

Unfortunately I don't know enough about CEs or the TPL to suggest whether this is a small bug or a potential flaw in your whole approach.

rspeele commented 7 years ago

Hi,

Thanks so much for reporting this issue!

My first observation is that this issue does not present itself if I build in release configuration, or in debug configuration with "generate tail calls" enabled under project properties -> build. So that's one workaround.

I will take a look and see if I can fix it without tail calls.

rspeele commented 7 years ago

This should be resolved by https://github.com/rspeele/TaskBuilder.fs/commit/b804b6f7316b7f569eb5deff4aba54ef7cefa0a7 .

Let me know if it works for you so I can close this.

mbenoni7 commented 7 years ago

Wow, great turnaround. That does indeed fix the issue.

One more question: is this intended to support the recursive CE style as well? If so, there's a similar problem there. For example, if I again rewrite the readFile function from above ...

  let readFile path = task {
        let buffer = Array.zeroCreate bufferSize
        use file = File.OpenRead(path)
        let rec loop n = task {
            if n < 10000 then
                let! countRead = file.ReadAsync(buffer, 0, buffer.Length)
                return! loop (n + 1)
            else
                return ()
        }
        return! loop 0
    }

... this also gives me a stack overflow (in release mode, BTW).

rspeele commented 7 years ago

Another good catch -- I'll take a look at this one tomorrow.

rspeele commented 7 years ago

After mulling this over for a couple days, I don't think it can be done. At least, not with TaskBuilder returning a Task<'a>. It could probably be made to work if TaskBuilder returned its internal Step monad and let you convert that to a task separately, but I think keeping it simple for translation from C# -> F# is too important to do that. So I have chosen to document the behavior in the README instead of change it. I am keeping the refactoring I did while working on this though, since it does at least improve the situation a little.

My reasoning for why it can't work:

If you have some async work followed by a tail call into another task, and you don't know anything about the tail position task other than that it is some Task<'a>, then what you have must be a Task<Task<'a>>.

This is now what the async state machines in TaskBuilder.fs use. The built-in way to convert a Task<Task<'a>> to a Task<'a> is task.Unwrap(), which again, is what TaskBuilder.fs uses now. However, using this at best grows memory usage proportional to the number of iterations (on .NET), and at worst still stack overflows (on Mono). This is because Unwrap, despite its name, does return a wrapper Task to represent its computation and so you end up with a wrapper around a wrapper around a wrapper ... etc. When the innermost wrapper completes, it all unwinds (and may or may not stack overflow in the process), but till then you are at least eating a bunch of heap memory for all those wrappers.

mbenoni7 commented 7 years ago

Interesting. The Task builder in FSharpx.Extras supports this usage, but according to some casual benchmarks of mine, that builder is significantly slower than this one (and generates a ton more garbage). That suggests to me that they have accepted some of the negative performance trade-offs you allude to in your last paragraph.

I can live with a more imperative coding style to get closer to C#-level performance, so I appreciate your contribution. I do hope that one day somebody figures out how to support the idiomatic style with good performance, though.

rspeele commented 7 years ago

Unfortunately, I think the FSharpx.Extras implementation suffers even worse. Because they always use Task.ContinueWith(...).Unwrap() for Bind instead of creating a state machine, there is no way to write a correct long-running loop using that builder.

With FSharpx, both of the below code samples will eventually fail with an OutOfMemoryException. You can watch the memory usage grow into the gigabytes over the course of a minute or two. That memory is being taken up by nested tasks that legitimately cannot be collected (wrappers around wrappers around wrappers).

recursive:

let tailRec () =
    task {
        do! Task.FromResult(())
        return! tailRec ()
    }

iterative:

let whileLoop() =
    task {
        while true do
            do! Task.FromResult(())
    }

With TaskBuilder.fs, the recursive version fails with a StackOverflowException (although if you change Task.FromResult to Task.Yield you'll get an OutOfMemoryException just like the one in FSharpx).

However, in the current TaskBuilder.fs, the iterative version has very stable memory usage. It does allocate as it runs, but the allocations can be collected so quickly that you basically don't see memory usage change after the first few seconds of runtime.

I should probably report the while loop behavior as a bug to FSharpx. It might be awkward for them to fix while keeping the same public API, since their builder lets you specify TaskContinuationOptions, which only make sense when using Task.ContinueWith for bind...

rspeele / TaskBuilder.fs

Doesn't work with Tasks that execute synchronously? #1