Closed djspiewak closed 9 years ago
BTW, what is the reasoning behind using `Trampoline` in `Await` and `Append`? All of the interpreters are already stackless, so this seems to just be adding overhead.
@pchlupacek I made the changes to `stepAsync` on my branch (djspiewak:bug/await-cleanup). However, it looks like `wye` is somehow broken by this? I'm not quite sure what's going on, especially since the test in question doesn't really rely on finalizers to my knowledge. Still working on things.
**Edit:** It's because of my change to `repeatEval`, which is in turn because of the change to `await`. Dear god, why on earth would we create a situation where those two versions of `repeatEval` are different?!
@djspiewak I quickly looked at your branch. What exactly is broken in `wye`?
On the `Trampoline` question: I think the main reason for it was to get stack safety on nested `rcv`s in `Await`, and `Append` essentially just collects these. I think without them we might have trouble in `pipe`, `tee`, and `wye`, though perhaps not in `Process[Task, O]`. However, the `cln` can imho do without the trampoline, since it is guarded by the trampolined `rcv` and we don't otherwise use it in any combinators.
@pchlupacek All tests are passing, including some that hit the cleanup logic, which is implemented as you specified. Probably some other things could be altered as well, but I didn't do a thorough audit.
The biggest problem is what I referenced previously: `Terminated(End)` is a major issue, specifically due to the way it interacts with `repeatEval`, and the way `repeatEval` interacts with this change (by creating an infinite `Await` body). I think that `Terminated(End)` is a terrible idea anyway, and hence I filed #369 to address it directly. In the meantime, I just excluded `End` from the `toHalt` conversion and added `onHalt` in the appropriate places.
The good news is that porting in the test for the `mergeN` hang (but not the `njoin` fix!) shows that the double-finalizer bug is in fact fixed by this branch, as expected.
> on the Trampoline question, I think the main reason for it was so we can get stack safety on nested rcv's in await
As I mentioned, all of the interpreters are stackless, so it shouldn't actually matter for `rcv`. It's possible that `Append` still needs `Trampoline` (especially when dealing with `Process0`), but I'm quite certain that `Await` does not.
@djspiewak Excellent. I just quickly went through the branch, and I think we may have to remove `Terminated` altogether. Specifically, I think we don't need it in places where we have full control over the code, like the `async._` stuff.
To me the original behaviour seems correct.

```scala
val p = a ++ (b onComplete f)
```

means that `f` is cleanup for `b`, and when you kill `p`, process `b` doesn't even start - so there is no need to execute `f`.

The motivation for this behaviour of `kill` is that the process tries to terminate as soon as possible when killed. For this reason even `empty ++ p` isn't equivalent to `p` (you can use `kill` to see the difference).
> Even more seriously, the fact that there is an observable difference between the two `p` variants means that `++` is not associative.
Is there some example where `a ++ (b ++ c)` isn't the same as `(a ++ b) ++ c`?
@radekm I think the problem lies in code like this:

```scala
await(resource)(r => doSomethingWith(r) ++ doSomethingElseWith(r) onComplete release(r))
```

At first the code seems correct, but since you `++` the two processes, the `onComplete` is attached to the second one, and thus it may not be executed in case `doSomethingElseWith` is never run.

To me the whole thing is more about reasoning about code: obviously, if you add the two parentheses the code will be correct, but this is exactly where we may get unintended behaviours/bugs. I think, however, that the real problem is that we swallow cleanup code in `++`. I have been trying for a short while to replace the `++` code, so far without reasonable progress, but I think there is a possibility this will work. Can you please look at that? I know you have been good at that :-)
The point could maybe be best expressed as: the types lie.

If we have a `Process`, we expect that if we add `onComplete x` then the finalizer `x` *must* get run; if it is effectively non-deterministic, then `onComplete` is a lie – it would better be called `maybeOnComplete`.

So append is thus a lie; we cannot append in a lawful manner. Therefore we are not really appending but munging processes together, and we would be better served with:

```scala
trait Process[…] {
  def munge[…](next: Process[…]): MaybeCompletableProcess[…]
}
```

This more honest type shows that we cannot lawfully append, so there is no lawful `Semigroup` and therefore no `Monoid` instance. This is unfortunate, although possibly solved with some weaker type that doesn't promise completability.
> if it is effectively non-deterministic, then onComplete is a lie – it would better be called `maybeOnComplete`.
IMO it's deterministic. `q onComplete fin` means that `fin` is executed after `q` completes. And in the original example, which has the form `p ++ (q onComplete fin)`, process `q` isn't even started, so it can never complete - so it seems fine to not run `fin`.
This behaviour is similar to the behaviour of Scala programs. `q onComplete fin` behaves as `try { q } finally { fin }` and `p ++ q` behaves as `{ p; q }`. Now you can see that

```scala
{
  p;
  try { q }
  finally { fin }
}
```

is different from

```scala
try { p; q }
finally { fin }
```
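To make the analogy concrete, here is a minimal plain-Scala sketch (hypothetical `p`, `q`, `fin`, not scalaz-stream code), where a thrown exception in `p` plays the role of the process being killed before `q` starts:

```scala
// Plain-Scala sketch of the try/finally analogy above.
var log = Vector.empty[String]
def p(): Unit = { log :+= "p"; throw new RuntimeException("killed") }
def q(): Unit = log :+= "q"
def fin(): Unit = log :+= "fin"

// { p; try { q } finally { fin } } -- p fails first, the inner
// try/finally is never entered, so fin does NOT run
try { p(); try q() finally fin() } catch { case _: RuntimeException => () }
val grouped = log   // Vector("p")

log = Vector.empty
// try { p; q } finally { fin } -- fin runs even though p failed
try { try { p(); q() } finally fin() } catch { case _: RuntimeException => () }
val flat = log      // Vector("p", "fin")
```

The two groupings observably differ exactly when the first component fails, which mirrors the `kill`-during-`p` scenario being discussed.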
@djspiewak any status update on your branch? Can I take it and work from it? or is it complete?
@radekm I fully agree with you, and I think this view is why we didn't focus on this issue early enough. However, unlike a normal program, which usually cannot be killed or interrupted, processes can be.

Imagine that `doSomethingWith(r)` gets interrupted with `Kill`. In that case the cleanup will never be called. However, when `Kill` is received after the `++`, then the cleanup IS called. So I think @jedws is referring to this as non-determinism, as you can't really tell whether this will happen or not :-)
> To me the whole thing is more about reasoning about code
@pchlupacek This is a good point, and it makes sense to me (mistakes should be obvious, and this one unfortunately isn't).
But what I still don't see is why `++` isn't associative. Do we have `a`, `b`, `c` such that `a ++ (b ++ c)` doesn't behave as `(a ++ b) ++ c`?
is unsound precisely because of the non-associativity of append.
@djspiewak Can it really happen that the following code allocates a resource and doesn't free it?

```scala
eval(allocate) flatMap { r => p onComplete cleanup(r) }
```

I would expect that this works because there is no `++` involved.
We need to update the algebra. I'm pretty sure we need to separate appends and finalizers.
@djspiewak Some time ago I created an alternative instruction set (https://github.com/radekm/scalaz-stream/issues/1), but I haven't implemented it yet (there are still many unclear things - e.g. the time complexity of some operations). In that proposal a process is only a sequence of instructions, so append is trivially associative because it's an append of sequences. Cleanups are handled by finally blocks (delimited by `BeginFinally` and `EndFinally`) - these blocks aren't skipped when the process is killed. Unfortunately there is no `onHalt` and you cannot catch exceptions (this is by design).
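As a rough sketch of the shape of that proposal (a hypothetical toy model, not the actual code from the linked issue): a process is a flat `Vector` of instructions, so `++` is `Vector` concatenation and associative by construction, while finally-blocks are delimited by markers and still run after a kill.

```scala
// Toy instruction set: emits plus finally-block delimiters.
sealed trait Instr
final case class Emit(o: Int) extends Instr
case object BeginFinally extends Instr
case object EndFinally extends Instr

// Interpret the program; instructions at position >= killAt are skipped
// unless they sit inside a BeginFinally/EndFinally block.
def run(prog: Vector[Instr], killAt: Int): Vector[Int] = {
  var depth = 0
  val out = Vector.newBuilder[Int]
  for ((i, idx) <- prog.zipWithIndex) i match {
    case BeginFinally => depth += 1
    case EndFinally   => depth -= 1
    case Emit(o)      => if (idx < killAt || depth > 0) out += o
  }
  out.result()
}

val cleanup = Vector(BeginFinally, Emit(99), EndFinally)
val prog    = Vector(Emit(1)) ++ cleanup ++ Vector(Emit(2))
// kill after the first instruction: the finally block still runs,
// but Emit(2) is skipped
val killed = run(prog, killAt = 1)   // Vector(1, 99)
```

Because programs are plain sequences, `a ++ (b ++ c)` and `(a ++ b) ++ c` are literally the same `Vector`, which is the associativity argument made above.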
@radekm `++` is obviously associative, but not in combination with a termination handler like `onComplete`.
> ++ is obviously associative, but not with the termination handler, like onComplete.
@pchlupacek Then I don't see how the monadic laws are violated - since for some monads you can implement try-finally (e.g. `bracket` in Haskell).
It is non-deterministic in the sense that if you have an arbitrary `Process` `p`, you cannot determine whether adding an `onComplete` handler will actually result in that handler being run; you must know the provenance of `p`, whether it is a composite process, and which of the component processes is going to get killed.
With the current behaviour we are definitely in the same position as in the world of promiscuous side-effects that is normal Scala. This hinders reasoning about the program using the types alone as it causes the types to lie – which is why we aspire to the Scalazzi Safe Subset* and hope that libraries adhere to it.
> you must know the provenance of p and if it is a composite process, and which one of the component processes is going to get killed.
If you write `p onComplete fin` and `p` starts, then `fin` will run after `p` completes - no matter what `p` is composed of. Yes, `++` isn't associative with `onComplete`, but this doesn't mean that `onComplete` doesn't work.
The proposed behaviour is probably better than the current one, but I still don't see how the current behaviour violates the monadic laws (as noted above, Haskell has `bracket`, which IMO is the same thing as the current `onComplete`) or how this relates to parametricity.
The question is: do you expect `p onComplete fin` to run `fin` or not? If the answer is 'maybe?' then you cannot reason well about your program.
The lack of associativity means that the Monoid laws are broken.
This of course doesn't matter if we have a type that doesn't promise onComplete handling, as it is only that bit which doesn't append (afaik).
> The question is, do you expect `p onComplete fin` to run `fin` or not? If the answer is 'maybe?'
As I said, the answer is yes iff `p` is "started" and completes. If `p` isn't started or doesn't complete, then the answer is no. Generally you want to run a finalizer only if the resource was allocated - so I may ask the same: do you expect `await(resource)(r => doSomethingWith(r) onComplete release(r))` to run `release(r)` or not? And of course the answer depends on whether `await` runs and whether `resource` completes successfully.
> The lack of associativity means that the Monoid laws are broken.
IMO a monoid has only a single associative operation - so I don't see why `++` should be associative with `onComplete` (these are two different operations). Also note that `++` doesn't have a left neutral element.
@pchlupacek The status of my branch is "basically complete, but lacking the `Terminated` change". I need to finish the `Terminated` work first, at which point the revised `Await` branch will be ready to land.
> As I said the answer is yes iff p is "started" and completes. If p isn't started or doesn't complete then the answer is no.
Right, you do not know.
> IMO monoid has only single associative operation - so I don't see why should be ++ associative with onComplete (these are two different operations).
Read the original report again. You can change the behaviour of your program simply by factoring out expressions. This is fundamentally against the core value of functional programming.
> Also note that ++ doesn't have left neutral element.
I do not know what you mean; if you mean that `Process.++` does not, then you are simply stating the fact that `++` does not obey the Monoid laws - and it should.
@djspiewak Cool. The problem with `Terminated` is currently `repeatEval`, right?
> Read the original report again. You can change behaviour of your program by simply factoring out expressions.
I don't see anything like that in the original report. The behaviour was changed by changing parentheses - it's like complaining that `(1 + 2) * 3` is different from `1 + (2 * 3)`.
> I do not know what you mean, if you mean that Process.++ does not, then you are simply stating the fact that ++ does not obey the Monoid laws, and it should.
I know it doesn't, and I agree that it should. I was only saying that the problem is not with the associativity of `++` but with the non-existence of a neutral element.
> Right, you do not know.
OK. But IMO you cannot eliminate this uncertainty - so I expect that you'll be able to define `onComplete` with similar behaviour even after the fix (at least for processes over `Task`).
@pchlupacek No, `repeatEval` is easy. The problem is `through`.
@radekm I missed a comment you made earlier. `eval . flatMap` most certainly is not safe and cannot provide the same guarantees as `await`, since there is no lexical bounding. Under the surface, `flatMap` is using `++`, as in fact you would expect from a `join` on a sequence.
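The "flatMap is built from `++`" point can be illustrated with a toy sequence model (plain Scala, not scalaz-stream): for any sequence-like structure, `flatMap` is map followed by join, and join is nothing but repeated appends, so `flatMap` inherits whatever semantics `++` has.

```scala
// join = repeated ++, so every flatMap is made of appends
def join[A](pps: List[List[A]]): List[A] =
  pps.foldRight(List.empty[A])(_ ++ _)

def flatMap[A, B](p: List[A])(f: A => List[B]): List[B] =
  join(p.map(f))

val out = flatMap(List(1, 2))(a => List(a, a * 10))  // List(1, 10, 2, 20)
```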
@djspiewak I think we are mixing two issues together. I am not sure whether `++` associativity with cleanup is achievable, because I think @radekm's arguments are valid, specifically the explanation he made above. The bigger problem I see is that `flatMap` is not resource safe, and I think that's something we can perhaps resolve. I see this more as an incorrect `flatMap` implementation. The reason `flatMap` is like this was mainly performance; perhaps we can do better and make it more low-level in the `emit` case.

How about starting with 2-3 `flatMap` examples that are not valid and trying to resolve them? What do you think?
@pchlupacek I do agree that the `++` problem is unresolvable. Unfortunately, I think that `flatMap` is either going to be the same, or we're going to end up generating a `flatMap` which is inconsistent with `++`, and the latter alternative is even scarier. I'm willing to try though!
Here's a simple example to work from:

```scala
val p = eval(acquire) flatMap { r =>
  doThings(r) onComplete release(r)
}

val interrupt = p stepAsync { _ => () }
// wait for acquire to start
interrupt(Kill)
// release(r) will not be run at this point, whereas it is
// guaranteed to run if we use await
```

Basically, take the `await` properties in `ProcessSpec` and just rewrite them to use `eval . flatMap`, and they will obligingly fail.
OK, the

```scala
val p = eval(acquire) flatMap { r =>
  doThings(r) onComplete release(r)
}
```

is in fact

```scala
val p0 = Await(acquire, {
  case -\/(c) => Halt(c)
  case \/-(a) =>
    // Emit(Seq(a)).flatMap { r => doThings(r) onComplete release(r) }
    doThings(a) onComplete release(a)
})
```

I don't see where `++` is involved here, so I am not sure this failure is really related to append. Any thoughts?
Ah, perhaps I see what you mean: this is the asynchronous release we resolved with the additional cleanup. Hm, I think for this sort of code our only option for now is to use `await`. The resource has to be guarded by the `await` where it is acquired, and we should not do it in `flatMap` for now. I see your point though.
@djspiewak After more thinking about this, how about introducing

```scala
def evalAsync[F[_], O](fo: F[O], onPreempt: O => Process[F, Nothing] = _ => halt): Process[F, O] = ???
```

I think this would resolve the issue with `flatMap`, as you may be able to write

```scala
evalAsync(request, onPreempt).flatMap { o => ... }
```

We may perhaps add this `onPreempt` to `eval` and `eval_`, but I think explicit syntax is perhaps better.
Additionally, we may have to do a bit of implicit play and reincarnate your idea with `Eval`, but just for infix syntax:

```scala
case class Eval[F[_], O](fo: F[O], onPreempt: Option[O => Process[F, Nothing]])

object Eval {
  def eval2Process[F[_], O](in: Eval[F, O]): Process[F, O] = ???

  implicit class EvalSyntax[F[_], O](val self: Eval[F, O]) extends AnyVal {
    // the handler returns Process[F, Nothing], matching the case class field
    def onPreempt(f: O => Process[F, Nothing]): Process[F, O] = ???
  }
}
```
@pchlupacek Actually, given that we now have an `onPreempt` case within `Await` in master, and we no longer attempt to identify finalizers lexically within the `Await` body, there may not actually be a problem here any longer. There certainly shouldn't be an observable difference between the following renderings:

```scala
eval(acquire) flatMap { r => doThings(r) onComplete release(r) }

// ...and

await(acquire) { r => doThings(r) onComplete release(r) }
```

Let me think about this a bit and see if I can come up with an example that breaks `flatMap` despite the new preemption handling.
I think you are right. The only difference is that `await` allows you to add the `onPreempt`, whereas `eval` does not. Since, at least in our code base, `eval` is used more frequently than bare `await` (except for `P1`s), I think `eval` deserves `onPreempt` as a special parameter or as syntax.
@pchlupacek I don't mind adding it as a special parameter; it would be less clunky than `await`. I would recommend using first-class overloading rather than a default parameter, since in my experience it is not uncommon to do something like `p flatMap Process.eval` and similar, which would break if `eval` had a second defaulted parameter.
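The overloading-vs-default concern can be sketched in plain Scala (hypothetical `eval` stand-ins, not the real scalaz-stream signatures): eta-expansion does not fill in default arguments, so passing the method as a one-argument function breaks if it gains a defaulted second parameter, whereas a separate one-argument overload keeps that usage working.

```scala
object WithOverloads {
  // first-class overloading: the 1-arg form survives as a function value
  def eval(n: Int): String = s"eval($n)"
  def eval(n: Int, onPreempt: String): String = s"eval($n, $onPreempt)"
}

object WithDefault {
  // default parameter: usable at call sites, but not as Int => String
  def eval(n: Int, onPreempt: String = "halt"): String = s"eval($n, $onPreempt)"
}

val fine: Int => String = WithOverloads.eval     // picks the 1-arg overload
// val broken: Int => String = WithDefault.eval  // does not compile:
// eta-expansion yields (Int, String) => String, defaults are not applied
```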
Another option would be to come up with a name for the preempt-aware version of `eval`. Not sure what would make sense though.
I'm going to close this, since I'm relatively certain we've resolved it.
In its current configuration, the above test fails. If I swap the commented lines, which is to say swap the associativity on the `++` operator, the test will pass. In other words, binding a finalizer to a non-`halt` process within an append breaks the guarantees about finalizer evaluation.

Even more seriously, the fact that there is an observable difference between the two `p` variants means that `++` is not associative. As an extension of this, `flatMap` is also non-associative. It was previously known that `flatMap` was non-associative in the presence of `njoin` (#274). This test case shows that `njoin` is unnecessary to demonstrate failure of associativity with monadic joining.

Needless to say, this is an undesirable state of affairs. It makes it difficult to reason about finalizers in general, and can mean that naively extracting a process definition into a variable (thus altering the associativity) can change the semantics. Even worse, a given process definition will have different semantics depending on the associativity with which it is composed into another process!

I believe all of these issues come down to a single design flaw: `append` and `onComplete` are implemented using the same mechanism (`onHalt`), which is overly general. We need to split up `append` from finalization. Doing this correctly should resolve the associativity-of-`++` issue described here (and by extension, `flatMap`). Resolving the `njoin` associativity issues would require we take the further step of restricting finalizer processes to be of type `Process[F, Nothing]` (in fact, we probably need to do this even for `++` associativity, but I haven't thought that through).

So the tl;dr here is that the monadic laws do not hold for `Process` in the presence of finalizers. Or rather, they do not hold in the presence of finalizers for a "strict" definition of process equality. The above assumes we define two processes to be equal if they a) produce the same elements in order, and b) perform the same effects in order. If we drop the second requirement, then magically the monadic laws (and `++` associativity!) hold once more. However, I would argue that this weaker definition of process equality is somewhat useless, given the primary use-cases for effectful streams.

At the end of the day, we want to use `Process` as a way of reasonably managing effects, and it is difficult to reason about a "monad" for which the monadic laws do not always hold.
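To sketch the "split append from finalization" direction in the abstract (a hypothetical toy, with illustrative names only, not the real `Process` algebra): keep appended segments as a flat sequence, so `++` is associative by construction, and keep finalizers in a separate field, so regrouping `++` can never move a finalizer onto a different segment.

```scala
// Toy model: appends live in `segments`, finalizers live apart from them.
final case class Proc(segments: Vector[String], finalizers: Vector[String]) {
  def ++(that: Proc): Proc =
    Proc(segments ++ that.segments, finalizers ++ that.finalizers)
  def onComplete(fin: String): Proc =
    copy(finalizers = finalizers :+ fin)
}

val a = Proc(Vector("a"), Vector.empty)
val b = Proc(Vector("b"), Vector.empty)
val c = Proc(Vector("c"), Vector.empty)

// ++ is associative regardless of grouping, even with a finalizer attached
val assoc = (a ++ (b.onComplete("fin") ++ c)) == ((a ++ b.onComplete("fin")) ++ c)
```

This only demonstrates the associativity half of the claim; whether such a split can also preserve the desired kill/finalizer run-time semantics is exactly the open design question above.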