Closed aryairani closed 9 years ago
yeah, data is the place, go for it!
Great, I started — how deep should I go? State
and/or StateT
and/or IndexedStateT
and/or MonadState
?
So far I've done a little of each, but I guess I shouldn't waste my time on say, State
alone, unless it's desired.
I guess IndexedStateT
is very nice because it's very general, but on the other hand, it's harder for newcomers to understand, compared to State
. But if there are performance or type-inference benefits to having explicit versions of all three, I can do that too.
P.S. I'm not super confident with MonadState
yet.
If I don't hear back, I'll finish fleshing out State and StateT and later we can look at the PR and decide.
Yeah, I don't totally know the answer. I guess I would aim for the smallest consistent piece of work, to minimize the amount of possible refactoring you/we may need to do. Since the project is in new and in flux we are trying to maximize throughput and minimize latency rather than minimizing commits/work. :)
nod
Pretty please implement StateT
as F[S => F[(S, A)]]
and not S => F[(S, A)]
! The additional ugliness can be hidden from view so users don't have to see it.
Scalaz' state monads cannot be trampolined effectively because recursive application of the state function is not guarded by the base monad.
I don't have an opinion on this, and am happy to do it either way, although I just finished with S => (S,A)
and S => F[(S,A)]
. I do want to be able to trampoline with State; you say this will allow it? :) Could more folks weigh in?
P.S. The State examples I wrote are blocked by #140
@jdegoes Should they be two separate things?
@refried I would prefer if they are not two separate things, because you can expose the exact same interface on top of the "more powerful" state and hide the slightly more complex internal representation.
The reason this representation can be trampolined is because in order to pass a value to the function, you must first lift it into the monad, which in a trampoline becomes an ADT interpreted later. So you cannot actually apply the state function without first yielding to the interpreter.
@jdegoes So to the question of whether or not they should be two separate things, it's the same argument as whether State and StateT should be two separate things, right? (DRY vs type inference problems if applicable)
Also, correct me if I'm wrong, but it looks like most StateT1 methods only require F:Functor, while StateT2 operations require F:FlatMap. If true, I'm not sure whether this is something anyone cares about.
@refried Yes, I guess it's about the same argument, as the more trampoline-able representation of State is also a bit slower. That said, the number of programs which can operate in the state monad without recursion is probably very small, i.e. almost everyone would be using the trampolined version.
I'll work on getting them all ready.
Ok now examples are blocked by absence of examples project; @stew please advise :)
@jdegoes / @mpilquist / @non / whoever https://github.com/refried/cats/compare/wip/state-impl
I know there's duplicated functionality; I can delegate to one implementation if desired, or rename things.
btw, I don't like the names eval
and exec
because I don't know how their meanings are different.
@jdegoes Would you be willing to write up something for dummies (me) on trampolining State? And if so, could you describe how the situation would be different if we had native TCO?
@refried Here's the start of an implementation of State
in cats based on Runar's free monad paper: https://gist.github.com/mpilquist/eec3764fc287da70e269
We could extend this to StateT
and define State
as an alias for StateT[Id, S, A]
. I'd much prefer trampolining to be built in to StateT
than having to manually stack on Free
every time I use it.
@refried examples are now in the doc sub-project, look at docs/src/main/tut, the "tut" task typechecks them, the makeSite task generates the jekyll website in docs/target/site, you can run jekyll locally to see the results by changing to the docs/target/site directory and running jekyll serve
Unfortunately, trampolining can't be baked into any transformer, it has to be part of the base monad. It can be baked into State
(which is probably a good idea if you subscribe to "safe by default" design philosophy), but not StateT
. Further, any implementation that takes the representation S => F[(S, A)]
isn't stack-safe under recursive application of the state function (you can look at Scalaz StateT
to see a number of these cases).
Finally, I will add a write-up on trampolining state to my todo list, which unfortunately makes it the 3rd such todo (keep bugging me :)). Native TCO isn't much use for monadic tail recursion, although purescript-tailrec has taken a very interesting approach I'd like to see replicated or expanded on in Cats: the idea of a monad which is "tail recursive" -- i.e. for which you can express recursive computation as a monadic unfolding on some initial value.
Recursive monadic computations expressed this way can have an impure (or TCO) interpreter which can run forever in a stack safe way, with much higher performance than trampolines.
What are some common base monads used with StateT in practice?
@refried I've used IO
@wedens @jdegoes if IO is trampoliney (and it is, right?), then is there any problem? On Mon, Feb 9, 2015 at 03:50 wedens notifications@github.com wrote:
@refried https://github.com/refried I've used IO
— Reply to this email directly or view it on GitHub https://github.com/non/cats/issues/122#issuecomment-73474835.
@refried I often use EitherT[StateT[Future, S, ?], E, ?]
Future
, IO
, Task
, and Free
are common base monads (which are all trampolined and stack safe with the alternate formulation of StateT
suggested above), and also monad transformers based on these (for example, stacking StateT
on ErrorT
on Task
).
Ok. So we could do one of:
State
, and the trampoline-friendly StateT
(StateTv2
in the WIP)? State
from StateTv2
State
+ StateTv2
for heavy-duty FP
@jdegoes Ok, so MonadRec
only helps for recursive tail calls, but trampolining helps for general tail calls, right?
@refried Trampolining is useful not just for tail calls, but any kind of recursion, even mutual recursion. Whereas MonadRec
is only useful for tail call self-recursion.
@jdegoes I thought that's what I said :-)
MonadRec
seems less useful (or not useful?) in Scala compared to PureScript, because if we use for
-comprehensions, we always have that yield
/map
at the end. Is that right?
Edit 2: (I'm fiddling with a port of MonadRec and 4 different State implementations, but I don't have too much experience with this stuff, and struggling to even put a stack-blowing example together.)
@refried Sorry, the part I was emphasizing is that trampolining is magical even for non-tail calls (of any kind). But yeah, same as what you're saying.
MonadRec
is still useful as long as you don't perform the recursion inside the for
-comprehension. The recursion is made explicit (it's really an unfold) in the return value of the function, which indicates to the parent loop whether or not to continue the "recursion".
@jdegoes Thanks.
Right, so the function either returns loop(args for recursion)
or done(result)
I am starting to see that MonadRec
looks like StateT
!
@jdegoes I'd like to eventually understand the uses, strengths, and weaknesses of the different models, and get something nice going with the goal of being able to write fast, scalable FP. I made some headway tonight (wip branch), but it's still a jumble.
If you have stack-safe recursion or no recursion, then it makes sense to use the non-trampolined monads. OTOH, in many cases, the depth of the recursion is determined by values only available at runtime, so it's hard to guarantee stack-safe recursion for many use cases.
The non-trampolined monads will be the fastest, but also the least safe.
If you don't have stack safe recursion, but it's self-recursion and you can model it as an unfold, then it makes sense to use MonadRec
. It should be as fast as the above, but completely safe. However, sometimes it's a bit more challenging to express recursion in this way.
Finally, if you don't have stack-safe recursion, and it's either not self-recursion or you don't want to model it as an unfold, then use a trampoline. The trampoline will be very slow.
Sorry I've let this sit for a while.
We've got these 4-5 options; I wanted to collect some benchmarks before saying that 1-3 can be replaced by number 4.
Id
-based State: S => (S,A)
Id
-based State:
StateT
: S => F[(S,A)]
StateT
: F[S => F(S,A)]]
F
is.MonadRec
unfold
I have the feeling that my benchmark is needlessly complicated though. If someone has a good use case to benchmark instead, I'd be happy to swap it in and run them and report back.
I think we will at least want number 4: StateT: F[S => F(S,A)]]
.
Some people may want number 1 for specific use cases, but I think generally this sort of thing is a ticking time bomb in a code base. I say we come back to this if someone really wants it.
Number 2 is a special case of number 4.
Number 5 sounds interesting and I should look into it more. We may want this in addition to number 4. However, I think we need something that's useful in the general case of recursion (not just self-recusion).
@jdegoes thank you for all of the great input. I learned a lot from reading your comments.
@refried thanks for all of your work on this. I'm really looking forward to having State, because I keep finding myself wanting to use it in tests and examples :).
@ceedubs Sure, happy to be of some assistance! I also agree that (4)'s the best one for inclusion (if you're only gonna do one), since State
is potentially a special case of that, and "safe by default" is my preferred mantra. :smile_cat:
I agree with @ceedubs (and @jdegoes) that (4) is the best place to start. Happy to add a type alias (and companion) for State
in terms of StateT
if it is useful, and happy to explore MonadRec
in another PR.
I think this gives us a pretty clear way to move foward. What do you think @refried?
I'll also include a related point from Gitter:
@refried:
Also @non, are you opposed to letting data depend on std's Function0 instance?
@non:
if we have to share instances in that way, i'd prefer to define them in data (non-implicitly), reference them explicitly, then create an implicit in std. (assuming std already depends on data.) does that make sense for folks? i can imagine a very small number of cases (e.g. for Function0) where we'd really like to have an instance available for some internal use. in those cases i'm fine with defining a non-implicit instance in a special location and using it explicitly.
I like @non's explicit instance idea, as long as it remains unusual. If this technique becomes commonplace to work around our dependency graph, that's evidence that we need to rethink the dependency graph.
@rossabaker: Agreed.
Thanks @ceedubs for the feedback! @non Yes. Moving forward.
But, I've run into a snag. I have an implementation of number 4 that I thought was right -- I think parametricity doesn't allow too many ways to do it anyway -- but it still blows the stack, so it's not very right. Or maybe the problem is with my freeInstance, trampolineInstance, or function0Instance. Or in Free itself?
Actually, is List.traverse[StateT[Trampoline,X],Y] even supposed to be stack-safe? or is it normal for that to blow up if the list is too big? @jdegoes Truth is I don't know what I'm doing. :sweat_smile:
@refried I'd suggest testing direct recursion, i.e. a simple function which calls itself N times. This way you can separate concerns, and such a function could be used for testing Trampoline directly, as well as for any monad transformer stacked on top of Trampoline.
@jdegoes Good idea, thanks. As a sanity check, do these results make sense:
blocked by #258
Any room in Cats for MonadState
? :-)
@adelbertc Get in line :-)
Bump; #183 #258 #284
I'm gonna put together a state
subproject that depends on free
and std
, and try to move forward from there.
If so, let me know which package (data?) and I don't mind taking a pass at it.