typelevel / cats

Lightweight, modular, and extensible library for functional programming.
https://typelevel.org/cats/
Other
5.26k stars 1.21k forks source link

is there a place in cats for the State monad? #122

Closed aryairani closed 9 years ago

aryairani commented 9 years ago

If so, let me know which package (data?) and I don't mind taking a pass at it.

stew commented 9 years ago

yeah, data is the place, go for it!

aryairani commented 9 years ago

Great, I started — how deep should I go? State and/or StateT and/or IndexedStateT and/or MonadState?

So far I've done a little of each, but I guess I shouldn't waste my time on say, State alone, unless it's desired.

I guess IndexedStateT is very nice because it's very general, but on the other hand, it's harder for newcomers to understand, compared to State. But if there are performance or type-inference benefits to having explicit versions of all three, I can do that too.

P.S. I'm not super confident with MonadState yet.

aryairani commented 9 years ago

If I don't hear back, I'll finish fleshing out State and StateT and later we can look at the PR and decide.

non commented 9 years ago

Yeah, I don't totally know the answer. I guess I would aim for the smallest consistent piece of work, to minimize the amount of possible refactoring you/we may need to do. Since the project is in new and in flux we are trying to maximize throughput and minimize latency rather than minimizing commits/work. :)

aryairani commented 9 years ago

nod

jdegoes commented 9 years ago

Pretty please implement StateT as F[S => F[(S, A)]] and not S => F[(S, A)]! The additional ugliness can be hidden from view so users don't have to see it.

Scalaz' state monads cannot be trampolined effectively because recursive application of the state function is not guarded by the base monad.

aryairani commented 9 years ago

I don't have an opinion on this, and am happy to do it either way, although I just finished with S => (S,A) and S => F[(S,A)]. I do want to be able to trampoline with State; you say this will allow it? :) Could more folks weigh in?

P.S. The State examples I wrote are blocked by #140

aryairani commented 9 years ago

@jdegoes Should they be two separate things?

jdegoes commented 9 years ago

@refried I would prefer if they are not two separate things, because you can expose the exact same interface on top of the "more powerful" state and hide the slightly more complex internal representation.

The reason this representation can be trampolined is because in order to pass a value to the function, you must first lift it into the monad, which in a trampoline becomes an ADT interpreted later. So you cannot actually apply the state function without first yielding to the interpreter.

aryairani commented 9 years ago

@jdegoes So to the question of whether or not they should be two separate things, it's the same argument as whether State and StateT should be two separate things, right? (DRY vs type inference problems if applicable)

aryairani commented 9 years ago

Also, correct me if I'm wrong, but it looks like most StateT1 methods only require F:Functor, while StateT2 operations require F:FlatMap. If true, I'm not sure whether this is something anyone cares about.

jdegoes commented 9 years ago

@refried Yes, I guess it's about the same argument, as the more trampoline-able representation of State is also a bit slower. That said, the number of programs which can operate in the state monad without recursion is probably very small, i.e. almost everyone would be using the trampolined version.

aryairani commented 9 years ago

I'll work on getting them all ready.

aryairani commented 9 years ago

Ok now examples are blocked by absence of examples project; @stew please advise :)

aryairani commented 9 years ago

@jdegoes / @mpilquist / @non / whoever https://github.com/refried/cats/compare/wip/state-impl

I know there's duplicated functionality; I can delegate to one implementation if desired, or rename things.

aryairani commented 9 years ago

btw, I don't like the names eval and exec because I don't know how their meanings are different.

aryairani commented 9 years ago

@jdegoes Would you be willing to write up something for dummies (me) on trampolining State? And if so, could you describe how the situation would be different if we had native TCO?

mpilquist commented 9 years ago

@refried Here's the start of an implementation of State in cats based on Runar's free monad paper: https://gist.github.com/mpilquist/eec3764fc287da70e269

We could extend this to StateT and define State as an alias for StateT[Id, S, A]. I'd much prefer trampolining to be built in to StateT than having to manually stack on Free every time I use it.

stew commented 9 years ago

@refried examples are now in the doc sub-project, look at docs/src/main/tut, the "tut" task typechecks them, the makeSite task generates the jekyll website in docs/target/site, you can run jekyll locally to see the results by changing to the docs/target/site directory and running jekyll serve

jdegoes commented 9 years ago

Unfortunately, trampolining can't be baked into any transformer, it has to be part of the base monad. It can be baked into State (which is probably a good idea if you subscribe to "safe by default" design philosophy), but not StateT. Further, any implementation that takes the representation S => F[(S, A)] isn't stack-safe under recursive application of the state function (you can look at Scalaz StateT to see a number of these cases).

Finally, I will add a write-up on trampolining state to my todo list, which unfortunately makes it the 3rd such todo (keep bugging me :)). Native TCO isn't much use for monadic tail recursion, although purescript-tailrec has taken a very interesting approach I'd like to see replicated or expanded on in Cats: the idea of a monad which is "tail recursive" -- i.e. for which you can express recursive computation as a monadic unfolding on some initial value.

Recursive monadic computations expressed this way can have an impure (or TCO) interpreter which can run forever in a stack safe way, with much higher performance than trampolines.

aryairani commented 9 years ago

What are some common base monads used with StateT in practice?

wedens commented 9 years ago

@refried I've used IO

aryairani commented 9 years ago

@wedens @jdegoes if IO is trampoliney (and it is, right?), then is there any problem? On Mon, Feb 9, 2015 at 03:50 wedens notifications@github.com wrote:

@refried https://github.com/refried I've used IO

— Reply to this email directly or view it on GitHub https://github.com/non/cats/issues/122#issuecomment-73474835.

mpilquist commented 9 years ago

@refried I often use EitherT[StateT[Future, S, ?], E, ?]

jdegoes commented 9 years ago

Future, IO, Task, and Free are common base monads (which are all trampolined and stack safe with the alternate formulation of StateT suggested above), and also monad transformers based on these (for example, stacking StateT on ErrorT on Task).

aryairani commented 9 years ago

Ok. So we could do one of:

aryairani commented 9 years ago

@jdegoes Ok, so MonadRec only helps for recursive tail calls, but trampolining helps for general tail calls, right?

jdegoes commented 9 years ago

@refried Trampolining is useful not just for tail calls, but any kind of recursion, even mutual recursion. Whereas MonadRec is only useful for tail call self-recursion.

aryairani commented 9 years ago

@jdegoes I thought that's what I said :-)

MonadRec seems less useful (or not useful?) in Scala compared to PureScript, because if we use for-comprehensions, we always have that yield/map at the end. Is that right?

Edit 2: (I'm fiddling with a port of MonadRec and 4 different State implementations, but I don't have too much experience with this stuff, and struggling to even put a stack-blowing example together.)

jdegoes commented 9 years ago

@refried Sorry, the part I was emphasizing is that trampolining is magical even for non-tail calls (of any kind). But yeah, same as what you're saying.

MonadRec is still useful as long as you don't perform the recursion inside the for-comprehension. The recursion is made explicit (it's really an unfold) in the return value of the function, which indicates to the parent loop whether or not to continue the "recursion".

aryairani commented 9 years ago

@jdegoes Thanks.

Right, so the function either returns loop(args for recursion) or done(result) I am starting to see that MonadRec looks like StateT!

aryairani commented 9 years ago

@jdegoes I'd like to eventually understand the uses, strengths, and weaknesses of the different models, and get something nice going with the goal of being able to write fast, scalable FP. I made some headway tonight (wip branch), but it's still a jumble.

jdegoes commented 9 years ago

If you have stack-safe recursion or no recursion, then it makes sense to use the non-trampolined monads. OTOH, in many cases, the depth of the recursion is determined by values only available at runtime, so it's hard to guarantee stack-safe recursion for many use cases.

The non-trampolined monads will be the fastest, but also the least safe.

If you don't have stack safe recursion, but it's self-recursion and you can model it as an unfold, then it makes sense to use MonadRec. It should be as fast as the above, but completely safe. However, sometimes it's a bit more challenging to express recursion in this way.

Finally, if you don't have stack-safe recursion, and it's either not self-recursion or you don't want to model it as an unfold, then use a trampoline. The trampoline will be very slow.

aryairani commented 9 years ago

Sorry I've let this sit for a while.

We've got these 4-5 options; I wanted to collect some benchmarks before saying that 1-3 can be replaced by number 4.

  1. Non-trampolined, Id-based State: S => (S,A)
    • pro: simple, fast
    • cons: stack overflow (I've hit it.); special-case of StateT (DRY)
  2. Trampolined, Id-based State:
    • pro: stack-safe
    • cons: slow; special-case of StateT? (DRY)
  3. scalaz-style StateT: S => F[(S,A)] impossible to trampoline, per @jdegoes
  4. Trampolinable StateT: F[S => F(S,A)]]
    • pro: is stack-safe if F is.
    • cons: slow
  5. MonadRec unfold
    • pros: tailrec
    • cons: is a different kind of thing and probably doesn't belong in this list

I have the feeling that my benchmark is needlessly complicated though. If someone has a good use case to benchmark instead, I'd be happy to swap it in and run them and report back.

ceedubs commented 9 years ago

I think we will at least want number 4: StateT: F[S => F(S,A)]].

Some people may want number 1 for specific use cases, but I think generally this sort of thing is a ticking time bomb in a code base. I say we come back to this if someone really wants it.

Number 2 is a special case of number 4.

Number 5 sounds interesting and I should look into it more. We may want this in addition to number 4. However, I think we need something that's useful in the general case of recursion (not just self-recusion).

@jdegoes thank you for all of the great input. I learned a lot from reading your comments.

@refried thanks for all of your work on this. I'm really looking forward to having State, because I keep finding myself wanting to use it in tests and examples :).

jdegoes commented 9 years ago

@ceedubs Sure, happy to be of some assistance! I also agree that (4)'s the best one for inclusion (if you're only gonna do one), since State is potentially a special case of that, and "safe by default" is my preferred mantra. :smile_cat:

non commented 9 years ago

I agree with @ceedubs (and @jdegoes) that (4) is the best place to start. Happy to add a type alias (and companion) for State in terms of StateT if it is useful, and happy to explore MonadRec in another PR.

I think this gives us a pretty clear way to move foward. What do you think @refried?

non commented 9 years ago

I'll also include a related point from Gitter:

@refried:

Also @non, are you opposed to letting data depend on std's Function0 instance?

@non:

if we have to share instances in that way, i'd prefer to define them in data (non-implicitly), reference them explicitly, then create an implicit in std. (assuming std already depends on data.) does that make sense for folks? i can imagine a very small number of cases (e.g. for Function0) where we'd really like to have an instance available for some internal use. in those cases i'm fine with defining a non-implicit instance in a special location and using it explicitly.

rossabaker commented 9 years ago

I like @non's explicit instance idea, as long as it remains unusual. If this technique becomes commonplace to work around our dependency graph, that's evidence that we need to rethink the dependency graph.

non commented 9 years ago

@rossabaker: Agreed.

aryairani commented 9 years ago

Thanks @ceedubs for the feedback! @non Yes. Moving forward.

But, I've run into a snag. I have an implementation of number 4 that I thought was right -- I think parametricity doesn't allow too many ways to do it anyway -- but it still blows the stack, so it's not very right. Or maybe the problem is with my freeInstance, trampolineInstance, or function0Instance. Or in Free itself?

aryairani commented 9 years ago

Actually, is List.traverse[StateT[Trampoline,X],Y] even supposed to be stack-safe? or is it normal for that to blow up if the list is too big? @jdegoes Truth is I don't know what I'm doing. :sweat_smile:

jdegoes commented 9 years ago

@refried I'd suggest testing direct recursion, i.e. a simple function which calls itself N times. This way you can separate concerns, and such a function could be used for testing Trampoline directly, as well as for any monad transformer stacked on top of Trampoline.

aryairani commented 9 years ago

@jdegoes Good idea, thanks. As a sanity check, do these results make sense:

aryairani commented 9 years ago

blocked by #258

adelbertc commented 9 years ago

Any room in Cats for MonadState? :-)

aryairani commented 9 years ago

@adelbertc Get in line :-)

aryairani commented 9 years ago

Bump; #183 #258 #284

aryairani commented 9 years ago

I'm gonna put together a state subproject that depends on free and std, and try to move forward from there.