typelevel / cats

Lightweight, modular, and extensible library for functional programming.
https://typelevel.org/cats/

Instances for Future #2334

Open ChristopherDavenport opened 5 years ago

ChristopherDavenport commented 5 years ago

Future is widely known not to be referentially transparent, because it starts execution immediately and caches the result.

While it may have been largely necessary for these instances to be in cats before cats-effect, now that cats-effect is maturing it may be time for the Future instances to be moved to alleycats.

Imports

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import cats.implicits._ 

Example 1 - Prints Once on the first line

val future1 : Future[Unit] = Future(println("I printed"))
val futureResult1 : Future[Unit] = (future1, future1).mapN(_ => ())

Example 2 - Prints twice on the second line

def future2: Future[Unit] = Future(println("I printed"))
val futureResult2 : Future[Unit] = (future2, future2).mapN(_ => ())

As you can see we have lost equational reasoning.
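
A minimal sketch of the two examples above that counts evaluations instead of printing, so the difference can be checked programmatically. It assumes only the standard library and swaps cats' mapN for Future#zip; the object name EagerFutureDemo and the counter are illustrative and not part of the original report.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EagerFutureDemo {
  // Example 1: a val is evaluated once, so the body runs once
  // even though the Future is referenced twice.
  def runWithVal(): Int = {
    val counter = new AtomicInteger(0)
    val future1: Future[Unit] = Future { counter.incrementAndGet(); () }
    val both: Future[Unit] = future1.zip(future1).map(_ => ())
    Await.result(both, 5.seconds)
    counter.get()
  }

  // Example 2: a def is re-evaluated at every reference,
  // so each reference starts a fresh Future and the body runs twice.
  def runWithDef(): Int = {
    val counter = new AtomicInteger(0)
    def future2: Future[Unit] = Future { counter.incrementAndGet(); () }
    val both: Future[Unit] = future2.zip(future2).map(_ => ())
    Await.result(both, 5.seconds)
    counter.get()
  }
}
```

Under these assumptions runWithVal() yields 1 and runWithDef() yields 2, which is the loss of substitutability described above.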

What are your thoughts on the future of Future?

djspiewak commented 5 years ago

Instances like Monad[Future] and Applicative[Future] are really really dangerous. They've caused problems in scalaz for years. They should definitely be in alleycats, if we have them at all.

johnynek commented 5 years ago

My view is that any instance that is not lawful should not be in cats proper.

In this case, you have treated Future as though it were a Sync instance, which it is not. You put in side effects (print) and had some idea in mind. I think you can come up with similar examples for other types with call-by-name arguments. If users see a call-by-name parameter as an invitation to put in a side effect, we have a big problem, since we often use them just for short-circuiting. I argue that is the fault of the user.
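
To illustrate the short-circuiting use of call-by-name mentioned here, a small sketch follows. The helper firstDefined is hypothetical (it is not a cats or standard library function); the mutable counter exists only to show whether the by-name argument was evaluated.

```scala
object ByNameDemo {
  // Hypothetical helper: `fallback` is by-name purely so that it is
  // skipped when `primary` is already defined.
  def firstDefined[A](primary: Option[A])(fallback: => Option[A]): Option[A] =
    if (primary.isDefined) primary else fallback

  // Returns the result together with the number of times the fallback ran.
  def evaluations(): (Option[Int], Int) = {
    var count = 0
    val result = firstDefined(Option(1)) { count += 1; Option(2) }
    (result, count)
  }
}
```

Here the by-name parameter only saves work; putting a meaningful side effect in the fallback would make the program's behavior depend on whether it happens to be evaluated.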

I advocate that we stick to the laws and argue for instances based on that. I don’t want cats to pick on some data types, whose instances are lawful, because they have other properties we don’t like.

E.g. I don’t like that scala uses universal equality and hashing for Map by default. However, I still think Map should have a Functor instance, which is lawful.

tpolecat commented 5 years ago

I agree in principle that it's "user error", but the error is committed when using the data types as intended. Future instances are fine if you never side-effect; Try instances are fine if you never throw; cigarettes are fine if you never smoke them. I think we can identify a qualitative difference here that justifies warning labels.
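
A sketch of the "fine if you never throw" caveat for Try, using only the standard library. The function f is illustrative: it throws instead of returning a Failure, which is enough to make the two sides of the monadic left-identity law (pure(a).flatMap(f) versus f(a)) behave differently.

```scala
import scala.util.Try

object TryLeftIdentity {
  // Illustrative function that throws rather than returning a Failure.
  val f: Int => Try[Int] =
    n => if (n > 0) throw new IllegalStateException("boom") else Try(n)

  // Left-hand side: Try's flatMap catches the exception and yields a Failure.
  def lhsIsFailure: Boolean = Try(1).flatMap(f).isFailure

  // Right-hand side: calling f directly lets the exception escape.
  def rhsThrows: Boolean =
    try { f(1); false } catch { case _: IllegalStateException => true }
}
```

With a non-throwing f both sides agree, which is the sense in which the instance is fine as long as nobody throws.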

As a practical matter I think this hits beginners hardest, because experienced cats users are unlikely to use these data types at all. Conversations about Future on Gitter invariably lead to "that's not going to work, sorry, have a look at IO", which makes me worry about those who don't ask. My guess is these instances are used almost always by beginners, and almost always incorrectly.

So anyway I think a speed bump is warranted and I also vote for moving these to alleycats. While I'm thinking about it we should move the alleycats README onto the website so the warnings are louder.

johnynek commented 5 years ago

They were in alleycats, we had this discussion, we moved them back. Do we keep having this discussion every year or so and flip-flopping? I guess anyone can do a PR and now cite the prior view they agree with and get two cats maintainers to ship it and we can switch.

The catch is we can only flip-flop on binary-incompatible releases.

PS: I really disagree with the Try example even more. cats-effect IO catches exceptions in similar places, yet we are arguing for using it. throw is not a function, so trying to argue what is right from first principles does not make a lot of sense to me.

johnynek commented 5 years ago

One more thing: @tpolecat, I feel like your point is true about anything less than always doing pure functional programming, which I like, but cats has not been so dogmatic in the past. For instance, even in cats-effect we recently added non-referentially-transparent methods to allocate Deferred instances:

https://github.com/typelevel/cats-effect/blob/master/core/shared/src/main/scala/cats/effect/concurrent/Deferred.scala#L100

The motivation is performance and not forcing flatMaps where a user is already inside a delay/suspend operation.

I think it is hard to not be an extremist: we have to have some answer other than: "all the way, all the time", but dealing with non-pure-FP is something scala programmers I think have to live with.

Lastly, I am not arguing that you should write unsafe code, I am arguing that sometimes when writing safe code the compiler can't help you much, and you have to provide the guarantees yourself. When the compiler's guarantees come with large runtime costs (due to boxing, for instance), this problem is more acute.

tpolecat commented 5 years ago

> I think it is hard to not be an extremist

My thinking is that allowing the instances is the extremist position. I want to make things safer for people who aren't 100% there yet.

djspiewak commented 5 years ago

I think the problem I have with Monad[Future], even beyond situations where people side-effect, is that the type is simply not referentially transparent. The canonical example:

for {
  a <- Future(computeA)
  b <- Future(computeB)
} yield a + b

// vs

val aF = Future(computeA)
val bF = Future(computeB)

for {
  a <- aF
  b <- bF
} yield a + b

The differences here are observable even without side-effects simply because the computation time will be radically different. Normally I'm of the school of thought that doesn't consider CPU/memory/stack to be effects worth controlling, but Future is by definition a datatype which controls CPU and stack. That's its whole purpose, especially if you don't use it to capture classical side-effects.
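
A sketch that makes the scheduling difference between the two forms visible without relying on wall-clock timing. Everything here is an assumption for illustration: computeA and computeB are stand-ins that share a latch, and a dedicated two-thread pool is used so that two eagerly started Futures can overlap. The latch is only a probe, and observing it is itself an effect.

```scala
import java.util.concurrent.{CountDownLatch, Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SchedulingDemo {
  // Daemon threads so the pool never keeps the JVM alive.
  private val daemonFactory: ThreadFactory = new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  }
  implicit val ec: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(2, daemonFactory))

  // Stand-in for computeA: reports whether B ran while A was still in flight.
  private def computeA(latch: CountDownLatch): Boolean =
    latch.await(500, TimeUnit.MILLISECONDS)

  // Stand-in for computeB: signals that it has run.
  private def computeB(latch: CountDownLatch): Int = { latch.countDown(); 1 }

  // Futures created inside the for-comprehension: B starts only after A completes.
  def inlined(): Boolean = {
    val latch = new CountDownLatch(1)
    val result = for {
      a <- Future(computeA(latch))
      _ <- Future(computeB(latch))
    } yield a
    Await.result(result, 5.seconds)
  }

  // Futures hoisted into vals: both start immediately and can overlap.
  def hoisted(): Boolean = {
    val latch = new CountDownLatch(1)
    val aF = Future(computeA(latch))
    val bF = Future(computeB(latch))
    val result = for {
      a <- aF
      _ <- bF
    } yield a
    Await.result(result, 5.seconds)
  }
}
```

Under these assumptions inlined() should return false (A times out before B is ever started) and hoisted() should return true (B runs while A is waiting), even though the two programs differ only by extracting subexpressions into vals.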

So if we start from the premise that side-effect-free Future isn't just an alias for Identity but is instead still serving a purpose, and that purpose is managing CPU scheduling of computation, then its non-referentially-transparent nature relative to this purpose is just as problematic as it is relative to side-effect capture. In other words, Future simply isn't a Monad because expressions which use it are not functions. This is different from Try in that we can't say anything analogous to "don't use throw and everything is fine." Here, the best we could do is say "don't use Future and everything is fine", but if that's the best we can do, then why are we giving people a Monad[Future]?

It should be removed. If we really really want to have it, then it should be in alleycats.

oscar-stripe commented 5 years ago

@djspiewak I disagree with your example. If you have no side effects, those two are the same. If you are expecting runtime to be the same for all rewrites of the code, that is another property, not referential transparency. Note, checking the clock is an effect, so if you are timing things, you are back to an effect space. I don't recall any time we have previously argued that substitutions must have the same runtime; if so, use of call-by-name is problematic in almost every case since it will often change the runtime of the code.

djspiewak commented 5 years ago

@oscar-stripe So your argument then is that Future = Identity? If you ignore the effect of controlling CPU scheduling, there is no measurable difference between the types.

tpolecat commented 5 years ago

I think Future = Try = Id as far as equational properties are concerned. For pure computations they are interchangeable.
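
A small sketch of this claim for a pure computation, using only the standard library. The function square and the object name are illustrative; the point is only that the direct, Try, and Future versions compute the same answer when nothing throws or side-effects.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object SameAnswers {
  // A pure, total function: no exceptions, no side effects.
  def square(n: Int): Int = n * n

  // The same computation expressed directly, in Try, and in Future.
  def direct: Int = square(3) + square(4)

  def viaTry: Try[Int] =
    for { a <- Try(square(3)); b <- Try(square(4)) } yield a + b

  def viaFuture: Future[Int] =
    for { a <- Future(square(3)); b <- Future(square(4)) } yield a + b

  // True when all three encodings produce the same value.
  def allAgree: Boolean =
    viaTry.get == direct && Await.result(viaFuture, 5.seconds) == direct
}
```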

johnynek commented 5 years ago

@tpolecat yeah, I agree with that take.

Note: I work often in data-engineering areas, where we use Future to get concurrency to do computations in parallel: aggregate a tree of operations, not stateful side effects in many cases. (we actually use Twitter Future, which doesn't have the same Future.apply, you have to explicitly use a FuturePool, which is a bit clearer).

djspiewak commented 5 years ago

@johnynek @tpolecat If they're all equivalent, then why have them all? Why are people wanting to use Future at all? Just use Id. Id has a Monad. If you don't need Future (because it's just Id) then clearly we don't need Monad[Future].

The point is that they're not equivalent. And critically, they're not equivalent in exactly the way that Future is also not referentially transparent. The reason people want to use Future in the first place is, as you said, to get concurrency to do computations in parallel. However, as my example showed, this is precisely the way in which Future fails as an abstraction! The lack of referential transparency shows up precisely in the concurrency and scheduling (assuming you're not being evil and capturing side-effects).

So either you think that Future = Id, in which case there's no need for Monad[Future] (just use Monad[Id]), or you recognize the distinction between the two, which is CPU/stack usage, and Monad[Future] would be unsound because Future is not referentially transparent under that rubric. Either way, it's not a monad.

A related argument is asking whether or not Parallel is actually necessary when we have Applicative. The inconsistency between ap and flatMap is irrelevant if we're ignoring side-effects and CPU/stack. Future's referential opacity is the same thing.

> I think Future = Try = Id as far as equational properties are concerned.

To say this, you kind of need to define what you mean by "equal" in the context of your equational reasoning. Clearly, they are different expressions that will evaluate quite differently, so they can't be equal in every conceivable sense (unlike, say, 1 + 2 = 2 + 1, which most definitely is equal in every conceivable sense). My argument is that, for Future, any notion of equality or substitutability must take into account CPU/stack and concurrency impact, because without taking those things into account, Future as a thing is pointless.

tpolecat commented 5 years ago

Right, I mean precisely that. They are equivalent in the sense that they compute the same answers (assuming pure computations). And this is why they're useful! The fact that running time is unobservable means we know something extra, and we can use this to confidently refactor our program for better performance, knowing that it will compute the same answer. Applying the functor law to fuse map operations is a trivial example; refactoring to use Future as Oscar does is more involved but equivalent.
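
The map-fusion refactoring mentioned here can be sketched as follows, assuming pure functions f and g (both illustrative). The functor composition law says the two forms compute the same value, so the fused form can be substituted to save one intermediate step.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object MapFusion {
  val f: Int => Int = _ + 1
  val g: Int => Int = _ * 2

  // Two separate map calls.
  def unfused(fa: Future[Int]): Future[Int] = fa.map(f).map(g)

  // One map call over the composed function.
  def fused(fa: Future[Int]): Future[Int] = fa.map(f andThen g)

  // Runs both forms on the same input and returns both results.
  def results(n: Int): (Int, Int) = {
    val fa = Future.successful(n)
    (Await.result(unfused(fa), 5.seconds), Await.result(fused(fa), 5.seconds))
  }
}
```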

Anyway my dog in this fight is that in practice Future is rarely used for pure computations, which means the equivalence doesn't hold, which leads beginners astray.

As a meta observation I have never been explicitly warned to avoid simultaneous internet arguments with @djspiewak and @johnynek but it seems like I would know better by now ;-)

non commented 5 years ago

Hey folks, I thought I should chime in on this one.

First of all, I think you all have done a good job of re-framing the discussion away from the question of whether Monad[s.c.Future] is lawful (I think it is per: https://gist.github.com/non/4d1f49fe41e2f12c463ae075bf5d0f06) and instead toward the actual issue: concern that authors are going to misuse s.c.Future and a desire to indemnify ourselves by removing support for future-based instances (or demoting them to second class instances).

Personally, I don't think it's appropriate to remove (or demote) the future-based instances. If Cats doesn't work well with s.c.Future, it is more likely to hurt Cats usage than it is to reduce usage of s.c.Future in the wider world. My sense has always been that our project will succeed if it makes functional programming easier for developers, rather than trying to punish developers for making what we consider the wrong choices. For example, I'd rather help authors use a free construction for their business logic, even if they end up using an impure, future-based interpreter to evaluate it later.

Probably the main driver of FP in Scala is the power of traverse (as @tpolecat has often mentioned). I think losing the ability to traverse a List[Future[...]] would cost us one of the best "gateway drugs" for Cats.

I'm not sure we gain very much from removing future-based instances, except possibly a lower support burden. I've been pretty inactive for a long time, so it's possible I'm not appreciating how heavy that burden has been recently. But historically, I feel like this has been OK.

Anyway, this is just my 2¢. I'm interested to hear what other folks think.

djspiewak commented 5 years ago

> First of all, I think you all have done a good job of re-framing the discussion away from the question of whether Monad[s.c.Future] is lawful […]

Well, in part. I believe my point was that either a) Monad[Future] is equivalent to Monad[Id], in which case it's purely redundant, or b) Monad[Future] is unlawful because code involving Future is not referentially transparent. Expanding on (b), we can't talk about the laws at all because code written with Future does not form a function, just like code that closes over vars doesn't form a function, and the laws don't say anything about things that are not functions.

> […] concern that authors are going to misuse s.c.Future and a desire to indemnify ourselves by removing support for future-based instances (or demoting them to second class instances).

I'm actually less worried about that. People are going to do what they're going to do. I want to encourage them to do the right thing by making that thing as easy as possible and discourage the wrong thing by throwing up some gentle barriers. As with players in video games, people will flow like water through whatever path is easiest. We as framework designers don't have control over all of that path (e.g. pre-existing code that depends on Future), but we can still try to course correct to the best of our ability.

At the end of the day, we defined alleycats to be the project where unlawful instances go and cats-core to be the project where lawful and useful instances go. Monad[Future] is neither, since it is either unlawful or it is lawful but useless.

> Probably the main driver of FP in Scala is the power of traverse (as @tpolecat has often mentioned). I think losing the ability to traverse a List[Future[...]] would cost us one of the best "gateway drugs" for Cats.

While I agree that traverse is the answer to almost everyone's problems, I don't agree that people using Future specifically are going to come to cats to solve the problem of List[Future[A]] => Future[List[A]] precisely because there's already a function on Future, conspicuously also called traverse, which addresses this exact problem.
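
For reference, a sketch of the standard library functions being alluded to, with no cats dependency. Future.traverse maps a collection through a Future-returning function, and Future.sequence covers the List[Future[A]] => Future[List[A]] shape directly; the object and method names below are illustrative.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StdlibTraverse {
  // List[A] plus A => Future[B] gives Future[List[B]].
  def doubled(xs: List[Int]): Future[List[Int]] =
    Future.traverse(xs)(n => Future(n * 2))

  // List[Future[A]] gives Future[List[A]].
  def flipped(fs: List[Future[Int]]): Future[List[Int]] =
    Future.sequence(fs)

  // Runs both and returns the collected results, in input order.
  def demo(): (List[Int], List[Int]) = (
    Await.result(doubled(List(1, 2, 3)), 5.seconds),
    Await.result(flipped(List(Future(1), Future(2))), 5.seconds)
  )
}
```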

> I'm not sure we gain very much from removing future-based instances, except possibly a lower support burden.

I'm not sure it's so much about lower support burden as it is general framework design principles. At the end of the day, we have to subjectively decide where to draw lines on what is and is not part of cats. Many of these questions don't have objective answers. There might be objective evidence which points one way or another (as in this case), but how we synthesize and weigh that evidence is entirely a subjective process. There's almost no support burden for Monad[Future], so I don't think that should play a role in the decision making one way or another.

My argument is that Monad[Future] really doesn't belong in cats on mostly subjective grounds. If someone were to propose a similarly-problematic typeclass for something that isn't Future, I strongly suspect it would be shot down. As an example, I'm strongly of the opinion (like @johnynek) that Functor[Set] is valid in the category of scala objects, and yet it was relegated out of cats-core. Monad[Future] is considerably more problematic than Functor[Set] on almost every subjective and objective basis. I'm just trying to argue for consistent application of our standards while recognizing that those standards are somewhat subjective and always in flux.

tpolecat commented 5 years ago

I agree w/Daniel re: indemnifying ourselves or reducing our support burden … that's really not the point. All of us want users to have the best experience possible, and there is some disagreement on what that looks like. I will note that my opinions are colored by spending a lot of time in the Gitter channel and seeing beginners kind of marooned on Future island, unable to use cats stuff because it just doesn't really work. Maybe a good solution would be to leave things as-is but add a doc page on the limitations of Future and how to move to IO, since we have that conversation over and over.

Anyway I think this is probably kind of talked out. I certainly understand all the positions.

arosien commented 5 years ago

Removing Monad[Future] would really bug all of my clients who I've convinced to use cats, and thus it would bug me, transitively.

arosien commented 5 years ago

@tpolecat I'd be happy to help document the issues and solutions if you can help me collect them.