getkyo / kyo

Toolkit for Scala Development
https://getkyo.io
Apache License 2.0

Do we really need layers? #203

Closed fwbrasil closed 2 months ago

fwbrasil commented 7 months ago

I've been thinking about the usability of Layer and, although it ended up being an interesting and more generic solution than the original in ZIO, I'm not sure its level of abstraction works well:

Instead of having a separate API, I think we could leverage Kyo's regular pending type to express the dependency graph and then provide a macro to derive the environment automatically:

trait Service1 {
  def a: Int
}
object Service1 {
  val init: Service1 < IOs = 
    IOs(Live())
  case class Live() extends Service1
}

trait Service2 {
  def b: Int
}
object Service2 {
  val init: Service2 < (Envs[Service1] & IOs) = 
    Envs[Service1].get.map(Live(_))
  case class Live(s1: Service1) extends Service2
}

// example computation
val io: Int < (Envs[Service1] & Envs[Service2] & IOs) = ???

// typically in a main file
val res: Int < IOs = 
  Envs.run(Service1.init, Service2.init)(io)

The init of each service would express the value it provides and its dependencies via regular pending Envs effects. The Envs.run method would need a macro to inspect the dependencies, build the graph, and provide the initialization of the services.
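As a rough illustration of the "inspect the dependencies, build the graph" step, here is a minimal sketch in plain Scala 3 rather than Kyo. The `Init` record and the `missing` helper are hypothetical names that are not part of Kyo; a real macro would read this information from the pending `Envs` types instead of from strings.

```scala
// Hypothetical model: each initializer declares what it provides and what it requires.
final case class Init(provides: String, requires: Set[String])

// Returns every requirement that none of the supplied initializers provides.
def missing(inits: List[Init], needed: Set[String]): Set[String] =
  val provided = inits.map(_.provides).toSet
  (needed ++ inits.flatMap(_.requires)) -- provided

val service1 = Init("Service1", Set.empty)
val service2 = Init("Service2", Set("Service1"))

@main def checkGraph(): Unit =
  // Both services supplied: nothing is missing.
  println(missing(List(service1, service2), Set("Service1", "Service2")))
  // Service2 supplied without Service1: Service1 is reported as missing.
  println(missing(List(service2), Set("Service2")))
```

A check along these lines is what would let `Envs.run` reject an incomplete set of `init` values at compile time.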

@johnhungerford @kitlangton I'd love to hear your thoughts on this if you have some time!

johnhungerford commented 7 months ago

I don't have a strong opinion on this matter. In general I think that Kyo would benefit from ways of simplifying the process of handling effects (and of Envs in particular). Layer still seems to me a potentially good solution, provided it is automated. Your suggestion of simply using Kyo effects directly also makes sense to me. Doing it that way is only an adequate solution, however, provided that Envs are consistently used for dependency-like effects. If someone were to make a library with an effect A < HttpClient, where handling HttpClient simply required providing some configuration, it would be nice to be able to provide that dependency using the same mechanism as other dependencies. I guess the right approach will depend on how Kyo ends up being used.

Some thoughts about your specific points:

Bottom line is I won't cry if you decide to take it out. We're in uncharted territory here, and it might make sense to keep things simple until it's a little clearer how people actually use Kyo.

kitlangton commented 7 months ago

ZLayer is essentially just ZIO with two extra features:

  1. Referential Equality — This one is a bit weird, but it's used for memoization. This is actually pretty important if you have a graph that shares an input at different levels, which is often the case.

    a -> b \
            -> c
         a /

    You probably don't want to evaluate a multiple times. A macro could solve this, but it would be necessary to provide the full set of dependencies to the macro. This would make it difficult to factor out smaller sections of the graph, e.g., val makeB = Envs.run(makeA)(io), and then later have val makeC = Envs.run(makeB, makeA)(io). In this case, makeA would get executed multiple times.

    There might be another way around this. But one of the best parts of using ZLayers is that you can safely decompose your initialization logic, rather than having to constantly maintain an easily-tangled mass of app initialization logic.

    One (very nicely factored) example of this chore is in the PFPS shopping cart repo. Everything needs to be organized into very carefully constructed, ad hoc "layers" (another example), which tend to be quite brittle and resistant to refactoring. As I mentioned, this is a really nicely factored example, but most "real world" apps have truly stultifyingly tangled Main.scala files, which tend to churn and succumb to entropy at a rapid clip.

    ZLayers saved me from that hell, which is why I was such a fan of them 😄. We should try to model a slightly more complicated example, with multiple tiers which share initializers, to test alternatives against. I certainly think there is room for simplification, and it would be great to eliminate a concept.

  2. Multiple Outputs — This one isn't as important, since we could just write the macro to understand tuples, but with ZLayer, both the input and the output (ZLayer[In1 & In2, E, Out1 & Out2]) are type-level maps. Not much to say beyond that 😄 The memoization thing is certainly the more desirable property here.
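To make the referential-equality point from item 1 concrete, here is a minimal sketch in plain Scala 3. It is not ZLayer's actual implementation: `Layer` and `build` are hypothetical stand-ins, and the memo table is simply a `java.util.IdentityHashMap` keyed by layer instance.

```scala
import java.util.IdentityHashMap

// Hypothetical stand-in for a layer: a name plus the layers it depends on.
final case class Layer(name: String, deps: List[Layer])

// Initializes a graph depth-first, running each distinct instance at most once.
// Instances are compared by reference, so structurally equal copies count as different layers.
def build(root: Layer): List[String] =
  val seen = new IdentityHashMap[Layer, java.lang.Boolean]()
  val log  = List.newBuilder[String]
  def go(layer: Layer): Unit =
    if !seen.containsKey(layer) then
      seen.put(layer, java.lang.Boolean.TRUE)
      layer.deps.foreach(go)
      log += s"INITIALIZING ${layer.name}"
  go(root)
  log.result()

// Shared reference: the same `a` appears at two levels and is initialized once.
val a      = Layer("a", Nil)
val shared = Layer("c", List(Layer("b", List(a)), a))

// Fresh copies on every use: `a` is initialized twice.
def makeA  = Layer("a", Nil)
val copied = Layer("c", List(Layer("b", List(makeA)), makeA))
```

Under this model, `build(shared)` logs `a` once, while `build(copied)` logs it twice, because each call to `makeA` yields a new instance that the identity-keyed table has not seen.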

fwbrasil commented 7 months ago

Thank you for the thoughtful replies! To be honest, I have little experience using ZIO so please bear with me :)

Doing it that way is only an adequate solution, however, provided that Envs are consistently used for dependency-like effects. If someone were to make a library with an effect A < HttpClient, where handling HttpClient simply required providing some configuration, it would be nice to be able to provide that dependency using the same mechanism as other dependencies.

That's an interesting perspective towards the generalization of layers. For instance, if we had a Configs effect, layers should be able to support it in addition to Envs. It seems the key is constraining layers to specific effects since it'd resolve the issue with potential misuse. You raise a good point about how the choice between a sophisticated generic layer mechanism and a more restricted solution depends on how people will actually use Kyo.

You probably don't want to evaluate a multiple times. A macro could solve this, but it would be necessary to provide the full set of dependencies to the macro. This would make it difficult to factor out smaller sections of the graph, e.g., val makeB = Envs.run(makeA)(io), and then later have val makeC = Envs.run(makeB, makeA)(io). In this case, makeA would get executed multiple times.

Reading ZIO's documentation, it seems memoization is only available when the provide is global. I'd say the example with two Envs.run would be the equivalent of two local provide calls, which don't have memoization. Is my understanding correct?

I can't see why memoization is needed to avoid initializing the same service multiple times if we need to support only the global case. This is how I think your example would be in code:

object A:
    val init: A < IOs = ???
object B:
    val init: B < (Envs[A] & IOs) = ???
object C:
    val init: C < (Envs[B] & Envs[A] & IOs) = ???

val io: Int < (Envs[A] & Envs[B] & Envs[C] & IOs) = ???
val res: Int < IOs =
    Envs.run(A.init, B.init, C.init)(io)

The Envs.run macro would have all the information of the dependency graph to find an initialization order that ensures each init is called once. The expansion of the macro could be something equivalent to:

val res: Int < IOs =
    Envs.run(A.init) {
        Envs.run(B.init) {
           Envs.run(C.init) {
               io
           }
        }
    }

We could have a more sophisticated expansion to initialize services in parallel for example. The main limitation would be circular dependencies but it seems ZIO also doesn't support that.
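A minimal sketch of how such an initialization order could be computed, written as ordinary runtime Scala 3 rather than as a macro. The `initOrder` function and the string-keyed dependency table are assumptions made for illustration; the real macro would work on types.

```scala
// Hypothetical dependency table: service name -> names of the services it requires.
// Returns an order in which every service comes after its dependencies,
// or a description of the first circular dependency found.
def initOrder(deps: Map[String, Set[String]]): Either[String, List[String]] =
  def visit(n: String, path: List[String], done: List[String]): Either[String, List[String]] =
    if done.contains(n) then Right(done) // already scheduled: each init appears once
    else if path.contains(n) then
      Left(s"circular dependency: ${(n :: path).reverse.mkString(" -> ")}")
    else
      deps.getOrElse(n, Set.empty[String]).toList.sorted
        .foldLeft[Either[String, List[String]]](Right(done)) { (acc, d) =>
          acc.flatMap(visit(d, n :: path, _))
        }
        .map(_ :+ n) // schedule n only after all of its dependencies

  deps.keys.toList.sorted.foldLeft[Either[String, List[String]]](Right(Nil)) { (acc, n) =>
    acc.flatMap(visit(n, Nil, _))
  }

val abc   = Map("A" -> Set.empty[String], "B" -> Set("A"), "C" -> Set("A", "B"))
val cycle = Map("A" -> Set("B"), "B" -> Set("A"))
```

Here `initOrder(abc)` yields `Right(List("A", "B", "C"))`, matching the nesting shown above, and `initOrder(cycle)` yields a `Left` describing the cycle.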

Something interesting about this potential design is that the init methods can have arbitrary effects, which isn't possible with layers.

johnhungerford commented 7 months ago

Something interesting about this potential design is that the init methods can have arbitrary effects, which isn't possible with layers.

Layers currently do allow arbitrary effects, since they have both In and Out type parameters. A layer with an In type of Envs[X] and an Out type of IOs & Aborts[Y] would provide the dependency using both IOs and Aborts, which would be added to the resulting effect.

Again I think your approach with init makes sense, and it looks to me like you have in mind a way to construct the dependency graph which should reproduce the behavior of ZLayer, which I think should be the goal. I think it makes sense to proceed in this way and see if it turns out a more generic approach is needed.

In addition, it occurs to me it might be preferable not to provide a more generic mechanism for effect handling, precisely because that would encourage developers to roll their own effects rather than reuse the existing ones. I think library developers should be encouraged to use A < Envs[LibService] rather than their own effect A < LibService that just wraps or reimplements Envs under the hood.

fwbrasil commented 7 months ago

In addition, it occurs to me it might be preferable not to provide a more generic mechanism for effect handling, precisely because that would encourage developers to roll their own effects rather than reuse the existing ones. I think library developers should be encouraged to use A < Envs[LibService] rather than their own effect A < LibService that just wraps or reimplements Envs under the hood.

Agree, and even if the library wants to use A < LibService, LibService could be a type alias with Envs and other effects.

@kitlangton could you check my understanding of ZIO's functionality? I wonder if I could be missing something.

kitlangton commented 7 months ago

Reading ZIO's documentation, it seems memoization is only available when the provide is global.

Ah, that's actually not the case. The memoization is built into the ZLayer machinery itself, and the macro simply automates the awful task of writing out the manual arrow-like layer combinators at compile time. Here's a rather contrived example showing that off:

package layers
import zio.*

// First, let's contrive some services.
// Each layer will log when it's initialized.

trait App:
  def run =
    ZIO.debug("Running App")

object App:
  val live = ZLayer {
    for
      _ <- ZIO.debug("INITIALIZING APP")
      _ <- ZIO.service[Users]
      _ <- ZIO.service[Posts]
    yield new App {}
  }

trait Database
object Database:
  val live = ZLayer {
    ZIO.debug("INITIALIZING DATABASE").as(new Database {})
  }

trait Users
object Users:
  val live =
    ZLayer {
      for
        _        <- ZIO.debug("INITIALIZING USERS")
        database <- ZIO.service[Database]
      yield new Users {}

    }

trait Posts
object Posts:
  val live = ZLayer {
    for
      _        <- ZIO.debug("INITIALIZING POSTS")
      database <- ZIO.service[Database]
    yield new Posts {}
  }

object ProvideExample extends ZIOAppDefault:
  def run =
    ZIO
      .serviceWithZIO[App](_.run)
      .provide(
        Database.live,
        Users.live,
        Posts.live,
        App.live
      )

  // INITIALIZING DATABASE
  // INITIALIZING USERS
  // INITIALIZING POSTS
  // INITIALIZING APP
  // Running App

// We get the same result even when we factor out the layers into separate values
object RefactoredExample extends ZIOAppDefault:
  val usersLayer = Database.live >>> Users.live
  val postsLayer = Database.live >>> Posts.live
  val appLayer   = (usersLayer ++ postsLayer) >>> App.live

  def run =
    ZIO
      .serviceWithZIO[App](_.run)
      .provide(appLayer)

  // INITIALIZING DATABASE
  // INITIALIZING USERS
  // INITIALIZING POSTS
  // INITIALIZING APP
  // Running App

// This wouldn't work if our database layer weren't a val
object DefLayerExample extends ZIOAppDefault:
  def databaseLayer = ZLayer {
    ZIO.debug("INITIALIZING DATABASE").as(new Database {})
  }

  val usersLayer = databaseLayer >>> Users.live
  val postsLayer = databaseLayer >>> Posts.live
  val appLayer   = (usersLayer ++ postsLayer) >>> App.live

  def run =
    ZIO
      .serviceWithZIO[App](_.run)
      .provide(appLayer)

  // INITIALIZING DATABASE <-- Uh! Two!
  // INITIALIZING DATABASE <-- Oh! DBs!
  // INITIALIZING POSTS
  // INITIALIZING USERS
  // INITIALIZING APP
  // Running App

So, coming back to this. While we can certainly write a macro that memoizes our effects, we can only do so for those which are directly passed into the macro.

object A:
    val init: A < IOs = ???
object B:
    val init: B < (Envs[A] & IOs) = ???
object C:
    val init: C < (Envs[B] & Envs[A] & IOs) = ???

val io: Int < (Envs[A] & Envs[B] & Envs[C] & IOs) = ???
val res: Int < IOs =
    Envs.run(A.init, B.init, C.init)(io)

In large applications it's really helpful to be able to factor out big parts of one's internal dependency graph, like this.

  val databaseLayer = ZLayer.make[Database](AppConfig.live, Database.live)
  val usersLayer = ZLayer.make[Users](databaseLayer, Users.live)
  val postsLayer = ZLayer.make[Posts](databaseLayer, Posts.live, OtherService.live)
  val appLayer   = ZLayer.make[App](usersLayer, postsLayer, otherServiceLayer, etceteraLayer)

Of course, maybe we can come up with another, more elegant solution—ZLayers aren't perfect. But the memoization is what allows them to be composed in isolation, which is quite a nice property to have.

I do wonder how many people are accidentally initializing multiple instances of their services—as this ZLayer behavior isn't very well documented, and I'd imagine many ZIO users don't know the consequences of defining layers as defs.

fwbrasil commented 7 months ago

Thanks for the clarification! I knew I was missing something 🤦🏽 Indeed memoization is a necessity! I'm also not very convinced anymore that using the pending type itself will simplify things much, and a dedicated API might be able to provide a better user experience. Maybe we could try to simplify layers? An initial attempt at limiting it to Envs:

    class Layer[-Pending, +Resolved]:
        def add[R, P, R2, P2](l: Layer[P, R])(
            using
            P => Pending | P2,
            Resolved <:< R2
        ): Layer[P2, R & R2] = ???
    end Layer

    object Layers:
        // should provide memoization
        def init[P, R, S](f: R < (Envs[P] & S)): Layer[P, R] < S = ???

        def run[P, R, T, S](l: Layer[P, R])(v: T < (Envs[R] & S))(
            using @implicitNotFound("Layer still has pending elements: ${P}") ev: Any => P
        ): T < S =
            ???
    end Layers

    case class Service1()
    case class Service2(s2: Service1)

    val l2                                  = Layers.init(Envs[Service1].use(Service2(_))).pure
    val l1: Layer[Any, Service1]            = Layers.init(Service1()).pure
    val l3: Layer[Any, Service1 & Service2] = l2.add(l1)

    val a: (Service1, Service2) < (Envs[Service1] & Envs[Service2]) =
        zip(Envs[Service1].get, Envs[Service2].get)

    // wrong: should have no pending effects
    val b: (Service1, Service2) < Envs[Service2] = Layers.run(l3)(a)

I think we might even be able to avoid a macro by tracing the dependencies in layer.add via tags?
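A minimal runtime sketch of that idea, assuming `scala.reflect.ClassTag` as a stand-in for Kyo's tags. `TracedLayer`, `provides`, and `needs` are hypothetical names; the sketch only tracks which services are pending and resolved, and does not construct any values.

```scala
import scala.reflect.ClassTag

// Hypothetical runtime model: a layer records which services it still needs
// and which ones it can provide, identified by their runtime classes.
final case class TracedLayer(pending: Set[Class[?]], resolved: Set[Class[?]]):
  // Combining two layers resolves any pending service that either side provides.
  def add(that: TracedLayer): TracedLayer =
    val all = resolved ++ that.resolved
    TracedLayer((pending ++ that.pending) -- all, all)

object TracedLayer:
  // A layer that provides R and needs nothing.
  def provides[R](using r: ClassTag[R]): TracedLayer =
    TracedLayer(Set.empty, Set(r.runtimeClass))
  // A layer that provides R and needs P.
  def needs[P, R](using p: ClassTag[P], r: ClassTag[R]): TracedLayer =
    TracedLayer(Set(p.runtimeClass), Set(r.runtimeClass))

final case class Service1()
final case class Service2(s1: Service1)

val l1 = TracedLayer.provides[Service1]
val l2 = TracedLayer.needs[Service1, Service2]
val l3 = l2.add(l1) // Service1 is no longer pending after the merge
```

With this bookkeeping, a `run` could refuse to start while `pending` is non-empty, although the check would happen at runtime rather than at compile time.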

fwbrasil commented 5 months ago

/bounty $500

algora-pbc[bot] commented 5 months ago

## 💎 $500 bounty • Kyo

### Steps to solve:

  1. Start working: Comment /attempt #203 with your implementation plan
  2. Submit work: Create a pull request including /claim #203 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to getkyo/kyo!


| Attempt | Started (GMT+0) | Solution |
| --- | --- | --- |
| 🟢 @hearnadam | May 25, 2024, 5:24:40 PM | #438 |
| 🟢 @kitlangton | | #438 |

fwbrasil commented 5 months ago

A solution for this issue should include:

  1. Memoization of layer initialization
  2. Integration with Envs
  3. Safe and easy-to-use API without requiring manual vertical/horizontal composition
  4. Support other effects when initializing layer values

hearnadam commented 5 months ago

/attempt #203

@kitlangton and I plan to split this work

algora-pbc[bot] commented 5 months ago

💡 @hearnadam and @kitlangton submitted a pull request that claims the bounty. You can visit your bounty board to reward.

algora-pbc[bot] commented 4 months ago

🎉🎈 @kitlangton has been awarded $250! 🎈🎊

algora-pbc[bot] commented 4 months ago

🎉🎈 @hearnadam has been awarded $250! 🎈🎊