typelevel / Laika

Site and E-book Generator and Customizable Text Markup Transformer for sbt, Scala and Scala.js
https://typelevel.org/Laika/
Apache License 2.0

reducing use of case classes for improved binary compatibility #482

Closed · jenshalm closed 1 year ago

jenshalm commented 1 year ago

This will be the primary focus for M3.

Existing public case classes fall into one of the following four categories:

1. Case classes that can remain unchanged

These are classes which are commonly used in pattern matches in user code and/or are very unlikely to evolve.

This applies to Laika's document AST, comprising more than 100 case classes. Most extension points are based on partial functions that match on AST nodes, so retaining their unapply is crucial. The structure of the nodes is usually very simple (2 or 3 properties in many cases) and has historically rarely evolved, so sealing these types for the 1.x lifetime seems reasonable. In the rare case that adaptations are needed, a new type could be introduced instead (and the existing one deprecated), since the Span and Block base types are not sealed.
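
As a brief illustration of why retaining unapply matters, a typical extension is a partial function over AST nodes, roughly like the following sketch (Emphasized and Strong mirror Laika's span nodes, but the surrounding rule API is simplified and assumed here):

import laika.ast._

// User code destructures AST nodes via their unapply methods; a sealed,
// stable node shape keeps such matches working across 1.x releases.
val emphasisToStrong: PartialFunction[Span, Span] = {
  case Emphasized(content, opts) => Strong(content, opts)
}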

The second group of classes in this category are simple ADTs where the members are often either case objects or case classes with a single property.
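
As a hypothetical illustration of that shape (the names below are invented, not actual Laika types):

// Leaves are case objects or single-property case classes, so this shape
// is very unlikely to need new properties later.
sealed trait Theme
object Theme {
  case object Light extends Theme
  case object Dark  extends Theme
  final case class Custom(name: String) extends Theme
}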

2. Case classes that become abstract types with private case class implementation

PRs: #488, #490, #491, #498, #499

For the types in this group the need to add properties later is highly likely, as most of them are public configuration API.

While the idea to use the pattern documented here seemed attractive, it does not work for Scala 2.12 (changes in Scala 3 had only been backported to 2.13) and also somewhat messes with user expectations (by providing a case class that cannot be used in pattern matches).

On the other hand, these types naturally support structural equality and the library should support this. Since there are only 26 types in this group and the number is unlikely to grow dramatically, doing a bit of manual plumbing once seems acceptable. The proposed structure for these types is as suggested by @armanbilge & @satorg:

sealed abstract class Person {
  def name: String
  def age: Int
  def withName(name: String): Person
  // ...
}

object Person {

  private final case class Impl(name: String, age: Int) extends Person {
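    // renders toString as "Person(...)" instead of "Impl(...)"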
    override def productPrefix = "Person"

    def withName(name: String): Person = copy(name = name)
    // ...
  }

  def apply(name: String, age: Int): Person = Impl(name, age)
}
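
For illustration, usage of this Person example would then look like the following sketch:

// Construction goes through the companion apply; updates go through the
// with* methods, so Impl can gain properties without breaking clients.
val person  = Person("Jane", 42)
val renamed = person.withName("Janet")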

Full list of types falling into this category:

(**) These types will move to a different package in the final milestone.

3. Case classes that become regular classes without supporting structural equality

PRs: #483, #484, #487

This is a small group of types which were never good candidates for being case classes in the first place. They either don't support structural equality at all (e.g. because they mostly capture lambdas) or they support it only in a way that is brittle and expensive (e.g. DocumentTree).

Full list of types falling into this category:

4. Case classes which do not need to be public

These cases have already been addressed in M2. Since they are no longer public API they can safely remain case classes.

armanbilge commented 1 year ago

While the idea to use the pattern documented here seemed attractive, it does not work for Scala 2.12 (changes in Scala 3 had only been backported to 2.13) and also somewhat messes with user expectations (by providing a case class that cannot be used in pattern matches).

I have other reservations about that pattern as well, see https://github.com/scala/docs.scala-lang/pull/2788#issuecomment-1536619135.

jenshalm commented 1 year ago

Yes, I hadn't thought of that aspect.

Let me know if you spot anything else in my proposal that might be problematic. This work deals with commonly used configuration types, so users of sbt-typelevel will be affected, too. If you think this looks okay, I would be fine with going through the boring series of PRs without reviews.

armanbilge commented 1 year ago

The proposed structure for these types is:

There is an alternative way you can do this that could save you the plumbing for toString, equality, and hashing.

Where you would normally have:

case class Foo(bar: Bar, baz: Baz)

You can replace it with something like

sealed abstract class Foo {
  def bar: Bar
  def withBar(bar: Bar): Foo
  // ...
}

object Foo {
  def apply(bar: Bar, baz: Baz): Foo = FooImpl(bar, baz)
}

private final case class FooImpl(bar: Bar, baz: Baz) extends Foo {
  override def productPrefix = "Foo"
  def withBar(bar: Bar) = copy(bar = bar)
  // ...
}

That way the case class is still doing all the hard work, it's just hidden from the user.
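
Concretely, the hidden case class still provides structural equality and a friendly toString, along these lines (bar and baz stand for arbitrary values of the assumed Bar and Baz types):

val a = Foo(bar, baz)
val b = Foo(bar, baz)
a == b      // true: equals and hashCode come from the hidden case class
a.toString  // "Foo(...)" rather than "FooImpl(...)", via the productPrefix override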

jenshalm commented 1 year ago

Oh, that looks neat. Does any Typelevel project use that pattern? If there are no downsides to this approach, I think I would much prefer it over my initial idea.

armanbilge commented 1 year ago

Good question, not sure. I learned this pattern from @satorg in https://github.com/http4s/http4s/pull/6544#issuecomment-1297647842.

Edit: FWIW I'm not aware of any downsides, this would be my recommended approach. If I ever get re-motivated to do this de-case-classification in e.g. Feral then I would do it like this.

jenshalm commented 1 year ago

Good to know, thanks a lot for these pointers. Seems like it would save a lot of plumbing. I'll change the ticket accordingly tomorrow.

armanbilge commented 1 year ago

This applies to Laika's document AST, comprising more than 100 case classes. Most extension points are based on partial functions that match on AST nodes, so retaining their unapply is crucial. The structure of the nodes is usually very simple (2 or 3 properties in many cases) and has historically rarely evolved, so sealing these types for the 1.x lifetime seems reasonable. In the rare case that adaptations are needed, a new type could be introduced instead

If you're confident about lack of evolution, then this is perfectly reasonable.

If you really wanted to double down on compatibility for these classes as well, you could use a technique similar to the one described above and manually add unapply methods to the companion objects. ScalaMeta recently introduced a new pattern to support evolution of unapply.


object Foo {
  object Initial {
    def unapply(foo: Foo): Option[(Bar, Baz)] = ???
  }

  object After_1_1_0 {
    def unapply(foo: Foo): Option[(Bar, Baz, Qux)] = ???
  }
}
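
Match sites then opt into a specific extractor version, roughly like this sketch (foo being any Foo value, and Qux standing for a property added in a hypothetical 1.1.0 release):

// Code compiled against the initial shape keeps working after Qux is added,
// because each versioned extractor is frozen.
foo match {
  case Foo.Initial(bar, baz) => ???
}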

satorg commented 1 year ago

@jenshalm

If there are no downsides to this approach, I think I would much prefer it over my initial idea.

Apparently, there could be a downside if someone relies on Java reflection for such a class. E.g.:

assert(Foo(...).getClass.getName.endsWith("Foo"))

can produce weird results in that case. However, I believe that making such assumptions can be considered an anti-pattern in Scala.
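
For illustration, with the companion-nested Impl encoding from above, the runtime class is the private implementation (package name hypothetical):

// Name-based reflection sees the hidden implementation class:
Person("Jane", 42).getClass.getName
// e.g. "com.example.Person$Impl" rather than a name ending in "Person"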

satorg commented 1 year ago

@armanbilge

ScalaMeta recently introduced a new pattern to support evolution of unapply.

Oh, that's interesting. How exactly is it supported by ScalaMeta, could you suggest a link please? Thx!

armanbilge commented 1 year ago

Oh, that's interesting. How exactly is it supported by ScalaMeta, could you suggest a link please? Thx!

@satorg https://scalameta.org/docs/trees/guide.html#with-versioned-constructor-like-matchers

jenshalm commented 1 year ago

@satorg I think I'm fine with breaking Java reflection. This also feels somewhat unlikely to be a common pattern for users dealing with simple configuration types.

@armanbilge thanks for the pointer regarding the evolution of unapply. I think for the AST I'd prefer to stick with sealing the APIs. Past experience suggests this should work; many types are unchanged since version 0.1 in 2012! There is also the mentioned option to introduce new AST nodes if needed.

Are the small changes you introduced compared to the original example deliberate (trait vs. abstract class, and Impl nested in the companion vs. being a top-level type)? I assume both variants should work the same way?

armanbilge commented 1 year ago

Are the small changes you introduced compared to the original example deliberate (trait vs. abstract class, and Impl nested in the companion vs. being a top-level type)? I assume both variants should work the same way?

Yeah, those are just matters of taste/style and shouldn't be significant in practice. Abstract classes have a slight performance edge, and the nested Impl is probably better from a private-scoping perspective.

jenshalm commented 1 year ago

Completed in 8 PRs, which are linked in the description above.