Iltotore / iron

Strong type constraints for Scala
https://iltotore.github.io/iron/docs/
Apache License 2.0
457 stars 43 forks source link

Allow refinement of polymorphic types in non-inlined methods / givens #174

Closed kyri-petrou closed 1 year ago

kyri-petrou commented 1 year ago

Hi there 👋 I've looked through the sourcecode and couldn't find anything implemented for this so I thought to open an feature request issue. Apologies if this is already possible but might have missed it!

Is your feature request related to a problem? Please describe.

In many cases, we want to create codecs / typeclasses for IronTypes (json, DB access, etc.). Let's take the circe decoder as an example from the iron-circe package:

inline given [A, B](using inline d: Decoder[A], inline c: Constraint[A, B]): Decoder[A :| B] =
  d.emap(_.refineEither)

The main issue here is that this only works with inline givens or methods, because the test method in Constraint is declared inlined.

So what's wrong with using an inline given/method? The problem with that is that it will inline the code wherever a Decoder[A :| B] is needed. In this example, the effect on compilation time / code size might be minimal, but there are 2 particularly worrying cases:

  1. Someone using new Decoder{ ... } on the RHS, since that would a new anonymous class to be created wherever a Decoder[A :| B] is required
  2. When the RHS contains multiple mappings, since that creates a new Function1 each time it's inlined

Describe the solution you'd like

It would be good if there was a way to derive a RuntimeConstraint[A, B] (in lack of better name) where the test / message methods are not defined as inlined:

trait RuntimeConstraint[A, C]:
  def test(value: A): Boolean
  def message: String

This should contain the runtime logic needed to test for a polymorphic type without requiring an inlined given / method. e.g., the Decoder given above could be re-written as:

given [A, B](using d: Decoder[A], c: RuntimeConstraint[A, B]): Decoder[A :| B] =
  val err = Left(c.message)
  d.emap(v => if c.test(v) then Right(v.assume[B]) else err)

Describe alternatives you've considered

Creating an explicit given for each IronType, but that becomes cumbersome very quickly in a large codebase:

inline def makeDecoder [A, B](using inline d: Decoder[A], inline c: Constraint[A, B]): Decoder[A :| B] = ??? 

given Decoder[String, Not[Empty]] = makeDecoder

I'd be happy to contribute if needed, but I'll need some pointers on how to approach this. I actually gave it a go with trying to implement this but couldn't get it to work unfortunately

Iltotore commented 1 year ago

The RuntimeConstraint sounds like a good idea to me but it is not easily doable.

AFAIK you cannot derive a RuntimeConstraint from a Constraint because this would cause a "deferred inline" error. You'd have to implement this typeclass for each constraint which is a burdensome and redundant.

Can you share the implementations you tried?

kyri-petrou commented 1 year ago

I think that as I started typing down what I tried, I managed to somewhat accidentally figure it out? This seems like something that would work right?

final class RuntimeConstraint[A, B](_test: A => Boolean, _message: String) {
  def test(value: A): Boolean = _test(value)
  def message: String         = _message
}

object RuntimeConstraint {

  inline given derived[A, B](using c: Constraint[A, B]): RuntimeConstraint[A, B] =
    new RuntimeConstraint[A, B](c.test(_), c.message)

}
kyri-petrou commented 1 year ago

I know this might be contradictory to the original issue, but in this case I think we do want to define the test method as inline. Looking at the generated code, if we use inline def test(value: A): Boolean = _test(value), we're able to take advantage of Function1's specialization and avoid boxing / unboxing of primitives 🎉.

Note that this is not an abstract inline method we can invoke it from non-inlined methods normally

Iltotore commented 1 year ago

What the generated bytecode looks like when using a given RuntimeConstraint instead of Constraint?

kyri-petrou commented 1 year ago

It's a bit hard to explain it well, but this is the behaviour from what I understand. Let's assume we have have a constraint as Greater[0] & Less[10]

Constraint will inline the conditions at every callsite (alongside everything else in the method). I'm guessing this is also why it always needs to be called by an inlined method, otherwise it can't inline the conditions. Essentially test(someInt) translates to

someInt > 0 && someInt < 10

RuntimeConstraint will create a proxy method such as the one below, which is essentially called wherever test(...) is called.

def someRandomNameTheCompilerGives(value: Int): Boolean = 
  value > 0 && value < 10

Now, there are a few things to unpack here. First, let's look at this code:

type SingleDigit = Greater[0] & Less[10]

def foo[A, B](value: A)(using c: RuntimeConstraint[A, B]): A :| B = 
  if c.test(value) then value.assume else ???

val someInt = 2
val a: Int :| SingleDigit = foo(someInt)
val b: Int :| SingleDigit = foo(someInt)

In this case, the compiler will create 2 instances of RuntimeConstraint, which means it'll generate 2 of those someRandomNameTheCompilerGives which are identical in every way except the name. This might seem dangerous, but I think it's still better than inlining the whole contents of foo on each callsite (which would be necessary with Constraint)

Where the benefit of RuntimeConstraint really shows is when we have the implicit in scope and doesn't need to be derived (e.g., given RuntimeConstraint[Int, SingleDigit] = RuntimeConstraint.derived). In this case, only 1 of these method is generated, and is reused in both invocations and everywhere this given is in scope. I think this is better than the current behaviour since (1) we're allowing the JVM to optimise this method (especially good for non-trivial assertions) and (2) we reduce the generated code size, but unfortunately requires the user to define it. I feel like this can benefit Newtypes a fair bit.

Perhaps the answer is for the derived method in my previous comment to be a def and not a given, and make the "auto derivation" available via an import.

object RuntimeConstraint:
  inline def derived[A, B](using c: Constraint[A, B]): RuntimeConstraint[A, B] =
    new RuntimeConstraint[A, B](c.test(_), c.message)

  object auto:
    inline implicit def derivedAuto[A, B](using c: Constraint[A, B]): RuntimeConstraint[A, B] = derived

As for Newtypes, I guess we can create one within RefinedTypeOpsImpl (same as with TypeTest that was added recently)

Hope that was clear (enough) because it's generally quite hard to explain 😓

Iltotore commented 1 year ago

Don't worry: I got it :stuck_out_tongue:

If I understand the problem correctly, it is especially useful when you use the same constraint multiple times.

I also get how reusing the same constraint for newtypes is useful: we could re-use the same proxy function each time but it will probably not work with apply, atleast not in a retrocompatible way.

Here is how apply is defined:

inline def apply(value: A :| C): T = value.asInstanceOf[T]

As you can see, there is no checking of the value: it is delegated to the implicit refinement conversion:

type Temperature = Int :| Positive
object Temperature extends RefinedTypeOps[Temperature]
val x: Int :| Positive= ???
val y: Int = ???

Temperature(x) //compiles to `x`
Tempature(y) //implicit refinement at compile-time

As for other methods:

inline def either(value: A)(using constraint: Constraint[A, C]): Either[String, T] =
    Either.cond(constraint.test(value), value.asInstanceOf[T], constraint.message)

Becomes:

private val constraint = RuntimeConstraint.derived[A, C]

inline def either(value: A): Either[String, T] =
    Either.cond(constraint.test(value), value.asInstanceOf[T], constraint.message)

but this will break retrocompatibility again:

I am currently sick so maybe I'm missing something. Do you have an idea about using RuntimeConstraint in RefinedTypeOpsImpl in a retrocompatible way?

kyri-petrou commented 1 year ago

Hmm that is indeed an interesting one - I'm not sure whether there's a way that it can be done without breaking binary compatibility, but I believe something along these lines of what's below could work. The RuntimeConstraint would be created implicitly on the creation of a newtype (which happens only once per newtype if my understanding is correct), and that could then be used in the methods.

final class RuntimeConstraint[A, B](_test: A => Boolean, val message: String) {
  inline def test(value: A): Boolean = _test(value)
}

object RuntimeConstraint {
  inline given derived[A, B](using c: Constraint[A, B]): RuntimeConstraint[A, B] =
    new RuntimeConstraint[A, B](c.test(_), c.message)
}

trait RefinedTypeOpsImpl[A, C, T](using rtc: RuntimeConstraint[A, C]):
  inline def either(value: A): Either[String, T] =
    Either.cond(rtc.test(value), value.asInstanceOf[T], rtc.message)

As for apply, if my understanding is correct, that's mostly meant to be used with values known during compile-time is that right? So using Constraint instead of RuntimeConstraint in those cases should be fine

kyri-petrou commented 1 year ago

I opened a draft PR just as a POC just to demonstrate the full code