PostgreSQL Range Types - Githubissues

Hollo community, I would like to introduce my vision for PostgreSQL range types implementation in doobie since it was requested in this issue but not implemented yet https://github.com/tpolecat/doobie/issues/936

I hope together we can improve this implementation and release the Range Types support into life

Range types

The following range types are supported, and map to doobie generic class:

the int4range schema type maps to Range[Int]
the int8range schema type maps to Range[Long]
the numrange schema type maps to Range[Float] | Range[Double] | Range[BigDecimal]
the daterange schema type maps to Range[java.time.LocalDate]
the tsrange schema type maps to Range[java.time.LocalDateTime]
the tstzrange schema type maps to Range[java.time.OffsetDateTime]

case class Range[T](lowerBound: Option[T], upperBound: Option[T], edge: Edge)

To control the inclusive and exclusive bounds according to the PostgreSQL specification you need to use a special Edge enumeration when creating a Range:

object Edge {
  case object `(_,_)` extends Edge
  case object `(_,_]` extends Edge
  case object `[_,_)` extends Edge
  case object `[_,_]` extends Edge
  case object `empty` extends Edge
}

In the text form of a range, an inclusive lower bound is represented by '[' while an exclusive lower bound is represented by '('. Likewise, an inclusive upper bound is represented by ']', while an exclusive upper bound is represented by ')'.

Note: the range types mappings are defined in a different object (rangeimplicits). To enable it you must import them explicitly:

import doobie.postgres.rangeimplicits._

To create for example custom implementation of Range[Byte] you can use the public method which declared in the following package doobie.postgres.rangeimplicits:

def rangeMeta[T](sqlRangeType: String)(implicit D: RangeBoundDecoder[T], E: RangeBoundEncoder[T]): Meta[Range[T]]

The main requirements is to need to implement the RangeBoundDecoder[Byte] end RangeBoundEncoder[Byte] which are essentially:

type RangeBoundDecoder[T] = String => T
type RangeBoundEncoder[T] = T => String

For a Range[Byte], the meta and bounds encoder and decoder would appear as follows:

import doobie.postgres.rangeimplicits._
import doobie.postgres.types.Range.{RangeBoundDecoder, RangeBoundEncoder}

implicit val byteBoundEncoder: RangeBoundEncoder[Byte] = _.toString
implicit val byteBoundDecoder: RangeBoundDecoder[Byte] = _.toByte

implicit val byteRangeMeta: Meta[Range[Byte]] = rangeMeta[Byte]("int4range")

...
val int4rangeWithByteBoundsQuery = sql"select '[-128, 127)'::int4range".query[Range[Byte]]

Hello @tomohavvk, thank you for your thorough work! I believe the support for range types will be a fantastic addition to Doobie!

However, I would like to discuss some potential pitfalls I've found out in the PR (if I may).

1. Unintentional implicit conversions

Typedefs RangeBoundEncoder and RangeBoundDecoder followed by implicit function values of those types are nothing but disguised implicit conversions. Therefore if a user imports those implicits via import doobie.postgres.rangeimplicits._, the following behavior gets enabled automatically in the scope (scala 2.13.13 repl):

scala> type RangeBoundEncoder[T] = T => String
type RangeBoundEncoder

scala> implicit val intBoundEncoder: RangeBoundEncoder[Int] = _.toString
val intBoundEncoder: RangeBoundEncoder[Int] = $Lambda/0x000000012853cc00@3caa4d85

scala> def foo(str: String) = println(s"supposed to be a string: $str")
def foo(str: String): Unit

scala> foo(123) // <-- passing an `Int` into a function that takes `String` !
supposed to be a string: 123

I.e. all the provided conversions to and from String become implicit, which is definitely not what most of users mean.

Moreover, those implicits are not really necessary out there, compare please:

version with implicits:

type RangeBoundEncoder[T] = T => String
type RangeBoundEncoder[T] = T => String
implicit val intBoundEncoder: RangeBoundEncoder[Int]     = _.toString
implicit val longBoundEncoder: RangeBoundEncoder[Long]   = _.toString
implicit val floatBoundEncoder: RangeBoundEncoder[Float] = _.toString
implicit val intBoundDecoder: RangeBoundDecoder[Int]     = java.lang.Integer.valueOf(_)
implicit val longBoundDecoder: RangeBoundDecoder[Long]   = java.lang.Long.valueOf(_)
implicit val floatBoundDecoder: RangeBoundDecoder[Float] = java.lang.Float.valueOf(_)

implicit val intRangeMeta: Meta[Range[Int]]     = rangeMeta("int4range")
implicit val longRangeMeta: Meta[Range[Long]]   = rangeMeta("int8range")
implicit val floatRangeMeta: Meta[Range[Float]] = rangeMeta("numrange")

versus a more straightforward version without conversion implicits:

implicit val intRangeMeta: Meta[Range[Int]]     = rangeMeta("int4range")(_.toString, java.lang.Integer.valueOf)
implicit val longRangeMeta: Meta[Range[Long]]   = rangeMeta("int8range")(_.toString, java.lang.Long.valueOf)
implicit val floatRangeMeta: Meta[Range[Float]] = rangeMeta("numrange")(_.toString, java.lang.Float.valueOf)

In other words, the implicit conversions there do not seem simplifying anything (nor improve clarity IMO) and in fact require even more lines of code to put everything together eventually.

2. Inexact type correspondence

Speaking of the numrange type:

implicit val floatRangeMeta: Meta[Range[Float]]                   = rangeMeta("numrange")
implicit val doubleRangeMeta: Meta[Range[Double]]                 = rangeMeta("numrange")
implicit val bigDecimalRangeMeta: Meta[Range[BigDecimal]]         = rangeMeta("numrange")

AFAIK, numrange is a built-in range type for the numeric scalar. So yes, the best corresponding JVM type is BigDecimal indeed. However, providing Meta for ranges of Double and Float by default might not be very good idea – a user may want to create a custom range type for floating-point numbers and then they can run into ambiguous implicit problem in that case (from here):

CREATE TYPE floatrange AS RANGE (subtype = float8, ...);

Then somewhere in user code:

implicit val doubleRangeMeta: Meta[Range[Double]] = rangeMeta("floatrange")

So I would probably suggest to hold on creating too many type shortcuts from the beginning. Ultimately, if there is demand for that from users, it will be easier to add them later (comparing to deprecating and removing stuff in a case of complaints).

3. Over-backticking

It is solely my personal perceptions, but does using a lot of back-ticked identifiers in the code really improves it? I feel it may make code more obscure instead. Especially because not all users like to see ASCII-art in their code. For example, (_,_) – this one might look a bit obscene, don't you think?

I feel that more straightforward and less fancy naming could be more appreciated, e.g.:

object Edge {
  case object InclIncl extends Edge
  case object InclExcl extends Edge
  case object ExclIncl extends Edge
  case object ExclExcl extends Edge
  case object Empty extends Edge
}

Also, the suggested approach allows to construct a Range like this:

Range(Some(1), Some(2), Edge.empty)

Which I am not sure about if it makes a lot of sense. In other words, modeling of the Range type may deserve additional thought.

Hello @satorg, thank you very much for the constructive comments, which have truly improved the implementation. I supported each of your suggestions, and here's what we have at the moment:

1. Unintentional implicit conversions

I completely removed any mentions of RangeBoundEncoder and RangeBoundDecoder. Now, there are no implicits; everything needs to be passed explicitly:

def rangeMeta[T](sqlRangeType: String)(encode: T => String, decode: String => T): Meta[Range[T]]

2. Inexact type correspondence

Done. numrange only maps to Range[BigDecimal]. We can easily extend it with more numeric types in the future(if needed)

3. Over-backticking

It is solely my personal perceptions, but does using a lot of back-ticked identifiers in the code really improves it? I feel it may make code more obscure instead. Especially because not all users like to see ASCII-art in their code. For example, (,) – this one might look a bit obscene, don't you think?

I thought about it :) but it didn't stop me because in my opinion it was the easiest way to understand, when you actually look at those brackets that are in PostgreSQL and you don't need to thinking about one more mapping, however, I agree that such code is rare and can be annoying. Therefore, I gladly rewrote it and now no one will see (_,_). Now it looks like:

object Edge {
  case object ExclExcl extends Edge
  case object ExclIncl extends Edge
  case object InclExcl extends Edge
  case object InclIncl extends Edge
}

4. Empty range

Also, the suggested approach allows to construct a Range like this: Range(Some(1), Some(2), Edge.empty) Which I am not sure about if it makes a lot of sense. In other words, modeling of the Range type may deserve additional thought.

I redesigned the range type with this in mind, now the range hierarchy looks like this:

sealed trait Range[+T]
case object EmptyRange extends Range[Nothing]
case class NonEmptyRange[T](lowerBound: Option[T], upperBound: Option[T], edge: Edge) extends Range[T]

We can still create a range like: NonEmptyRange[Int](None, None, ExclExcl) which can surprise at first glance, but this is a valid case from PostgreSQL's point of view and we need to support it:

    nr     | isempty | lower | upper 
-----------+---------+-------+-------
 (,)       | f       |       |      
 [3,)      | f       |     3 |      
 (,5)      | f       |       |     5
 [1.1,2.2) | f       |   1.1 |   2.2
 empty     | t       |       |      
 [1.7,1.7] | f       |   1.7 |   1.7

If there are any suggestions with improvements, I will be glad to consider them.

Thanks for the fantastic PR description @tomohavvk and your review @satorg. I'll try and get this merged soon and be part of 1.0-RC6

tpolecat / doobie

PostgreSQL Range Types #2008

Range types

In the text form of a range, an inclusive lower bound is represented by '[' while an exclusive lower bound is represented by '('. Likewise, an inclusive upper bound is represented by ']', while an exclusive upper bound is represented by ')'.

1. Unintentional implicit conversions

2. Inexact type correspondence

3. Over-backticking

1. Unintentional implicit conversions

2. Inexact type correspondence

3. Over-backticking

4. Empty range