oyvindberg / typo

Typed Postgresql integration for Scala. Hopes to avoid typos
https://oyvindberg.github.io/typo/
MIT License
99 stars 9 forks source link

Request: Allow for generating string enums for `text` columns #124

Open kolemannix opened 1 month ago

kolemannix commented 1 month ago

The typo generated string enums are really nice. They would eliminate our need to use enumeratum, and of course have the very fast inlined-implicit doobie definitions.

We make a lot of use of text columns in postgres which are enforced to be string enums at the application level. We find that adding (and removing) members in postgres enums is pretty difficult so enforcing at the application level but not in the DB schema is a sweet spot.

I'd love to specify that a given column is a string enum (perhaps in something akin to type overrides) and have typo generate that enum and use that type for the column.

I think this could have other applications for columns where the application enforces a narrower set of values than the database schema itself, perhaps numeric enums as well where the type is int but the values are mapped as an enum in the system.

oyvindberg commented 1 month ago

I think we should definitely do some more work here. It's low-hanging fruit with a lot of upside.

These are the three variants I see of enums:

postgres-level enums

create type myenum as enum ('a', 'b', 'c');

These are already supported. If you want to change the set of values you can do that today with a schema snapshot rewrite (see below).

fake enums

You want text or similar column(s) to appear as a closed enum type. This will be a lie, in that postgres won't be aware of it. nevertheless, it's likely more flexible than working with postgres enums. This seems to be the case you're asking for here.

This can almost be done with a schema snapshot rewrite. You can add a db.StringEnum and change the db.Type of the relevant columns. See example below

The only thing we need here is a hint to typo that it should not generate type assertions in the generated SQL. For instance when inserting a myenum mentioned above, you typically cast the value insert into t(myenum) values ('a'::myenum). These fake enums should obviously not be cast to a type postgres doesn't know about.

open enums

at $WORK we use the pattern where enums are represented as tables, and columns which uses the enums have a FK to the enum value table. This allows more flexible changes to the set of enum values, and postgres is aware of it.

This currently doesn't have any special support in typo, but I have a branch I intend to finish soon. These will look something like this, which explicitly handles that you have known values and a possibility for unknown values to appear.

/** Open enum
  *   - A
  *   - B
  *   - C
  */
sealed abstract class MyEnum(val value: String)

object MyEnum {
  def apply(str: String): MyEnum =
    ByName.getOrElse(str, Unknown(str))

  case object A extends MyEnum("A")
  case object B extends MyEnum("B")
  case object C extends MyEnum("C")
  case class Unknown(override val value: String) extends MyEnum(value)

  val All: List[MyEnum] = List(A, B, C)
  val ByName: Map[String, MyEnum] = All.map(x => (x.value, x)).toMap
}

Database schema snapshot rewrite.

Typo starts by fetching the entire database schema. Once we have that in memory, you're free to change it into anything you want before typo starts generating code. This is done through Options#rewriteDatabase. Here is an example:

Options(
  // ...
  rewriteDatabase = metadb => {
    val employed = db.RelationName(Some("myschema"), "employed")
    metadb.copy(
      enums = metadb.enums ++ List(
        db.StringEnum(employed, List("YES", "NO"))
      ),
      relations = metadb.relations.map {
        case (name, lazyTable) =>
          val rewrittenTable = lazyTable.map { case table: db.Table =>
            name -> table.copy(
              cols = table.cols.map {
                case col if col.name.value == "is_employed" =>
                  col.copy(
                    tpe = db.Type.EnumRef(employed),
                    udtName = Some(employed.value),
                  )
                case col => col
              }
            )
          case notTable => notTable
          }
          (name, lazyTable)
      }
    )
  }
)

Future

So the work to support fake enums is to add a Boolean to db.StringEnum which decides if we should add casts (SqlCast) Open enums are almost there, it's pending a refactoring I want to do first.

So what's missing then is syntax and documentation. If you have any time to invest, any of (fake enums, sketching some more high-level syntax in Options than the full database rewrite thing, then sketching docs for this) would be fantastic.

kolemannix commented 1 month ago

That's a good taxonomy. Yes at $WORK we use "fake enums" for the flexibility outlined above. Open enums are also a great choice of course especially if the data changes or the DB is a point of coordination for the company, which in practice it often does become.

I had the thought of altering the metadb to fake some enums as well but was concerned about the actual type in the jdbc layer, and about rewriting the tables to use it, which you've nicely demonstrated above. I did not realize that a text will work; we currently use doobie.postgres.pgEnumString, which by virtue of its existence I assumed did something different than StringMeta.timap.

For mapping to Options, I wonder if this is an enhancement to typeOverride; but a more 'checked' variation on it where instead of providing an arbitrary type name we provide a reference to our fake enum, which could be supplied inline or separately in Options. Typo would then of course require that that relation was actually a text, or just in general that the actual relation was compatible with the provided 'fake' type.

Very very rough sketch:

trait TypeOverride {
--- def apply(relation: db.RelationName, colName: db.ColName): Option[String]
+++ def apply(relation: db.RelationName, colName: db.ColName): Option[OverrideType]

enum OverrideType = UserPickedClassName(String), FakeStringEnum(name: String, members: List[String], ...), FakeIntEnum(...)

Might need a better name than 'fake'