Same enum in multiple schemas does not get created N times

oyvindberg / typo

Typed Postgresql integration for Scala. Hopes to avoid typos

https://oyvindberg.github.io/typo/

MIT License

99 stars 9 forks source link

Same enum in multiple schemas does not get created N times #123

Closed kolemannix closed 1 month ago

kolemannix commented 1 month ago

We have an enum permission and are generating from only the public schema, filtering using selectors.

The permission schema exists in many schemas in the local DB we are generating against, since our local setup has a few different copies of our schema for different uses (you could rightly argue these should be separate databases, for sure)

The (identical) schemas are named:

application_spec
db_spec
integration_spec
public
static_query_checks

The permission enum gets created as if it belonged to the application_spec schema, even though we have the selector filtering out that schema. I assume this is because the type is first seen from the application_spec schema, and selectors aren't fully working for types.

Workaround: Clear out extra schemas before generating code, or generate from a separate database.

If we do not filter by schema, we only get 1 copy of the enum, the one from application_spec.

oyvindberg commented 1 month ago

Hmm, interesting. I'm sure there is a bug in here somewhere.

You're right that the selector is not applied to types. The way it's meant to work is that you select relations with the selector, and typo will generate code for all types it discovers from those relations.

The "discovery" part of that is a tree shaking-like thing where a set of entry points is computed

minimize(mostFiles, entryPoints = sqlFileFiles ++ keptRelations ++ domainFiles)

and it'll include all the types transitively referenced from those files. This is done by scala types.

Is is possible that you override something in Naming in order to translate things into scala names which do not include the schema? in that case there will be a name clash, which means that the minimization can pick types from any schema.

why

I should note that there is a reason why selectors don't apply to types. It would be very easy for a user to specify a selector which doesn't include all transitive dependencies. Say the table an FK points to, or a string enum in another schema. For this reason all transitive things are included in the generated code.

oyvindberg commented 1 month ago

I should note that the bare foundations for a schema-less typo are in place, but it's not finished.

sealed trait SchemaMode

object SchemaMode {
  case object MultiSchema extends SchemaMode
  case class SingleSchema(schema: String) extends SchemaMode
}

The idea is when you specify SingleSchema, typo will read only from that schema, and output code where the schema name does not appear. This way you can set the schema you want to use when creating connections, and it'll work when you use schemas as databases. It think it's a fairly common usecase, I've done so in the past as well.

This is something I've meant to handle, but it'll be later.

kolemannix commented 1 month ago

Is is possible that you override something in Naming in order to translate things into scala names which do not include the schema?

Yes, apologies, that's exactly what is happening. The goal was to remove the schema from the output sql, so that it could just use the connection schema, but removing it from the naming does not remove it from the SQL.

Short of a full SingleSchema mode, how difficult would it be to simply exclude the schema from generated SQL in the short term? This would solve our issue.

oyvindberg commented 1 month ago

It was easy to add, so I just did it in #126 . Basically a transformation of the information from the database:


  case class Input(
      tableConstraints: List[TableConstraintsViewRow],
      keyColumnUsage: List[KeyColumnUsageViewRow],
      referentialConstraints: List[ReferentialConstraintsViewRow],
      pgEnums: List[EnumsSqlRow],
      tables: List[TablesViewRow],
      columns: List[ColumnsViewRow],
      views: Map[db.RelationName, AnalyzedView],
      domains: List[DomainsSqlRow],
      columnComments: List[CommentsSqlRow],
      constraints: List[ConstraintsSqlRow],
      tableComments: List[TableCommentsSqlRow]
  ) {
    def filter(schemaMode: SchemaMode): Input = {
      schemaMode match {
        case SchemaMode.MultiSchema => this
        case SchemaMode.SingleSchema(wantedSchema) =>
          def keep(os: Option[String]): Boolean = os.contains(wantedSchema)

          Input(
            tableConstraints = tableConstraints.collect {
              case x if keep(x.tableSchema) || keep(x.constraintSchema) =>
                x.copy(tableSchema = None, constraintSchema = None)
            },
            keyColumnUsage = keyColumnUsage.collect {
              case x if keep(x.tableSchema) || keep(x.constraintSchema) =>
                x.copy(tableSchema = None, constraintSchema = None)
            },
            referentialConstraints = referentialConstraints.collect {
              case x if keep(x.constraintSchema) =>
                x.copy(constraintSchema = None)
            },
            pgEnums = pgEnums.collect { case x if keep(x.enumSchema) => x.copy(enumSchema = None) },
            tables = tables.collect { case x if keep(x.tableSchema) => x.copy(tableSchema = None) },
            columns = columns.collect {
              case x if keep(x.tableSchema) =>
                x.copy(
                  tableSchema = None,
                  characterSetSchema = None,
                  collationSchema = None,
                  domainSchema = None,
                  udtSchema = None,
                  scopeSchema = None
                )
            },
            views = views.collect { case (k, v) if keep(k.schema) => k.copy(schema = None) -> v.copy(row = v.row.copy(tableSchema = None)) },
            domains = domains.collect { case x if keep(x.schema) => x.copy(schema = None) },
            columnComments = columnComments.collect { case c if keep(c.tableSchema) => c.copy(tableSchema = None) },
            constraints = constraints.collect { case c if keep(c.tableSchema) => c.copy(tableSchema = None) },
            tableComments = tableComments.collect { case c if keep(c.schema) => c.copy(schema = None) }
          )
      }
    }
  }

oyvindberg commented 1 month ago

It won't be well tested, and I think more work needs to be done for sql files in this mode. I'll merge it anyway, and we can see if the idea looks like it works

kolemannix commented 1 month ago

Its working well in my project with the previously mentioned 5 schemas. 🎉

oyvindberg commented 1 month ago

hey, that's very cool that it worked with so little effort. we should still speed it up by reading less data, but that can be done later