softwaremill / sttp-apispec

OpenAPI, AsyncAPI and JSON Schema Scala models.
Apache License 2.0
23 stars 10 forks source link

SchemaComparator does not handle references to `$defs` #172

Open seveneves opened 2 weeks ago

seveneves commented 2 weeks ago

I am using SchemaComparator to check that there are no compatibility issues between two version of the same json schema, which is decoded as apispec.Schema.

Following program highlights the problem

object RefsProblem {

  import sttp.apispec.validation.SchemaComparator
  import sttp.tapir
  import sttp.tapir.docs.apispec.schema.TapirSchemaToJsonSchema

  object v1 {
    case class Child(name: String) derives tapir.Schema
    case class Parent(name: String, children: Seq[Child]) derives tapir.Schema
  }
  object v2 {
    case class Child(name: String, age: Int) derives tapir.Schema
    case class Parent(name: String, children: Seq[Child]) derives tapir.Schema
  }

  private def apiSpecSchema[T: tapir.Schema] = TapirSchemaToJsonSchema(
    implicitly[tapir.Schema[T]],
    markOptionsAsNullable = true,
  )

  private def defs(schema: apispec.Schema): Map[String, apispec.Schema] = schema.$defs.getOrElse(Map.empty)
    .collect { case (k, s: apispec.Schema) => k -> s }

  def main(args: Array[String]): Unit = {
    val writer = apiSpecSchema[v1.Parent]
    val reader = apiSpecSchema[v2.Parent]
    val comparator = new SchemaComparator(defs(writer), defs(reader))
    comparator.compare(
      writer.copy($schema = None, $defs = None),
      reader.copy($schema = None, $defs = None),
    ).map(_.toString).foreach(println)
  }
}

And to avoid the problem specified in #170, I reset $schema and $defs and pass $defs as arguments to new SchemaComparator.

Such comparison does not identify new field age: Int in v2.Child schema, which it should detect.

The reason for this is that there is no $ref lookup implemented for #/$defs/ and it only supports #/components/schemas/.

After playing around with code of SchemaComparator and adding DefsRef along with LocalRef, I was able to make it work and produce the desired validation check

private object DefsRef {
  private val DefPrefix = "#/$defs/"
  def unapply(ref: String): Option[String] =
    Option.when(ref.startsWith(DefPrefix))(ref.stripPrefix(DefPrefix))
}

incompatible schema for property children:

  • incompatible schema for items:
    • target schema requires properties: age

I can submit a PR to fix this and #170 but I need a reaction from the maintainer to validate both issues.

adamw commented 2 weeks ago

Sure this sounds like something we should support, do you maybe have some pointer as to when $defs should be used, comparing to #/components/schemas/? So that we follow the spec

Also cc @ghik

ghik commented 2 weeks ago

Yes, the comparator is limited in many ways and generally optimized for schemas generated from tapir endpoints. $def is not understood at all. More details in the PR: https://github.com/softwaremill/sttp-apispec/pull/157

seveneves commented 2 weeks ago

@adamw it is a good question.

Looking at the latest specification of json schema, $ref can have multiple ways of referring to a schema, the one which is missing is called Schema with a JSON Pointer and the format is { "$ref": "#/$defs/string" }.

Such reference is defined here https://github.com/softwaremill/tapir/blob/91a95b77009ed887fea9407eb8f8f6948e8984c6/docs/apispec-docs/src/main/scala/sttp/tapir/docs/apispec/schema/TapirSchemaToJsonSchema.scala#L20

seveneves commented 2 weeks ago

$def is not understood at all

@ghik This is what I also noticed :)

But it is relatively easy to support it. If it is done then TapirSchemaToJsonSchema can be used to generate apispec.Schema and such schemas could be used in SchemaComparator.

I have it working on my machine without major changes to the comparator but I had to refactor the resolution/normatlization logic

adamw commented 1 week ago

If it's a simple change - why not, but I suspect there must be some difference between references using $ref + $def and #/components/schemas/?

adamw commented 1 week ago

That is, when should one be used, and when the other?

ghik commented 1 week ago

@adamw I believe $defs is a feature of JSON Schema (in isolation), while #/components/schemas is an OpenAPI specific thing. For this reason I assumed that there shouldn't be much reason to use $defs in schemas generated from tapir endpoints, as they can use the components object as the library for reusable schemas.

ghik commented 1 week ago

Oh, it seems my assumption was somewhat subconscious and, in fact, not a right one, looking at this code which uses $defs.

IMO it would still be more natural to use #/components/schemas for that, but I might be missing some context.

adamw commented 1 week ago

@ghik I think you're right, $defs allows you to reuse schemas within a single JSON schema, that is in the context of OpenAPI in within a single body definition, for example. While components/schemas allows you to reuse schemas across different json schemas, for different bodies.

In the tapir schema -> json schema conversion, we cannot use the shared pool (as it doesn't exist in that context), so we're using $defs.

Anyway, seems that supporting $defs in the same way as supporting components/schemas seems the way to go, so @seveneves if you'd like to contribute that change, that would be great :)