omissis / go-jsonschema

A tool to generate Go data types from JSON Schema definitions.
MIT License

Handling duplicate types #100

Open glvr182 opened 1 year ago

glvr182 commented 1 year ago

I have multiple types that reference the same type. Currently the generator just appends an incrementing counter to the type name (say "example_1").

Is there a way to disable this behaviour, maybe using a config option?

omissis commented 1 year ago

Hi @glvr182! There are several reasons for that, but mostly it boils down to avoiding type naming collisions that would cause invalid code to be generated: you can't have two types with the same name within the same package, so I am afraid there is no flag to disable that. Having encountered that behaviour myself, I started working on a mitigation within this PR. It started as an effort to implement subschemas (mainly anyOf and allOf), but it also contains some fixes and changes, among them some mitigations for the duplicated types. It doesn't solve all of the cases, but it should make that behaviour less widespread. If you're willing to give it a try, clone the repo, check out the feature/implement-subschemas branch, build the binary using go build -o /usr/local/bin/go-jsonschema cmd/gojsonschema/main.go and let me know how it goes :)
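To illustrate the collision avoidance described above, here is a minimal, self-contained sketch of counter-based name suffixing. It is modelled on the behaviour discussed in this thread, not copied from the project; the `uniqueName` function and the `taken` map are hypothetical names chosen for this example.

```go
package main

import "fmt"

// uniqueName returns name unchanged if it is not yet taken; otherwise it
// returns the first "name_N" (N = 1, 2, ...) that is still free.
// Hypothetical helper for illustration only.
func uniqueName(taken map[string]bool, name string) string {
	if !taken[name] {
		return name
	}
	for count := 1; ; count++ {
		suffixed := fmt.Sprintf("%s_%d", name, count)
		if !taken[suffixed] {
			return suffixed
		}
	}
}

func main() {
	taken := map[string]bool{}
	// Three schemas that all map to the same Go type name.
	for _, n := range []string{"Example", "Example", "Example"} {
		u := uniqueName(taken, n)
		taken[u] = true
		fmt.Println(u)
	}
}
```

Run as-is, this prints `Example`, `Example_1` and `Example_2`: every declaration gets a distinct name, so the generated package still compiles, at the cost of the suffixed names the original question asks about.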

glvr182 commented 1 year ago

Understood. I actually also had the problem where a type was declared twice in the same file. I patched it locally by completely removing the uniquename check, but that is not the right way to handle such cases. I will post an example of this in a follow-up message in this issue.

omissis commented 1 year ago

awesome, thanks!

glvr182 commented 1 year ago

@omissis Okay, so I have the following schema, energy-measurement-unit.json:

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "http://www.openchargealliance.org/schemas/oscp/2.0/energy-measurement-unit.json",
    "title": "EnergyMeasurementUnit",
    "uniqueItems": true,
    "type": "string",
    "enum": [
        "WH",
        "KWH"
    ]
}

and then a file referencing it, energy-measurement.json:

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "http://www.openchargealliance.org/schemas/oscp/2.0/energy-measurement.json",
    "title": "EnergyMeasurement",
    "type": "object",
    "properties": {
        "value": {
            "type": "number"
        },
        "phase": {
            "$ref": "phase-indicator.json"
        },
        "unit": {
            "$ref": "energy-measurement-unit.json"
        },
        "direction": {
            "$ref": "energy-flow-direction.json"
        },
        "energy_type": {
            "$ref": "energy-type.json"
        },
        "measure_time": {
            "type": "string",
            "format": "date-time"
        },
        "initial_measure_time": {
            "type": "string",
            "format": "date-time"
        }
    },
    "required": [
        "value",
        "phase",
        "unit",
        "direction",
        "measure_time"
    ]
}

resulting in the following:

type EnergyMeasurementUnit string

type EnergyMeasurementUnit string

const EnergyMeasurementUnitKWH EnergyMeasurementUnit = "KWH"
const EnergyMeasurementUnitKWH EnergyMeasurementUnit = "KWH"
const EnergyMeasurementUnitWH EnergyMeasurementUnit = "WH"
const EnergyMeasurementUnitWH EnergyMeasurementUnit = "WH"

I think this should be solved with support for nested schemas. What do you think? (This was generated with the latest commit of your feature/implement-subschemas branch.)

Quick note: these schemas are taken from the OSCP specification.
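One possible direction, sketched here as an assumption rather than as the project's actual fix: remember each declaration by the schema's `$id`, so that a schema reached a second time through `$ref` reuses the first declaration instead of emitting it again. The `typeDecl` and `registry` types below are hypothetical stand-ins and do not correspond to go-jsonschema's internal types.

```go
package main

import "fmt"

// typeDecl is a hypothetical stand-in for a generated Go type declaration.
type typeDecl struct {
	Name string
}

// registry remembers declarations by schema $id (illustrative only).
type registry struct {
	byID  map[string]*typeDecl
	decls []*typeDecl // declarations in emission order
}

func newRegistry() *registry {
	return &registry{byID: map[string]*typeDecl{}}
}

// declare returns the declaration already recorded for id, or records a
// new one. The second result reports whether a new declaration was created.
func (r *registry) declare(id, name string) (*typeDecl, bool) {
	if d, ok := r.byID[id]; ok {
		return d, false
	}
	d := &typeDecl{Name: name}
	r.byID[id] = d
	r.decls = append(r.decls, d)
	return d, true
}

func main() {
	r := newRegistry()
	const id = "http://www.openchargealliance.org/schemas/oscp/2.0/energy-measurement-unit.json"
	r.declare(id, "EnergyMeasurementUnit") // schema processed as its own file
	r.declare(id, "EnergyMeasurementUnit") // same schema reached via $ref
	for _, d := range r.decls {
		fmt.Printf("type %s string\n", d.Name)
	}
}
```

With both calls sharing one `$id`, only a single `type EnergyMeasurementUnit string` line is printed. Keying on `$id` rather than on the type name means two unrelated schemas that happen to share a title would still be treated as distinct and would need some form of renaming.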

omissis commented 1 year ago

thanks for the example. that looks weird, did you use the stable release to generate the go model or is that the result of the feature/implement-subschemas branch?

glvr182 commented 1 year ago

That is the result of both the stable release and the feature/implement-subschemas branch.

omissis commented 1 year ago

ah yeah sorry I didn't see that note. ok I'll be looking into it in the next few days, I am working with a fairly complex schema these days and I haven't noticed that behavior.

glvr182 commented 1 year ago

This is a hack that my colleague made a while ago on an older version of your project. It made things work, but we don't know whether it broke anything else, so we want to do it properly this time. Does this give you any idea what the cause might be?

--- a/pkg/generator/generate.go
+++ b/pkg/generator/generate.go
@@ -388,12 +388,16 @@ func (g *schemaGenerator) generateDeclaredType(
        return &codegen.NamedType{Decl: decl}, nil
    }

+   if decl, ok := g.output.declsByName[scope.string()]; ok {
+       return &codegen.NamedType{Decl: decl}, nil
+   }
+
    if t.Enum != nil {
        return g.generateEnumType(t, scope)
    }

    decl := codegen.TypeDecl{
-       Name:    g.output.uniqueTypeName(scope.string()),
+       Name:    scope.string(),
        Comment: t.Description,
    }
    g.output.declsBySchema[t] = &decl
@@ -718,7 +722,7 @@ func (g *schemaGenerator) generateEnumType(
    }

    enumDecl := codegen.TypeDecl{
-       Name: g.output.uniqueTypeName(scope.string()),
+       Name: scope.string(),
        Type: enumType,
    }
    g.output.file.Package.AddDecl(&enumDecl)
@@ -800,22 +804,6 @@ type output struct {
    warner        func(string)
 }

-func (o *output) uniqueTypeName(name string) string {
-   if _, ok := o.declsByName[name]; !ok {
-       return name
-   }
-   count := 1
-   for {
-       suffixed := fmt.Sprintf("%s_%d", name, count)
-       if _, ok := o.declsByName[suffixed]; !ok {
-           o.warner(fmt.Sprintf(
-               "Multiple types map to the name %q; declaring duplicate as %q instead", name, suffixed))
-           return suffixed
-       }
-       count++
-   }
-}
-
 type cachedEnum struct {
    values []interface{}
    enum   *codegen.TypeDecl

pedroscaff commented 7 months ago

Hello, is there any update on this? I tried the feature/implement-subschemas branch and it worked out of the box to avoid duplicates, although I still got a warning: go-jsonschema: Warning: Multiple types map to the name "TypeName"; declaring duplicate as "TypeName" instead. This feature would help me a lot 🙂, and I'm happy to help with the development if the steps are reasonably clear.

omissis commented 7 months ago

Hey! no updates so far. That branch fell behind main and I need to take some time to rebase it before I can release it. 😅

pedroscaff commented 7 months ago

Thanks for the update. Can you give a rough days/weeks/months estimate for it? As I said, I can also help if there are reasonably clear steps in the PR 🙂

omissis commented 7 months ago

I am on it today at the Open Source Saturday Milan 😎 Fingers crossed!

CarbonC commented 6 months ago

Hello @omissis and thank you for this great package! I just tested feature/implement-subschemas on my schemas and it feels way better than the current main (my case is simply having 2 schemas that use the same subschema). I saw that you fixed the tests; can we expect a merge into main soon? Thanks again. PS: by the way, is writing multiple output files to an output tree (like one for each schema when using a path like /*/.schema.json) supported or on the roadmap?

omissis commented 6 months ago

Hey @CarbonC, the idea is to merge it into main in the next release. I have completed a long-overdue rebase, and I now probably need to do one more to align it with main again. Since it's kind of a big change, I don't want to rush it and break (too many) things: my plan is to take some time to go through the open issues and comments, and maybe some big known JSON schemas, and see how things look with this new version. I don't have a timeline for that, though, I'm afraid, so if anyone is willing to step up and help with additional testing, they're more than welcome :)

omissis commented 6 months ago

Well, since I was here, I took the chance to rebase: the PR is conflict-free again now.