shuttle-hq / synth

The Declarative Data Generator
https://www.getsynth.com/
Apache License 2.0

Feature: Allow fields to be omitted from output, hiding them #177

Closed vlushn closed 3 years ago

vlushn commented 3 years ago

It is sometimes useful to generate a more complex structure and reference parts of it within the same namespace. For example, a database table might store day, month, and year in three separate columns. To keep them consistent, a single datetime has to be generated and the three columns derived from it (this handles cases such as months having different numbers of days).

One way to implement this is to declare a field as hidden and omit it from the output entirely.

Hypothetical example:

{
  "type": "object",
  "_date": {
    "type": "string",
    "hidden": true,
    "date_time": {
      "format": "%Y-%m-%d",
      "subtype": "naive_date",
      "begin": "1930-01-01",
      "end": "2010-01-01"
    }
  },
  "day": {
    "type": "number",
    "format": {
      "format": "{date@day}",
      "arguments": {
        "date": "@namespace.content._date"
      }
    }
  }
}
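
To make the intended semantics concrete, here is a minimal Python sketch of what a "hidden" field could mean at generation time. It is not synth's implementation: the helper names (`random_date`, `generate_row`), the `HIDDEN` set, and the choice to treat `end` as exclusive are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta

# Hypothetical: names of the fields marked "hidden": true in the schema.
HIDDEN = {"_date"}

def random_date(begin: date, end: date, rng: random.Random) -> date:
    """Pick a uniformly random date in [begin, end) (exclusive end is an assumption)."""
    span_days = (end - begin).days
    return begin + timedelta(days=rng.randrange(span_days))

def generate_row(rng: random.Random) -> dict:
    """Generate one row: the hidden date is sampled once, then dropped from the output."""
    d = random_date(date(1930, 1, 1), date(2010, 1, 1), rng)
    # Every visible column is derived from the same hidden value,
    # so day/month/year can never form an invalid date.
    full = {"_date": d, "day": d.day, "month": d.month, "year": d.year}
    # Serialization step: omit anything marked hidden.
    return {name: value for name, value in full.items() if name not in HIDDEN}

row = generate_row(random.Random(0))
print(row)
```

The point of the sketch is only the ordering: generate the complete structure first, resolve references against it, and filter hidden fields as the very last step.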

brokad commented 3 years ago

Hey @vlushn thanks for the request! Clearly, this is something we should support.

Initially I thought it'd be more straightforward to keep all these "hidden" fields in a 'special' JSON schema file that would never get serialized but could be referenced from other files. But I think your solution is much better because you can control at what level in a nested structure you want fields to be consistent.

Are you thinking "hidden" would only be a valid attribute for fields of "type": "object"? Or would "hidden" be a valid attribute on any JSON schema node type? I think the latter would probably be better.

A similar alternative could be to have a "variables" attribute on nodes of "type": "object", in which we can store all the "variables" we want generated but hidden from serialization. For instance:

{
  "type": "object",
  "variables": {
    "date": {
      "type": "string",
      "date_time": {
        "format": "%Y-%m-%d",
        "subtype": "naive_date",
        "begin": "1930-01-01",
        "end": "2010-01-01"
      }
    }
  },
  "day": {
    "type": "number",
    "format": {
      "format": "{date@day}",
      "arguments": {
        "date": "@namespace.variables.date"
      }
    }
  }
}

Of course, in that case, this would only apply to nodes of type object.
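
As a rough illustration of how a "variables" block scoped to an object node might behave, here is a small Python sketch. Everything in it is hypothetical: the `generate_object` function, the `"fields"` key, and the convention that generators are callables taking `(rng, scope)` are stand-ins, not synth's schema or API.

```python
import random
from datetime import date, timedelta

def generate_object(node: dict, rng: random.Random, scope: dict = None) -> dict:
    """Generate one object. Values under "variables" are visible to the
    object's fields (and nested objects) but are never emitted."""
    # Copy the inherited scope so variables do not leak between sibling objects.
    scope = dict(scope or {})
    for name, generator in node.get("variables", {}).items():
        scope[name] = generator(rng, scope)
    output = {}
    for name, field in node["fields"].items():
        if isinstance(field, dict):
            # Nested object node: it inherits the enclosing variables.
            output[name] = generate_object(field, rng, scope)
        else:
            output[name] = field(rng, scope)
    return output

def random_date(rng: random.Random, scope: dict) -> date:
    """Sample a date in [1930-01-01, 2010-01-01) (exclusive end is an assumption)."""
    begin, end = date(1930, 1, 1), date(2010, 1, 1)
    return begin + timedelta(days=rng.randrange((end - begin).days))

schema = {
    "variables": {"date": random_date},
    "fields": {
        "day": lambda rng, scope: scope["date"].day,
        "month": lambda rng, scope: scope["date"].month,
        "year": lambda rng, scope: scope["date"].year,
    },
}

row = generate_object(schema, random.Random(42))
print(row)
```

Because variables live on the object node, the level at which values stay consistent is controlled by where the "variables" block is placed in the nested structure.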