nijikokun / generate-schema

🧞 Convert JSON Objects to MySQL, JSON Schema, Mongoose, Google BigQuery, Swagger, and more.
MIT License

BigQuery: Look at more objects of an array for type inference #50

Open sqlrob opened 5 years ago

sqlrob commented 5 years ago

When generating a schema for BigQuery, only the first member of an array is inspected. If the array's objects have optional fields, looking at a single member may not be sufficient.

If you generate a schema from the following object and then try to insert that same object with the options {ignoreUnknownValues: false}, the insert fails because the field foobar is missing from the schema.

{
  arrayField: [ { snafu: true }, { snafu: true, foobar: true } ]
}
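
One possible workaround can be sketched in plain JavaScript, assuming the generator really does read only the first array element as described above. The helper name unionOfMembers is hypothetical and not part of generate-schema. It builds a single object carrying every key seen in any member, which can then be placed at the front of the array before the schema is generated. It handles only one level of nesting.

```javascript
// Hypothetical helper, not part of generate-schema: builds one object that
// carries every key seen in any object member of the array.
function unionOfMembers(items) {
  const merged = {};
  for (const item of items) {
    // Skip anything that is not a plain object member.
    if (item === null || typeof item !== 'object' || Array.isArray(item)) continue;
    for (const [key, value] of Object.entries(item)) {
      // Keep the first non-null value seen for each key.
      if (merged[key] === undefined || merged[key] === null) {
        merged[key] = value;
      }
    }
  }
  return merged;
}

const input = { arrayField: [{ snafu: true }, { snafu: true, foobar: true }] };
const representative = unionOfMembers(input.arrayField);

// Put the merged object first so that first-member-only inference sees all fields.
const patched = { arrayField: [representative, ...input.arrayField] };
```

The patched object is meant only as input to schema generation. The original object is still the one to insert.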

janerikmai commented 5 years ago

Same with JSON...

{ "test": [ { "a": 0, "b": 1 }, { "a": 2, "b": 3, "c": 4 }, { "a": 5, "c": 6 } ] }

results in

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "test": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "a": { "type": "integer" },
          "b": { "type": "integer" },
          "c": { "type": "integer" }
        },
        "required": [ "a", "b", "c" ]
      }
    }
  }
}

Of the 3 entries in the array, only a appears in every one, so required should contain only a.
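
The expected behaviour can be sketched as an intersection of keys across all array members. This is a minimal illustration in plain JavaScript. The function name requiredKeys is hypothetical, and this is not how generate-schema currently computes required.

```javascript
// Hypothetical helper: a key is required only if every object member has it.
function requiredKeys(items) {
  const objects = items.filter(
    (item) => item !== null && typeof item === 'object' && !Array.isArray(item)
  );
  if (objects.length === 0) return [];
  // Any key present in all members must also be present in the first one.
  return Object.keys(objects[0]).filter((key) =>
    objects.every((obj) => Object.prototype.hasOwnProperty.call(obj, key))
  );
}

const entries = [{ a: 0, b: 1 }, { a: 2, b: 3, c: 4 }, { a: 5, c: 6 }];
console.log(requiredKeys(entries)); // [ 'a' ]
```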

noahbjohnson commented 5 years ago

@sqlrob @janerikmai I wrote a wrapper module that accomplishes this basic idea for BigQuery. It's not too fancy, but it is working for our use case. If someone wants to open a PR here using that logic, that'd be great; otherwise I'll get around to it eventually.