ErikOrjehag opened this issue 3 years ago (status: Open)
Hello. As far as I can see, this is impossible. Trafaret is a function signature and a collection of combinators, that's all, nothing more. A trafaret does not represent a schema; it is a transformer of data. Nor is it a validator. But you can use the trafaret-schema package to combine an actual JSON Schema with trafaret's transformation abilities. So in my opinion one should write the JSON Schema first and use trafaret as a convenient tool for working with the described schema. Or maybe some other JSON Schema tool for Python, if it matches your requirements better.
Thanks for the reply, and for the awesome package. I use trafaret together with trafaret_config and click (a CLI framework). I was hoping that it would be possible to go the other way around, meaning from trafaret -> JSON Schema. Maybe not because it's the best way to do it, but because I already have a pretty large trafaret setup (parsing a special-purpose database query language that I invented and that uses YAML as its syntax) and I really don't want to spend too much time defining the whole schema again in JSON Schema. You say trafaret is not a schema, but it kind of represents a schema, doesn't it? Wouldn't I be able to traverse my big trafaret from the root and produce a JSON Schema from it?
I do have some recursion in my trafarets, to parse things like {"not": {"not": {"not": true}}} -> Not(Not(Not(true))), idk if that complicates things.
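For reference, JSON Schema can express this kind of recursion with a named definition under `$defs` that points back at itself through `$ref`. The following is a minimal hand-written sketch, not output from any tool: the definition name `not_expr` and the helper `matches` are made up for illustration, and `matches` understands only the handful of keywords used here, not the full specification.

```python
# Hand-written JSON Schema for the recursive "not" expression.
# The definition name "not_expr" is hypothetical.
schema = {
    '$ref': '#/$defs/not_expr',
    '$defs': {
        'not_expr': {
            'oneOf': [
                {'type': 'boolean'},
                {
                    'type': 'object',
                    'additionalProperties': False,
                    # The definition refers to itself here.
                    'properties': {'not': {'$ref': '#/$defs/not_expr'}},
                    'required': ['not'],
                },
            ],
        },
    },
}


def matches(value, node, root=schema):
    """Check value against the tiny schema subset used above."""
    if '$ref' in node:
        # Resolve "#/$defs/<name>" against the root schema.
        name = node['$ref'].rsplit('/', 1)[-1]
        return matches(value, root['$defs'][name], root)
    if 'oneOf' in node:
        # Exactly one alternative must accept the value.
        return sum(matches(value, option, root) for option in node['oneOf']) == 1
    if node['type'] == 'boolean':
        return isinstance(value, bool)
    if node['type'] == 'object':
        if not isinstance(value, dict):
            return False
        props = node['properties']
        if any(key not in value for key in node['required']):
            return False
        if any(key not in props for key in value):
            return False
        return all(matches(value[key], props[key], root) for key in value)
    return False


print(matches({'not': {'not': {'not': True}}}, schema))
print(matches({'not': 1}, schema))
```

The schema stays finite even though the values it accepts can nest to any depth, which is why a generator needs definitions and references rather than inlining everything.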
It is possible, of course: you can traverse trafarets like List and Dict, or custom ones that you know about. But it is not the most straightforward task. I can help if you have questions down the road.
Turns out trafaret itself was the perfect tool for the job. I was able to create a trafaret that takes a trafaret as its input and transforms it into a JSON Schema. It also handles recursive trafarets by putting $defs in the JSON Schema:
import typing as tp

import trafaret


def ref(defs: dict, trafaret_instance: trafaret.Trafaret, schema_fn: tp.Callable[[], dict]) -> dict:
    # Key each definition by the identity of the trafaret instance.
    trafaret_id = id(trafaret_instance)
    if trafaret_id not in defs:
        # The placeholder marks this definition as in progress, so a recursive
        # visit returns a $ref instead of descending again.
        defs[trafaret_id] = 'lazy'
        defs[trafaret_id] = schema_fn()
    return {'$ref': f'#/$defs/{trafaret_id}'}


def to_json_schema(traf: trafaret.Trafaret) -> dict:
    defs = dict()  # Dictionary of json schema definitions ($defs)
    json_schema = trafaret.Or()

    # Simple Float, String, Regexp, ToDateTime and Null schemas are small so we put the schemas
    # inline without creating definitions that we then reference.
    float_schema = trafaret.Type(trafaret.Float) >> (lambda x: {
        'type': 'number'
    })
    string_schema = trafaret.Type(trafaret.String) >> (lambda x: {
        'type': 'string'
    })
    regexp_schema = trafaret.Type(trafaret.Regexp) >> (lambda x: {
        'type': 'string',
        'pattern': x.regexp.pattern
    })
    to_datetime_schema = trafaret.Type(trafaret.ToDateTime) >> (lambda x: {
        'type': 'string'
    })
    null_schema = trafaret.Type(trafaret.Null) >> (lambda x: {'type': 'null'})

    # Special case And(t, Callable) is used when a function is put after the trafaret using
    # the >> operator, in this case we should evaluate only the trafaret and ignore the callable.
    and_schema = trafaret.Type(trafaret.And) >> (lambda x: json_schema.check(x.trafaret))

    # Enums can be large so we want to create definitions for them and reference the definitions.
    # VSCode auto completion works better for 'oneOf': 'const' instead of using 'enum' directly...
    enum_schema = trafaret.Type(trafaret.Enum) >> (lambda x: ref(defs, x, lambda: {
        'oneOf': [{'const': name} for name in x.variants]
    }))

    # Lists, Or and Dictionaries can contain themselves recursively so we want to create definitions
    # for them that we can reference in order to prevent infinite recursion depth.
    list_schema = trafaret.Type(trafaret.List) >> (lambda x: ref(defs, x, lambda: {
        'type': 'array',
        'items': json_schema.check(x.trafaret),
        'minItems': x.min_length,
        # Emit maxItems only when a bound is set; a null maxItems is not valid JSON Schema.
        **({} if x.max_length is None else {'maxItems': x.max_length}),
    }))
    or_schema = trafaret.Type(trafaret.Or) >> (lambda x: ref(defs, x, lambda: {
        'oneOf': [json_schema.check(t) for t in x.trafarets]
    }))
    dict_schema = trafaret.Type(trafaret.Dict) >> (lambda x: ref(defs, x, lambda: {
        'type': 'object',
        'additionalProperties': False,
        'properties': {
            k.name: json_schema.check(k.trafaret) for k in x.keys
        },
        'required': [k.name for k in x.keys if not k.optional]
    }))

    # Fill in the alternatives only now, because the lambdas above refer back to json_schema.
    json_schema.trafarets = [
        float_schema, string_schema, regexp_schema, to_datetime_schema, null_schema,
        and_schema, enum_schema, list_schema, or_schema, dict_schema,
    ]

    res = json_schema.check(traf)
    schema = {
        **res,
        '$defs': defs,
    }
    return schema
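To see why the 'lazy' placeholder stops the recursion, here is the same memoization idea in isolation. This is only a sketch: `Text`, `Seq` and `describe` are hypothetical stand-ins written for this example and are not part of trafaret, so it runs without the library.

```python
from typing import Callable


class Text:
    """Stand-in leaf type (hypothetical, not part of trafaret)."""


class Seq:
    """Stand-in container type (hypothetical) holding one item type."""

    def __init__(self, item=None):
        self.item = item


def ref(defs: dict, node: object, build: Callable[[], dict]) -> dict:
    key = id(node)
    if key not in defs:
        defs[key] = 'lazy'   # placeholder: this node is being built right now
        defs[key] = build()  # may re-enter ref() for the same node
    return {'$ref': f'#/$defs/{key}'}


def describe(node: object, defs: dict) -> dict:
    if isinstance(node, Text):
        return {'type': 'string'}
    if isinstance(node, Seq):
        return ref(defs, node, lambda: {
            'type': 'array',
            'items': describe(node.item, defs),
        })
    raise TypeError(f'unsupported node: {node!r}')


nested = Seq()
nested.item = nested  # a list of lists of lists ...
defs: dict = {}
top = describe(nested, defs)
print(top)
print(defs)
```

The second visit to `nested` finds its key already present and returns a reference immediately, so the definition ends up pointing at itself. One caveat of keying by `id()`: the values are only unique while the objects are alive, so the nodes should be kept referenced for the duration of the traversal.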
Hi,
Would it be possible to generate a JSON schema (https://json-schema.org/) from the trafaret definition? This would be really useful because the schema can be used in the IDE (vscode in my case) to autocomplete the yaml file as you are typing (https://github.com/redhat-developer/yaml-language-server). Any thoughts on this?
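For the editor side of this, one possible workflow is to write the schema to a JSON file and point the YAML language server at it with its inline modeline comment. A rough sketch, assuming a schema dict is already available: the file names `query.schema.json` and `query.yaml`, the helper `write_schema`, and the tiny schema itself are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_schema(schema: dict, path: Path) -> None:
    """Serialize a schema dict as pretty-printed JSON."""
    path.write_text(json.dumps(schema, indent=2), encoding='utf-8')


# Placeholder schema; in practice this would be the generated one.
schema = {
    'type': 'object',
    'properties': {'not': {'type': 'boolean'}},
}

out_dir = Path(tempfile.mkdtemp())
schema_path = out_dir / 'query.schema.json'  # hypothetical file name
write_schema(schema, schema_path)

# The modeline on the first line associates this YAML file with the schema.
# A path relative to the YAML file is used here.
yaml_path = out_dir / 'query.yaml'  # hypothetical file name
yaml_path.write_text(
    f'# yaml-language-server: $schema={schema_path.name}\n'
    'not: true\n',
    encoding='utf-8',
)
```

Schemas can also be associated through editor settings instead of a per-file comment; the modeline is just the variant that needs no configuration outside the file.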