dbt-labs / hologram

A library for automatically generating Draft 7 JSON Schemas from Python dataclasses
MIT License

Improve union-reading performance #32

Closed beckjake closed 4 years ago

beckjake commented 4 years ago

Previously, hologram called str on every exception it caught while trying to decode union members, because it used getattr(exc, 'message', str(exc)). Python evaluates function arguments eagerly, so str(exc) is always called, even when exc has a message attribute and the default is never used.
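The eager-evaluation point can be checked with a small standalone sketch. SlowError is a hypothetical stand-in for an exception with an expensive __str__ (it is not part of hologram or jsonschema); it counts how often __str__ runs.

```python
class SlowError(Exception):
    """Hypothetical exception that counts calls to its __str__."""

    str_calls = 0

    def __init__(self, message):
        super().__init__(message)
        self.message = message

    def __str__(self):
        type(self).str_calls += 1
        return f"slow: {self.message}"


exc = SlowError("bad value")

# The default argument is evaluated before getattr runs, so __str__ is
# called even though exc.message exists and the default is discarded.
eager = getattr(exc, "message", str(exc))
calls_after_eager = SlowError.str_calls

# Checking for the attribute first only calls str(exc) when it is needed.
lazy = exc.message if hasattr(exc, "message") else str(exc)
calls_after_lazy = SlowError.str_calls
```

Both expressions produce the same message, but only the first one pays for __str__.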

Having the string is nice for error messages about decode failures, but producing it can end up calling pprint.pformat via jsonschema.ValidationError.__str__, which is extremely slow.

This PR avoids that by deferring all calls to str until the exception's __str__ is called. Compared with v0.0.8, this improves performance on union-heavy workloads by about an order of magnitude (runtime drops from 660s to 45s when parsing a 6.3M manifest.json). That's still painfully slow, of course.
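One way to defer the string conversion, as a minimal sketch rather than the actual change in this PR: collect the exception objects themselves while trying union members, and only format them inside the wrapping error's __str__. The names UnionDecodeError, decode_union, CountingError, and reject are all hypothetical and exist only for illustration.

```python
class UnionDecodeError(Exception):
    """Hypothetical error that stores the underlying exceptions and only
    formats them when its own __str__ is called."""

    def __init__(self, field, errors):
        super().__init__(field, errors)
        self.field = field
        self.errors = errors  # list of (member_name, exception) pairs

    def __str__(self):
        parts = []
        for name, exc in self.errors:
            message = getattr(exc, "message", None)
            if message is None:
                message = str(exc)  # the expensive call happens only here
            parts.append(f"{name}: {message}")
        return f"Unable to decode {self.field}: " + "; ".join(parts)


def decode_union(field, value, decoders):
    """Try each (name, decoder) pair in order and return the first success."""
    errors = []
    for name, decode in decoders:
        try:
            return decode(value)
        except Exception as exc:
            errors.append((name, exc))  # keep the object, do not stringify
    raise UnionDecodeError(field, errors)


class CountingError(Exception):
    """Hypothetical exception that counts calls to its __str__."""

    str_calls = 0

    def __str__(self):
        type(self).str_calls += 1
        return "expensive message"


def reject(value):
    raise CountingError()


# The first member fails and the second succeeds; no message is ever built.
result = decode_union("count", "3", [("rejecting", reject), ("int", int)])
calls_after_success = CountingError.str_calls
```

With this shape, a union whose early members fail but whose later member succeeds never formats any error message; the cost is paid only if the final error is actually printed or logged.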