daggaz / json-stream

Simple streaming JSON parser and encoder.
MIT License
122 stars 18 forks source link

Simplify `isinstance()` Checks #50

Closed greateggsgreg closed 1 year ago

greateggsgreg commented 1 year ago

Currently, isinstance() checks to verify a certain deserialised structure are harder than they ought to be. The streaming JSON classes do not inherit from any collections ABCs, nor are they registered as virtual subclasses of collections ABCs. The streaming JSON classes are also not exported by the json_stream module, so type information for isinstance() checks must be obtained in a roundabout manner:

TransientStreamingJSONList = type(json_stream.load(BytesIO(b'[]')))
TransientStreamingJSONObject = type(json_stream.load(BytesIO(b'{}')))
PersistentStreamingJSONList = type(json_stream.load(BytesIO(b'[]'), persistent=True))
PersistentStreamingJSONObject = type(json_stream.load(BytesIO(b'{}'), persistent=True))

ABCMeta.register() allows non-inherited classes to be registered as implementing the same interface as if they had directly inherited from a certain abstract base class. My suggestion is to register TransientStreamingJSONList and PersistentStreamingJSONList with the Sequence ABC and likewise register TransientStreamingJSONObject and PersistentStreamingJSONObject with the Mapping ABC.

Sequence.register(TransientStreamingJSONList)
Sequence.register(PersistentStreamingJSONList)
Mapping.register(TransientStreamingJSONObject)
Mapping.register(PersistentStreamingJSONObject)

Ideally, something like the following would work:

 with open(file_path, "rb") as f:
    main_list = json_stream.load(f)
    if not isinstance(main_list, Sequence):
        raise ValueError(f"Root object must be an array: {file_path}")
    for item in main_list:
        if isinstance(item, Mapping):
            ...
        elif isinstance(item, Sequence):
            ...
        else:
            ...

One alternative to registering with the collections ABCs would be to replace ABC in each streaming JSON class's bases list, directly inheriting from the appropriate collections ABCs. However, some of the inherited generic-implementation methods may be suboptimal for the streaming use-case. Another alternative might be to export some of the classes from the json_stream module for easier type-checking, e.g. StreamingJSONList and StreamingJSONObject.

This would make the Pythonic dict / list-like interface a bit more seamless.