jcrist / msgspec

A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
https://jcristharif.com/msgspec/
BSD 3-Clause "New" or "Revised" License

Multiple types to same Struct #693

Open unights opened 1 month ago

unights commented 1 month ago

Question

I'm using msgspec to decode JSON records like these:

{"type":"a.xxx","id":"a1"}
{"type":"a.yyy","id":"a2"}
{"type":"a.zzz","id":"a3"}
{"type":"b.xxx","name":"b1"}
{"type":"b.yyy","name":"b2"}
{"type":"b.zzz","name":"b3"}

Currently I process this data with the following code:

import msgspec

lst = [
    b'{"type":"a.xxx","id":"a1"}',
    b'{"type":"a.yyy","id":"a2"}',
    b'{"type":"a.zzz","id":"a3"}',
    b'{"type":"b.xxx","name":"b1"}',
    b'{"type":"b.yyy","name":"b2"}',
    b'{"type":"b.zzz","name":"b3"}',
    # ...
]

class TypeA(msgspec.Struct):
    type: str
    id: str

class TypeB(msgspec.Struct):
    type: str
    name: str

decoder = msgspec.json.Decoder()

for each in lst:
    dic = decoder.decode(each)
    type_str = dic["type"].split(".")[0]

    if type_str == "a":
        type_ = TypeA
    elif type_str == "b":
        type_ = TypeB
    # or other types
    else:
        continue

    m = msgspec.convert(dic, type=type_)
    print(m.type)

I have tried a tagged union, but then I need to define a separate Struct for every tag value:

import msgspec

class TypeA(msgspec.Struct, tag_field="type"):
    id: str

class TypeAX(TypeA, tag="a.xxx"):
    pass

class TypeAY(TypeA, tag="a.yyy"):
    pass

class TypeAZ(TypeA, tag="a.zzz"):
    pass

decoder = msgspec.json.Decoder(type=TypeAX | TypeAY | TypeAZ)

Do you have any better suggestions for this situation?

unights commented 1 month ago

Maybe I should not use a tagged union in this case.

https://github.com/jcrist/msgspec/issues/338#issuecomment-1452442812

"In my experience, the tag is an artifact of serialization/deserialization, and not something that needs to be accessed on the struct itself at runtime."