explosion / srsly

🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)
MIT License

How to use `srsly.msgpack_dumps` with my custom class? #20

Closed: tamuhey closed this 3 years ago

tamuhey commented 4 years ago

I want to serialize my custom class with srsly.msgpack_dumps, because an instance of it is stored in a spacy.Doc. In other words, doc.to_disk fails because my custom class cannot be serialized with srsly.msgpack_dumps. How can I make my custom class serializable so that the Doc can be saved?

adrianeboyd commented 4 years ago

Hi, did you figure out a solution to this? If not, what does your custom class look like and what errors do you see with msgpack?

tamuhey commented 4 years ago

Hi, I haven't fixed this issue yet. The class is here: https://github.com/PKSHATechnology-Research/camphr/blob/4e364543d65124a4da444044217f5470a11a969d/camphr/torch_utils.py#L44

And the error message is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-0c69693f734e> in <module>
----> 1 doc.to_bytes()

doc.pyx in spacy.tokens.doc.Doc.to_bytes()

~/Library/Caches/pypoetry/virtualenvs/camphr-v19AnSgn-py3.7/lib/python3.7/site-packages/spacy/util.py in to_bytes(getters, exclude)
    623         # Split to support file names like meta.json
    624         if key.split(".")[0] not in exclude:
--> 625             serialized[key] = getter()
    626     return srsly.msgpack_dumps(serialized)
    627 

doc.pyx in spacy.tokens.doc.Doc.to_bytes.lambda8()

~/Library/Caches/pypoetry/virtualenvs/camphr-v19AnSgn-py3.7/lib/python3.7/site-packages/srsly/_msgpack_api.py in msgpack_dumps(data)
     14     RETURNS (bytes): The serialized bytes.
     15     """
---> 16     return msgpack.dumps(data, use_bin_type=True)
     17 
     18 

~/Library/Caches/pypoetry/virtualenvs/camphr-v19AnSgn-py3.7/lib/python3.7/site-packages/srsly/msgpack/__init__.py in packb(o, **kwargs)
     38     Pack an object and return the packed bytes.
     39     """
---> 40     return Packer(**kwargs).pack(o)
     41 
     42 

_packer.pyx in srsly.msgpack._packer.Packer.pack()

_packer.pyx in srsly.msgpack._packer.Packer.pack()

_packer.pyx in srsly.msgpack._packer.Packer.pack()

_packer.pyx in srsly.msgpack._packer.Packer._pack()

_packer.pyx in srsly.msgpack._packer.Packer._pack()

TypeError: can not serialize 'TransformersInput' object

ines commented 3 years ago

Just released v2.4.0 of srsly, which includes support for this via function registries. See the PR for an example: https://github.com/explosion/srsly/pull/47

The feature is currently still considered semi-internal, and we might make some changes to the function API in a future update and remove that chain argument (which is kinda annoying and shouldn't leak into the user API, but it's a leftover from the original msgpack API).
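
For anyone landing here later, below is a minimal sketch of what registering a custom encoder/decoder pair might look like, assuming srsly v2.4.0 or later is installed and that the registries are exposed as srsly.msgpack_encoders and srsly.msgpack_decoders. The Point class, the "point" registry name and the "__point__" tag are hypothetical placeholders for your own class, not part of srsly. Since the function API is described above as semi-internal, check the linked PR for the current form before relying on this.

```python
from dataclasses import asdict, dataclass

import srsly


@dataclass
class Point:
    # Hypothetical custom class, standing in for the one from the question.
    x: int
    y: int


def encode_point(obj, chain=None):
    # Convert a Point into a tagged dict that msgpack can pack natively.
    if isinstance(obj, Point):
        return {"__point__": asdict(obj)}
    # Not a Point: pass the object on to the next encoder, if there is one.
    return obj if chain is None else chain(obj)


def decode_point(obj, chain=None):
    # Rebuild a Point from the tagged dict produced by encode_point.
    if isinstance(obj, dict) and "__point__" in obj:
        return Point(**obj["__point__"])
    # Not a tagged dict: pass it on to the next decoder, if there is one.
    return obj if chain is None else chain(obj)


# "point" is an arbitrary, hypothetical name for the registry entries.
srsly.msgpack_encoders.register("point", func=encode_point)
srsly.msgpack_decoders.register("point", func=decode_point)

data = {"a": 123, "p": Point(1, 2)}
restored = srsly.msgpack_loads(srsly.msgpack_dumps(data))
assert restored == data
```

Wrapping the payload in a dict with a distinctive key is only one possible convention. It is used here so the decoder can recognize its own objects and leave every other dict untouched.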