flashlight / text

Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.
MIT License

Add Pickle support for LexiconFreeDecoder and options #25

Closed. jacobkahn closed this 1 year ago

jacobkahn commented 1 year ago

Summary: Adds pickle support for instances of LexiconFreeDecoderOptions and LexiconFreeDecoder, which is needed for pyper training/integration.

Lexicon-free decoding is the only decoding type that currently supports serialization. It is also the only type for which serialization of any kind makes sense: decoding state is implemented with opaque pointer types, so reproducing it is expensive and requires breaking a lot of abstraction. Serializing a Lexicon or Trie is also difficult because of how they are efficiently constructed in memory, so it is likely more efficient to serialize an uncompressed token set and deserialize it when using a decoder.
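
As a rough illustration of the "serialize an uncompressed token set" idea, here is a minimal Python sketch. The helper names (`dump_tokens`, `build_trie`, `load_trie`, `trie_contains`) and the nested-dict trie are hypothetical stand-ins and are not part of the Flashlight Text API; the point is only that the flat token list is what gets pickled, and the in-memory structure is rebuilt on load.

```python
import pickle

END = None  # hypothetical end-of-token marker key inside the nested-dict trie


def build_trie(tokens):
    """Build a simple nested-dict trie from an iterable of token strings."""
    root = {}
    for token in tokens:
        node = root
        for ch in token:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def trie_contains(trie, token):
    """Return True if `token` was inserted into `trie` as a complete token."""
    node = trie
    for ch in token:
        if ch not in node:
            return False
        node = node[ch]
    return END in node


def dump_tokens(tokens):
    """Pickle only the flat, uncompressed token list (not the trie nodes)."""
    return pickle.dumps(sorted(set(tokens)))


def load_trie(blob):
    """Unpickle the token list and rebuild the trie in memory."""
    return build_trie(pickle.loads(blob))
```

With this approach the serialized form stays independent of the trie's memory layout, at the cost of rebuilding the trie each time a decoder is created.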

Since there's no way to reliably serialize LMs, only LexiconFreeDecoders with ZeroLMs can be serialized.
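
The ZeroLM restriction can be sketched in pure Python as follows. This is a sketch under stated assumptions, not the code from this change: `DecoderOptions`, `ZeroLM`, `ExternalLM`, `Decoder`, and their fields are hypothetical stand-ins for the real bindings. It shows one way to expose pickle support through `__getstate__`/`__setstate__` while refusing to serialize a decoder whose LM carries state that cannot be reliably restored.

```python
import pickle
from dataclasses import dataclass


@dataclass
class DecoderOptions:
    # Hypothetical option fields; the real options class may differ.
    beam_size: int = 100
    beam_threshold: float = 25.0


class ZeroLM:
    """Stand-in for a stateless LM that scores every token as 0."""

    def score(self, token_idx):
        return 0.0


class ExternalLM:
    """Stand-in for an LM backed by external state that cannot be pickled."""

    def __init__(self, path):
        self.path = path


class Decoder:
    """Stand-in decoder that is picklable only when its LM is a ZeroLM."""

    def __init__(self, options, lm, sil_idx, blank_idx):
        self.options = options
        self.lm = lm
        self.sil_idx = sil_idx
        self.blank_idx = blank_idx

    def __getstate__(self):
        # Refuse to serialize anything but a ZeroLM-backed decoder.
        if not isinstance(self.lm, ZeroLM):
            raise pickle.PicklingError(
                "only decoders with a ZeroLM can be pickled"
            )
        # The LM itself is not stored; a fresh ZeroLM is created on load.
        return {
            "options": self.options,
            "sil_idx": self.sil_idx,
            "blank_idx": self.blank_idx,
        }

    def __setstate__(self, state):
        self.options = state["options"]
        self.lm = ZeroLM()
        self.sil_idx = state["sil_idx"]
        self.blank_idx = state["blank_idx"]
```

A round trip such as `pickle.loads(pickle.dumps(decoder))` then restores the options and token indices, while an attempt to pickle a decoder built with `ExternalLM` raises `pickle.PicklingError`.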

Reviewed By: redraven984

Differential Revision: D40951537

facebook-github-bot commented 1 year ago

This pull request was exported from Phabricator. Differential Revision: D40951537
