Closed — jacobkahn closed this pull request 1 year ago
@jacobkahn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This pull request was exported from Phabricator. Differential Revision: D42038797
@jacobkahn merged this pull request in flashlight/text@f7b6e13cfde21a45ff30336103356f512b3f15e3.
Summary

Bind Seq2Seq/autoregressive beam search decoders from Flashlight Text to Python.

Two notable subtleties in how the bindings are structured to avoid overhead:

- `EmittingModelStatePtr` (a typedef'ed `std::shared_ptr<void>`) is exposed to Python interop via `std::shared_ptr<py::object>` (which is itself a reference-counted wrapper around `PyObject*`). The `shared_ptr` properly adjusts refcounts of the `py::object`, so there are no round-trip lifetime issues -- if Python tries to garbage collect the autoregressive model state, its refcount remains > 0 while the decoder is still using it. `get_obj_from_emitting_model_state` and `create_emitting_model_state` can create this type from arbitrary Python objects with ~no overhead; the `py::object` refers to the same underlying memory/handle and is copy-on-write.
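The lifetime guarantee above can be illustrated in plain Python, no Flashlight build required: holding an extra strong reference, as the C++ `std::shared_ptr<py::object>` does, keeps the state object alive even after the Python-side name is deleted. The names `ModelState` and `decoder_held_ref` are illustrative stand-ins, not part of the actual bindings:

```python
import sys

class ModelState:
    """Stand-in for arbitrary autoregressive model state (e.g. hidden activations)."""
    def __init__(self, step):
        self.step = step

state = ModelState(step=0)

# The C++ decoder holds a std::shared_ptr<py::object>, which owns a strong
# reference to the underlying PyObject* -- modeled here as a plain Python name.
decoder_held_ref = state

# Refcount is at least 3: the local name, the "decoder's" reference, and the
# temporary reference getrefcount itself takes on its argument.
assert sys.getrefcount(state) >= 3

del state  # the Python-side name goes away...
assert decoder_held_ref.step == 0  # ...but the state survives in the decoder
```

The same mechanism works in reverse: because the refcount is managed by `shared_ptr`, dropping the last C++ reference lets Python reclaim the object normally.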
- `EmittingModelUpdateFunc` is the autoregressive callback defined in Python but called from C++ to get incremental model token scores and model state. This closure is passed from Python to C++ once at decoder construction; a function pointer to the Python callable is stored in C++. Opaque types preclude copies of scores from arguments or return values -- this will be more carefully investigated/improved over time.

Tests are self-documenting for now for the `LexiconFreeSeq2Seq` variant.

Checklist