dottxt-ai / outlines

Structured Text Generation
https://dottxt-ai.github.io/outlines/
Apache License 2.0
8.25k stars · 417 forks

Cache `create_fsm_index_end_to_end` #339

Open rlouf opened 10 months ago

rlouf commented 10 months ago

What behavior of the library made you think about the improvement?

Index compilation still takes a substantial amount of time even after it has been compiled once. See the profiling results here.

How would you like it to behave?

I would like the index to be fully cached after the first compilation.
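A persistent cache along these lines could be sketched as follows. This is a hypothetical illustration, not the library's implementation: `cached_index` and `build_fn` are made-up names, with `build_fn` standing in for the actual compilation step (e.g. `create_fsm_index_end_to_end`), and the cache key is derived from the regex and the tokenizer vocabulary.

```python
import hashlib
import os
import pickle
import tempfile

# Hypothetical on-disk cache location; the real library would pick
# a proper, configurable cache directory.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "outlines_index_cache")


def cached_index(regex_string, vocabulary, build_fn):
    """Load a compiled index from disk if present, else build and persist it.

    `build_fn(regex_string, vocabulary)` stands in for the actual index
    compilation; the cache key hashes the regex plus the vocabulary so a
    change to either invalidates the cached entry.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    key_material = regex_string + "\x00" + "\x00".join(sorted(vocabulary))
    key = hashlib.sha256(key_material.encode()).hexdigest()
    path = os.path.join(CACHE_DIR, f"{key}.pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    index = build_fn(regex_string, vocabulary)
    with open(path, "wb") as f:
        pickle.dump(index, f)
    return index
```

The catch, as discussed below, is that the compiled artifacts must actually be picklable for such a scheme to work.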

brandonwillard commented 10 months ago

This is related to https://github.com/outlines-dev/outlines/issues/303 by way of the `FSMInfo` objects (one of the arguments to `create_fsm_index_end_to_end`). The latter contain Numba typed collections, which are also not serializable (see https://github.com/numba/numba/issues/4698).

We could change `create_fsm_index_end_to_end` so that it takes the `BetterFSM` objects directly, and, by making the serialization of `BetterFSM` instances ignore their `BetterFSM._fsm_info` values, we should be able to avoid that issue. It would require re-creating `BetterFSM._fsm_info` after deserialization, but we could at least cache those in memory so that they wouldn't need to be recreated during the lifetime of a Python session.