dottxt-ai / outlines

Structured Text Generation
https://dottxt-ai.github.io/outlines/
Apache License 2.0
8.25k stars · 417 forks

Cache `create_fsm_index_end_to_end` #339

Open rlouf opened 10 months ago

rlouf commented 10 months ago

What behavior of the library made you think about the improvement?

Index compilation still takes a substantial amount of time even after it has been compiled once. See the profiling results here.

How would you like it to behave?

I would like the index to be fully cached after the first compilation.
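A persistent cache along these lines could be sketched as follows. This is a hypothetical illustration, not the library's implementation: `cached_index` and `build_fn` are made-up names, with `build_fn` standing in for the actual compilation step (e.g. `create_fsm_index_end_to_end`), and the cache key is derived from the regex and the tokenizer vocabulary.

```python
import hashlib
import os
import pickle
import tempfile

# Hypothetical on-disk cache location; the real library would pick
# a proper, configurable cache directory.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "outlines_index_cache")


def cached_index(regex_string, vocabulary, build_fn):
    """Load a compiled index from disk if present, else build and persist it.

    `build_fn(regex_string, vocabulary)` stands in for the actual index
    compilation; the cache key hashes the regex plus the vocabulary so a
    change to either invalidates the cached entry.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    key_material = regex_string + "\x00" + "\x00".join(sorted(vocabulary))
    key = hashlib.sha256(key_material.encode()).hexdigest()
    path = os.path.join(CACHE_DIR, f"{key}.pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    index = build_fn(regex_string, vocabulary)
    with open(path, "wb") as f:
        pickle.dump(index, f)
    return index
```

The catch, as discussed below, is that the compiled artifacts must actually be picklable for such a scheme to work.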

brandonwillard commented 10 months ago

This is related to https://github.com/outlines-dev/outlines/issues/303 by way of the `FSMInfo` objects (one of the arguments to `create_fsm_index_end_to_end`). The latter contain Numba typed collections, which are also not serializable (see https://github.com/numba/numba/issues/4698).

We could change `create_fsm_index_end_to_end` so that it takes the `BetterFSM` objects directly, and, by making the serialization of `BetterFSM` instances ignore their `BetterFSM._fsm_info` values, we should be able to avoid that issue. It would require re-creating `BetterFSM._fsm_info` after deserialization, but we could at least cache those in memory so that they wouldn't need to be recreated during the lifetime of a Python session.