Frequently it's not enough to have an in-memory representation of a model or the optimization algorithm state. We want to serialize the structure, parameters, and state of the models and sub-learners that we're working with.
I'd like to take this one step further and define a generic "spec" for these stateful items: something we can serialize/deserialize, but which could also be loaded into another "backend" easily. I'm using the same terminology as Plots on purpose... I think there are a lot of similarities in the problems we're looking to solve.
Some examples of backends:

- Transformations, etc
- Optim
- TensorFlow
- Theano
- BayesNet
The idea is that, where there is overlapping functionality, there is an opportunity to generalize. Suppose we have a general concept: "I want a 3-layer neural net with these numbers of nodes and relu activations, and these initial weight values". Lots of software implements this. If we build this information into a generic spec (similar to the `Plot` object in Plots), then we only need to connect the spec to a backend's constructor, and we have the ability to convert and transfer models between backends.
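As a minimal sketch of what I mean (all names here -- `LayerSpec`, `NeuralNetSpec`, `Backend`, `build` -- are hypothetical, not an existing JuliaML API), the spec could be a plain container of hyperparameters and initial values, and each backend would supply one constructor method:

```julia
# Hypothetical sketch of a backend-agnostic neural net spec.
struct LayerSpec
    nodes::Int          # number of nodes in this layer
    activation::Symbol  # e.g. :relu, :identity
end

struct NeuralNetSpec
    layers::Vector{LayerSpec}
    init_weights::Vector{Matrix{Float64}}  # one matrix per layer connection
end

# "a 3-layer neural net with these numbers of nodes and relu activations,
# and these initial weight values"
spec = NeuralNetSpec(
    [LayerSpec(128, :relu), LayerSpec(64, :relu), LayerSpec(10, :identity)],
    [randn(128, 784), randn(64, 128), randn(10, 64)],
)

# Each backend needs only one method that translates the spec into its own
# native model object (stubbed here), mirroring how Plots dispatches on a
# backend type.
abstract type Backend end
struct TensorFlowBackend <: Backend end

build(::TensorFlowBackend, spec::NeuralNetSpec) =
    error("stub: would construct the equivalent TensorFlow graph")
```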
The same goes for optimization routines... often there is a 1-1 mapping between, for example, an Adam optimizer in TensorFlow and one in Theano. But each backend reimplements the concept in a different way, completely reinventing the wheel. The Plots model applies here as well: define a generic "Adam updater" concept, then map from it to each backend's implementation.
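Reusing the hypothetical `Backend` type from the sketch above, a generic Adam spec would carry just the shared hyperparameters, and each backend would map that one concept onto its own implementation (the strings below are placeholders for the real backend calls):

```julia
# Hypothetical generic Adam spec: only the hyperparameters the backends share.
struct AdamSpec
    lr::Float64
    beta1::Float64
    beta2::Float64
    eps::Float64
end
AdamSpec(; lr = 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8) =
    AdamSpec(lr, beta1, beta2, eps)

struct TheanoBackend <: Backend end

# One generic concept, one mapping per backend; the returned strings stand
# in for actually constructing each backend's native updater object.
updater(::TensorFlowBackend, s::AdamSpec) =
    "tf.train.AdamOptimizer(learning_rate=$(s.lr), beta1=$(s.beta1), beta2=$(s.beta2))"
updater(::TheanoBackend, s::AdamSpec) =
    "lasagne.updates.adam(learning_rate=$(s.lr), beta1=$(s.beta1), beta2=$(s.beta2))"
```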
The end result is that I'd like models and learning algorithms to be built from specs, which then get mapped to sub-learners specific to a particular backend. This would allow us to serialize/deserialize a backend-agnostic spec, rather than actual Julia objects.
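For instance (a sketch using the hypothetical `AdamSpec` above; `to_dict` is also a made-up helper), lowering the spec to plain dicts and arrays before writing it out keeps the serialized form readable by any backend, in any language:

```julia
using JSON  # assumes JSON.jl is installed

# Lower the spec to plain data -- no Julia types survive the round trip,
# so any backend (or any language) can read it back.
to_dict(s::AdamSpec) = Dict(
    "type"  => "Adam",
    "lr"    => s.lr,
    "beta1" => s.beta1,
    "beta2" => s.beta2,
    "eps"   => s.eps,
)

println(JSON.json(to_dict(AdamSpec())))
# prints something like: {"type":"Adam","lr":0.001,...} (key order may vary)
```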
For example, it would allow us to experiment with structures/models/algos in something more flexible (JuliaML, I hope), and then convert to a TensorFlow graph for pounding the GPU or for sharing with stubborn researchers who aren't using Julia.
This is closely related to https://github.com/tbreloff/Plots.jl/issues/390, and I think the design decisions can be shared between the two.