rust-ml / wg

Coordination repository of the Machine Learning (applicant) Working Group

Prior approaches #2

Open vadixidav opened 4 years ago

vadixidav commented 4 years ago

Existing Rust ML Solutions

Leaf & Collenchyma

This framework focuses narrowly on providing the most basic layers and operations. It is very old and no longer maintained.

Tensors

Leaf is an ML framework that uses its own custom backend-agnostic tensor library called Collenchyma. The tensor type is SharedTensor. This tensor type is not parameterized by any backend, but a backend must be passed when creating the tensor. This means that the backend associated with the tensor can be ignored, which may simplify type bounds on functions that operate on a tensor. The tensor can even hold memory on multiple backends at the same time, hence the "Shared". The only things you can do with the tensor itself are to reshape it and extract its memory.
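A rough sketch of creating one, based on Collenchyma's README (the exact method names may differ slightly):

use collenchyma::prelude::*;

// Allocate a tensor against a CUDA backend.
let backend = Backend::<Cuda>::default().unwrap();
let mut x = SharedTensor::<f32>::new(backend.device(), &(1, 1, 3)).unwrap();

// The same SharedTensor can also hold memory for a native (CPU) backend,
// which is what makes it "shared": data can be synced between devices.
let native = Backend::<Native>::default().unwrap();
x.add_device(native.device()).unwrap();
x.sync(native.device()).unwrap();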

Backends

Collenchyma supports multiple backends: Native, OpenCL, and CUDA. It does this in most places using enums:
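(Paraphrased from Collenchyma's source; the exact variant payload types may differ.)

// Every device belongs to one of the built-in backends.
pub enum DeviceType {
    Native(Cpu),
    OpenCL(OpenCLDevice),
    Cuda(CudaDevice),
}

// Memory allocations are likewise tagged with their owning backend.
pub enum MemoryType {
    Native(FlatBox),
    OpenCL(OpenCLMemory),
    Cuda(CudaMemory),
}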

All types of contexts can be used simultaneously in the same binary. However, since Collenchyma builds all the backends into itself, depending on Collenchyma means you now have Native, OpenCL, and CUDA backend code built into your dependency tree. For this reason, the approach of Collenchyma is probably not that great. This approach does have the advantage that tensors can freely be shared among backends, with the sharing handled by Collenchyma.

The actual operations exist on the backend. For instance, the CUDA backend can execute the sigmoid operation:

backend.sigmoid(&mut x, &mut result).unwrap();

Rusty Machine

This framework is old and not maintained.

Tensors

This framework uses rulinalg for its tensors. That crate has fallen out of favor in comparison to nalgebra. It doesn't even support 3d tensors, so it is not worth considering in a modern application. It also has no GPU support.

Learning

You can find the docs for learning here: https://athemathmo.github.io/rusty-machine/doc/rusty_machine/learning/index.html

It is clear that the emphasis is not on traditional neural networks.
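For example, the typical workflow centers on classical models through the SupModel trait (a sketch based on the documented API; signatures may have changed since):

use rusty_machine::learning::logistic_reg::LogisticRegressor;
use rusty_machine::learning::SupModel;
use rusty_machine::linalg::{Matrix, Vector};

// Train a logistic regression model on a tiny toy dataset.
let inputs = Matrix::new(4, 1, vec![1.0, 3.0, 5.0, 7.0]);
let targets = Vector::new(vec![0.0, 0.0, 1.0, 1.0]);
let mut model = LogisticRegressor::default();
model.train(&inputs, &targets).unwrap();

// Predict on a new sample.
let outputs = model.predict(&Matrix::new(1, 1, vec![6.0])).unwrap();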

mli

This framework was written more recently (by me!). It has enough built-in tools to create basic convolutional neural networks, with some examples. Only native backends are currently supported, and it is not actively maintained at the moment.

Tensors

This framework has no tensor type. Instead, it only supplies abstractions to chain ops together. These ops then typically depend on a tensor type. For instance, the sigmoid op's forward looks like this:

impl Forward for Logistic {
    type Input = f32;
    type Internal = ();
    type Output = f32;

    fn forward(&self, &input: &f32) -> ((), f32) {
        ((), logistic(input))
    }
}
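On its own, the op maps a single scalar (this assumes Logistic is a unit struct that can be constructed directly, with the Forward trait in scope):

// Hypothetical direct use of the scalar sigmoid op.
let (_internal, output) = Logistic.forward(&0.5f32);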

To run this on a whole tensor, you use mli-ndarray:

pub struct Map3One<G>(pub G);

impl<G> Forward for Map3One<G>
where
    G: Forward,
{
    type Input = Array3<G::Input>;
    type Internal = Array3<G::Internal>;
    type Output = Array3<G::Output>;

    fn forward(&self, input: &Self::Input) -> (Self::Internal, Self::Output) {
        let both_vec: Vec<(G::Internal, G::Output)> =
            input.iter().map(|input| self.0.forward(input)).collect();
        let (internal_vec, output_vec) = both_vec.into_iter().fold(
            (vec![], vec![]),
            |(mut internal_vec, mut output_vec), (internal, output)| {
                internal_vec.push(internal);
                output_vec.push(output);
                (internal_vec, output_vec)
            },
        );
        let internal_array = Array::from_shape_vec(input.raw_dim(), internal_vec).unwrap();
        let output_array = Array::from_shape_vec(input.raw_dim(), output_vec).unwrap();
        (internal_array, output_array)
    }
}

This struct wraps an op that operates on one item and lets it span a whole array. There is a similar impl for backpropagation:

impl<G> Backward for Map3One<G>
where
    G: Backward,
    G::TrainDelta: Clone + Add + Zero,
{
    type OutputDelta = Array3<G::OutputDelta>;
    type InputDelta = Array3<G::InputDelta>;
    type TrainDelta = G::TrainDelta;

    fn backward(
        &self,
        input: &Self::Input,
        internal: &Self::Internal,
        output_delta: &Self::OutputDelta,
    ) -> (Self::InputDelta, Self::TrainDelta) {
        let both_vec: Vec<(G::InputDelta, G::TrainDelta)> =
            izip!(input.iter(), internal.iter(), output_delta.iter(),)
                .map(|(input, internal, output_delta)| {
                    self.0.backward(input, internal, output_delta)
                })
                .collect();
        let (input_delta_vec, train_delta_vec) = both_vec.into_iter().fold(
            (vec![], vec![]),
            |(mut input_delta_vec, mut train_delta_vec), (input_delta, train_delta)| {
                input_delta_vec.push(input_delta);
                train_delta_vec.push(train_delta);
                (input_delta_vec, train_delta_vec)
            },
        );
        let input_delta_array = Array::from_shape_vec(input.raw_dim(), input_delta_vec).unwrap();
        let train_delta_array = Array::from_shape_vec(input.raw_dim(), train_delta_vec).unwrap();
        (input_delta_array, train_delta_array.sum())
    }
}

As you can see, the input and output types are specific to ndarray. This means that you can write ops that are as specific as they need to be.
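Putting the pieces together, usage might look like this (a sketch; it assumes Logistic is a unit struct, Map3One comes from mli-ndarray as shown above, and the Forward trait is exported from the mli crate root):

use mli::Forward;
use mli_ndarray::Map3One;
use ndarray::Array3;

// Wrap the scalar sigmoid op so it applies element-wise over a 3D tensor.
// From this point on, this piece of the graph is tied to ndarray.
let op = Map3One(Logistic);
let input: Array3<f32> = Array3::zeros((2, 3, 4));
let (_internal, output) = op.forward(&input);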

Backends

In mli, the backend is chosen by the code itself. For instance, once you have used Map3One from mli-ndarray, that piece of the graph only runs with ndarray. This also means that the graph is "static": it cannot be stored to disk and loaded. That is not a problem in and of itself. Some pros and cons:

Pros

Cons

deep

deep didn't even get past the first PR adding the graph (until I just merged it, right now). Here is what it does:

Tensors

Tensors in deep are specified by the backend. This means that if you are using em (Emu), you can use DeviceBox<[f32]> as your Tensor type. This is non-ideal, since we would like tensors containing types other than f32. Unfortunately, this cannot be done without Generic Associated Types (GATs). A small example from the RFC:

impl PointerFamily for RcFamily {
    type Pointer<T> = Rc<T>;
    fn new<T>(value: T) -> Self::Pointer<T> {
        Rc::new(value)
    }
}

As you can see, the line type Pointer<T> = Rc<T>; creates an associated type with a type parameter. This type parameter then parameterizes Rc. We need this functionality to allow deep to achieve the same thing with tensors:

type Tensor<T> = DeviceBox<[T]>;

As you can see, now we can have backend tensors with arbitrary types. This gets even better:

type Tensor<T, const S> = Array<T, S>;

This is what it would look like once GATs and const generics are merged. This would allow us to pass a shape to the underlying tensor. On native systems, this could mean huge performance gains since algorithms can be tuned at compile-time to work with particular shapes and filter sizes. Unfortunately, this is a far-off thing. A better solution for now might be to use something like em to get a specific tensor type we can parameterize.
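To make the goal concrete, a deep backend trait could eventually look something like this (a hypothetical sketch, not deep's actual API, and it requires GATs):

// Hypothetical backend trait: each backend exposes its own tensor type,
// parameterized by the element type via a generic associated type.
trait Backend {
    type Tensor<T>;

    // An example operation every backend would have to provide.
    fn zeros<T: Default + Clone>(&self, len: usize) -> Self::Tensor<T>;
}

// An Emu-style backend could then plug its device buffer straight in:
// impl Backend for Emu {
//     type Tensor<T> = DeviceBox<[T]>;
//     ...
// }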

xd009642 commented 4 years ago

There's also tract, an inference engine designed for real-time streaming into a graph: https://github.com/snipsco/tract