“Speed is the most important feature.”
Fred Wilson
The Tsetlin Machine library has zero external dependencies and performs well: it achieves over 50 million MNIST predictions per second on a desktop CPU.
Here is a quick "Hello, World!" example of a typical use case.
Importing the necessary functions and MNIST dataset:
using MLDatasets: MNIST
using .Tsetlin: TMInput, TMClassifier, train!, predict, accuracy, save, load, unzip
x_train, y_train = unzip([MNIST(:train)...])
x_test, y_test = unzip([MNIST(:test)...])
Booleanizing input data:
# Each pixel yields two bits: one per threshold (0 and 0.5).
x_train = [TMInput(vec([
    [x > 0 for x in i];
    [x > 0.5 for x in i];
])) for i in x_train]
x_test = [TMInput(vec([
    [x > 0 for x in i];
    [x > 0.5 for x in i];
])) for i in x_test]
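To see what this thresholding produces, here is a minimal sketch on a tiny 2x2 "image" that runs without the library. The helper name booleanize is hypothetical and not part of Tsetlin; it only mirrors the layout used above (one Bool mask per threshold, stacked vertically, then flattened).

```julia
# Hypothetical helper, not part of the library: build one Bool mask per
# threshold, stack the masks vertically, and flatten the result.
function booleanize(img::AbstractMatrix{<:Real}, thresholds=(0.0, 0.5))
    masks = [img .> t for t in thresholds]  # one mask per threshold
    return vec(vcat(masks...))              # same layout as vec([A; B])
end

img = Float32[0.0 0.3; 0.7 1.0]
bits = booleanize(img)

println(length(bits))  # two bits per pixel
println(bits)
```

For a 28x28 MNIST image the same layout gives 28 * 28 * 2 = 1568 Boolean features per sample.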
Some hyperparameters differ from those of the vanilla Tsetlin Machine.
The hyperparameter R is a float in the range 0.0 to 1.0.
To get R from the vanilla S parameter, use the formula R = S / (S + 1).
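The conversion can be sketched in a few lines. The function names s_to_r and r_to_s are hypothetical helpers, not part of the library; the inverse follows from solving R = S / (S + 1) for S.

```julia
# Hypothetical helpers (not part of the library) for converting between
# the vanilla S parameter and the R parameter used here.
s_to_r(s::Real) = s / (s + 1)   # R = S / (S + 1)
r_to_s(r::Real) = r / (1 - r)   # inverse, valid for 0 <= r < 1

println(s_to_r(10))     # 10 / 11
println(r_to_s(0.94))   # roughly 15.67, the S matching R = 0.94
```

Under this formula, the R = 0.94 used in this example corresponds to a vanilla S of roughly 15.67.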
The hyperparameter L limits the number of included literals in a clause.
best_tms_size is the number of best TM models collected during training.
After training, you can save this ensemble of models to disk or increase accuracy by applying Binomial Combinatorial Merge with the combine() function.
const EPOCHS = 1000
const CLAUSES = 2048
const T = 32
const R = 0.94
const L = 12
const best_tms_size = 500
Training the Tsetlin Machine over 1000 epochs and saving the best TM model to disk:
tm = TMClassifier(CLAUSES, T, R, L=L, states_num=256, include_limit=128)
tm_best, tms = train!(tm, x_train, y_train, x_test, y_test, EPOCHS, best_tms_size=best_tms_size, best_tms_compile=true, shuffle=true, batch=true)
save(tm_best, "/tmp/tm_best.tm")
Load the best Tsetlin Machine model and calculate the actual test accuracy:
tm = load("/tmp/tm_best.tm")
println(accuracy(predict(tm, x_test), y_test))
How to run the MNIST example:
cd ./examples
julia --project=. -O3 -t 32 --gcthreads=32,1 mnist_simple.jl
where 32 is the number of your logical CPU cores.
The maximum MNIST inference speed achieved is 52 million predictions per second in batch mode on a desktop Ryzen 7950X3D CPU using 32 threads.
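A predictions-per-second figure is the number of predictions divided by the elapsed wall-clock time. The sketch below shows that arithmetic with a stand-in predictor. It is not the library's benchmark: dummy_predict and throughput are hypothetical names, and the measured rate says nothing about Tsetlin Machine performance.

```julia
# Stand-in predictor (hypothetical): counts set bits, NOT a Tsetlin Machine.
dummy_predict(x::AbstractVector{Bool}) = count(x) % 10

# Time `repeats` passes over `inputs` and return (predictions per second, checksum).
function throughput(f, inputs; repeats=10)
    f(first(inputs))                 # warm-up call so compilation is not timed
    checksum = 0                     # keeps the results in use
    elapsed = @elapsed for _ in 1:repeats, x in inputs
        checksum += f(x)
    end
    return repeats * length(inputs) / elapsed, checksum
end

# 1568 = 28 * 28 * 2 bits, the size of one booleanized MNIST sample.
inputs = [rand(Bool, 1568) for _ in 1:1_000]
rate, _ = throughput(dummy_predict, inputs)
println(round(Int, rate), " predictions per second (stand-in predictor)")
```

This version is single-threaded and calls the predictor one sample at a time, so it only illustrates how the figure is computed, not how batch mode reaches it.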
Trained and optimized models can be found in ./examples/models/.
How to run the MNIST inference benchmark:
cd ./examples
julia --project=. -O3 -t 32 mnist_benchmark_inference.jl
where 32 is the number of your logical CPU cores.
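If you are unsure which value to pass to -t, the following sketch prints how many logical cores Julia detects and how many threads the current session uses. It relies only on Base (Sys.CPU_THREADS and Threads.nthreads()), not on this library.

```julia
# Number of logical CPU cores Julia detects on this machine.
logical_cores = Sys.CPU_THREADS
# Threads available to the current session (set via -t or JULIA_NUM_THREADS).
session_threads = Threads.nthreads()

println("logical cores:  ", logical_cores)
println("julia threads:  ", session_threads)
```

If the second number is lower than the first, the session was started with fewer threads than the machine offers.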