Libnano implements parameter-free and flexible machine learning algorithms, complemented by an extensive collection of numerical optimization algorithms. The implementation is cross-platform (tested on recent versions of Linux, macOS and Windows), has minimal dependencies (the standard library and Eigen3), and follows recent C++ standards and core guidelines. The library uses modern CMake, so it is easy to install and to package.
The library implements state-of-the-art algorithms for both unconstrained and constrained numerical optimization problems. Additionally, built-in test functions with varying numbers of dimensions are provided for benchmarking these algorithms. Some of these test functions are specific to ML applications, such as logistic regression or multivariate linear regression with various loss functions and synthetic data.
Examples:
| Algorithm | Application |
|---|---|
| L-BFGS | unconstrained smooth nonlinear optimization |
| quasi-Newton methods (e.g. BFGS) | unconstrained smooth nonlinear optimization |
| non-linear conjugate gradient descent (CGD) methods | unconstrained smooth nonlinear optimization |
| optimal sub-gradient algorithm (OSGA) | unconstrained smooth/non-smooth nonlinear optimization |
| proximal bundle methods (e.g. RQB) | unconstrained smooth/non-smooth nonlinear optimization |
| primal-dual interior-point method | linear and quadratic programs |
| penalty methods | constrained nonlinear optimization |
| augmented Lagrangian method | constrained nonlinear optimization |
The machine learning (ML) module is designed to be as generic and as customizable as possible. As such, various important ML concepts (e.g. loss function, hyper-parameter tuning strategy, numerical optimization solver, dataset splitting strategy, feature generation, weak learner) are modelled using orthogonal interfaces which the user can extend for particular machine learning applications. Additionally, the implementation strictly follows the scientific principles of statistical learning to properly tune and evaluate the ML models.
In particular, the following requirements were considered when designing the API:
- all ML models should work with any feature type (e.g. categorical, continuous or structured) and even with missing feature values.
- all features should be labeled with a meaningful name, a type (e.g. categorical, continuous), a shape (if applicable) and labels (if applicable). This is important for model analysis (e.g. feature importance) and debugging, and for designing appropriate feature selection methods (e.g. weak learners) and feature generation.
- an ML dataset must be able to handle arbitrary sets of features efficiently (e.g. categorical, continuous, or structured like images or time series). Feature values can be optional (missing) and of different storage types (e.g. signed or unsigned integers of various byte sizes, single or double precision floating point numbers). Additional features can be constructed on the fly or cached by implementing the appropriate interface.
- all ML models should work with any loss function. This is modelled using an appropriate interface which the user can extend. For example, two of the most widely used ML models, linear models and gradient boosting, are easy to extend to any loss function.
- the hyper-parameters that regularize ML models are tuned automatically using standard model evaluation protocols (e.g. cross-validation). The hyper-parameter values are fixed a priori to reduce the risk of overfitting the validation dataset or of adjusting the parameter grid based on results on the test dataset. The user can override the tuning strategy (e.g. local search) and the evaluation protocol (e.g. cross-validation, bootstrapping) using the appropriate interfaces. For example, the built-in parameter grid is adapted to the standard regularization methods of linear models (e.g. lasso, ridge, elastic net). Note that models with few hyper-parameters are preferred, as they are simpler to understand and to tune.
- gradient boosting models should work with arbitrary weak learners. Standard weak learners (e.g. decision trees, decision stumps, look-up tables, linear models) are built in. The user can implement new weak learners using the appropriate interface.
Note that the library makes heavy use of its own implementation of tensors of arbitrary rank and scalar type, designed for machine learning applications. It uses Eigen3 as the backend, so fast and easy-to-use linear algebra operations are readily available.
Examples:
| Interface | Description | Examples of builtin implementations |
|---|---|---|
| loss_t | loss function | hinge, logistic, mse |
| datasource_t | in-memory collection of potentially heterogeneous features | missing feature values, continuous or categorical features |
| generator_t | feature generation | 2D gradient, pairwise product |
| splitter_t | dataset splitting | bootstrapping, k-fold cross-validation, random splitting |
| tuner_t | hyper-parameter tuning | local search, quadratic model |
| wlearner_t | weak learner | look-up table, decision stump, decision tree |
| linear_t | linear model | lasso, ridge regression, elastic net |