Libnano implements parameter-free and flexible machine learning algorithms, complemented by an extensive collection of numerical optimization algorithms. The implementation is cross-platform (tested on recent versions of Linux, macOS and Windows), has minimal dependencies (the standard library and Eigen3), and follows recent C++ standards and core guidelines. The library uses modern CMake, so it is easy to install and to package.
The library implements state-of-the-art algorithms for both unconstrained and constrained numerical optimization problems. Additionally, built-in test functions with varying numbers of dimensions are provided for benchmarking these algorithms. Some of these test functions are specific to ML applications, such as logistic regression or multivariate linear regression with various loss functions and synthetic data.
Examples:
| Algorithm | Application |
|---|---|
| L-BFGS | unconstrained smooth nonlinear optimization |
| quasi-Newton methods (e.g. BFGS) | unconstrained smooth nonlinear optimization |
| non-linear conjugate gradient descent (CGD) methods | unconstrained smooth nonlinear optimization |
| optimal sub-gradient algorithm (OSGA) | unconstrained smooth/non-smooth nonlinear optimization |
| proximal bundle methods (e.g. RQB) | unconstrained smooth/non-smooth nonlinear optimization |
| primal-dual interior-point method | linear and quadratic programs |
| penalty methods | constrained nonlinear optimization |
| augmented Lagrangian method | constrained nonlinear optimization |
The machine learning (ML) module is designed to be as generic and as customizable as possible. As such, various important ML concepts (e.g. loss function, hyper-parameter tuning strategy, numerical optimization solver, dataset splitting strategy, feature generation, weak learner) are modelled using orthogonal interfaces which the user can extend for particular machine learning applications. Additionally, the implementation strictly follows the scientific principles of statistical learning to properly tune and evaluate the ML models.
In particular, the following requirements were considered when designing the API:
- all ML models should work with any feature type (e.g. categorical, continuous or structured) and even with missing feature values.
- all features should be labeled with a meaningful name, a type (e.g. categorical, continuous), a shape (if applicable) and labels (if applicable). This is important for model analysis (e.g. feature importance) and debugging, and for designing appropriate feature selection methods (e.g. weak learners) and feature generation.
- an ML dataset must be able to handle arbitrary sets of features efficiently (e.g. categorical, continuous, or structured like images or time series). Feature values can be optional (missing) and of different storage types (e.g. signed or unsigned integers of various byte sizes, single or double precision floating point numbers). Additional features can be constructed on the fly or cached by implementing the appropriate interface.
- all ML models should work with any loss function. This is modelled using an appropriate interface which the user can extend. For example, two of the most widely used ML models, linear models and gradient boosting, are easy to extend to any loss function.
- the hyper-parameters that regularize ML models are tuned automatically using standard model evaluation protocols (e.g. cross-validation). The hyper-parameter values are fixed a priori to reduce the risk of overfitting the validation dataset or of adjusting the parameter grid based on results on the test dataset. The user can override the tuning strategy (e.g. local search) and the evaluation protocol (e.g. cross-validation, bootstrapping) using the appropriate interfaces. For example, the built-in parameter grid is adapted to the standard regularization methods of linear models (e.g. lasso, ridge, elastic net). Note that models with few hyper-parameters are preferred, as they are simpler to understand and to tune.
- gradient boosting models should work with arbitrary weak learners. Standard weak learners (e.g. decision trees, decision stumps, look-up tables, linear models) are built in. The user can implement new weak learners using the appropriate interface.
Note that the library makes heavy use of its own implementation of tensors of arbitrary rank and scalar type, designed for machine learning applications. It uses Eigen3 as the backend, so fast and easy-to-use linear algebra operations are readily available.
Examples:
| Interface | Description | Examples of builtin implementations |
|---|---|---|
| loss_t | loss function | hinge, logistic, mse |
| datasource_t | in-memory collection of potentially heterogeneous features | missing feature values, continuous or categorical features |
| generator_t | feature generation | 2D gradient, pairwise product |
| splitter_t | dataset splitting | bootstrapping, k-fold cross-validation, random splitting |
| tuner_t | hyper-parameter tuning | local search, quadratic model |
| wlearner_t | weak learner | look-up table, decision stump, decision tree |
| linear_t | linear model | lasso, ridge regression, elastic net |