🌟 Exciting Update: We're delighted to introduce the brand new v0.1 documentation for Auto-Sklong! For a deep dive into the library's capabilities and features, please visit here.
🎉 PyPI is available!: We published Auto-Sklong, here!
Auto-Scikit-Longitudinal, also called Auto-Sklong, is an automated machine learning (AutoML) library designed to analyse longitudinal data (currently focused on classification tasks) using various search methods, namely Bayesian Optimisation via SMAC3, Asynchronous Successive Halving, Evolutionary Algorithms, and Random Search via the General Automated Machine Learning Assistant (GAMA).
Auto-Sklong, built upon GAMA, offers a brand-new search space to tackle longitudinal machine learning classification problems, with a user-friendly interface similar to the popular Scikit-Learn paradigm.
For further information, please visit the official documentation.
To install Auto-Sklong, take these two easy steps:
Install Auto-Sklong: pip install Auto-Sklong
You can also install a different version of the library by specifying the version number, e.g. pip install Auto-Sklong==0.0.1.
Refer to Release Notes
Auto-Sklong incorporates, via Sklong, a modified version of Scikit-Learn called Scikit-Lexicographical-Trees, which can be found at this PyPI link.
This revised version guarantees compatibility with the unique features of Scikit-Longitudinal. Nevertheless, conflicts may occur with other dependencies in Auto-Sklong that also require Scikit-Learn.
Follow these steps to prevent any issues when running your project.
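Before changing anything in an environment, it can help to see which of the relevant distributions are actually installed. The snippet below is a minimal diagnostic sketch using only the standard library. The distribution names `scikit-learn` and `scikit-lexicographical-trees` are assumptions based on the package names mentioned above, and treating "both installed" as a potential conflict is likewise an assumption, not a rule stated by the library.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names are assumptions based on the packages named in this README.
PACKAGES = ("Auto-Sklong", "scikit-learn", "scikit-lexicographical-trees")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def has_sklearn_conflict(versions):
    """Return True when both Scikit-Learn variants are installed side by side."""
    return (
        versions.get("scikit-learn") is not None
        and versions.get("scikit-lexicographical-trees") is not None
    )


versions = installed_versions()
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
if has_sklearn_conflict(versions):
    print("Both scikit-learn and scikit-lexicographical-trees are present; they may conflict.")
```

Running this inside the project's virtual environment gives a quick picture of the installed versions before any dependency is removed or reinstalled.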
We improved @PGijsbers' open-source GAMA initiative to propose a new search space that leverages our other newly designed library, Scikit-Longitudinal (Sklong), in order to tackle longitudinal classification problems via Combined Algorithm Selection and Hyperparameter Optimization (CASH Optimization).
To the best of our knowledge, this was previously not possible with GAMA or any other AutoML library (refer nonetheless to the Related Projects section in the official documentation).
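To make the CASH idea concrete, here is a toy sketch in plain Python: the search samples an algorithm and its hyperparameters jointly, evaluates each candidate on held-out data, and keeps the best. Everything in it (the two tiny classifiers, the 1-D dataset, the names `SEARCH_SPACE` and `cash_random_search`) is hypothetical and for illustration only. It uses plain random search and does not reflect Auto-Sklong's or GAMA's actual search space or implementation.

```python
import random


def make_data(n, rng):
    """Toy 1-D dataset: the label is 1 when x > 0.5."""
    return [(x, int(x > 0.5)) for x in (rng.random() for _ in range(n))]


def threshold_predict(train, x, threshold):
    """Predict 1 when x exceeds a fixed threshold (training data is unused)."""
    return int(x > threshold)


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (k is odd, so no ties)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)


# Joint search space: each algorithm is paired with its own hyperparameter sampler.
SEARCH_SPACE = {
    "threshold": (threshold_predict, lambda rng: {"threshold": rng.uniform(0.0, 1.0)}),
    "knn": (knn_predict, lambda rng: {"k": rng.choice([1, 3, 5, 7])}),
}


def accuracy(predict, params, train, valid):
    """Fraction of validation points classified correctly."""
    hits = sum(predict(train, x, **params) == y for x, y in valid)
    return hits / len(valid)


def cash_random_search(train, valid, n_trials, seed=0):
    """Sample (algorithm, hyperparameters) pairs and keep the best-scoring one."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(SEARCH_SPACE))
        predict, sample_params = SEARCH_SPACE[name]
        params = sample_params(rng)
        score = accuracy(predict, params, train, valid)
        if best is None or score > best[0]:
            best = (score, name, params)
    return best


data_rng = random.Random(42)
train, valid = make_data(200, data_rng), make_data(100, data_rng)
best_score, best_algorithm, best_params = cash_random_search(train, valid, n_trials=30)
print(best_algorithm, best_params, round(best_score, 3))
```

The point of the sketch is that the algorithm choice and its hyperparameters are searched together as one configuration, rather than tuning each algorithm separately and comparing afterwards.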
While GAMA offers a way to update the search space, we had to improve GAMA to support a couple of new features, as follows.
Note, however, that in the coming months the current version of Auto-Sklong may change quickly due to the following ongoing pull requests on GAMA:
As soon as we are able to publish those on GAMA, there will be a compatibility refactoring to align Auto-Sklong with the most recent version of GAMA. As a result, this section will then be removed.
For developers looking to contribute, please refer to the Contributing section of GAMA here and Scikit-Longitudinal here.
Auto-Sklong is compatible with the following operating systems:
To perform AutoML on your longitudinal analysis with Auto-Sklong, follow these two easy steps.
First, load and prepare your dataset using the LongitudinalDataset class of Sklong.
Second, use the GamaLongitudinalClassifier class of Auto-Sklong. After instantiating it, set its hyperparameters or keep the defaults; you can then apply the familiar fit, predict, and predict_proba methods in the same way as in Scikit-learn, as shown in the example below. It will then automatically search for the best model and hyperparameters for your dataset.
Refer to the documentation for more information on the GamaLongitudinalClassifier class.
from sklearn.metrics import classification_report
from scikit_longitudinal.data_preparation import LongitudinalDataset
from gama.GamaLongitudinalClassifier import GamaLongitudinalClassifier

# Load your longitudinal dataset and split it into train and test sets
dataset = LongitudinalDataset('./stroke.csv')
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="elsa")

# Instantiate the AutoML system
automl = GamaLongitudinalClassifier(
    features_group=dataset.features_group(),
    non_longitudinal_features=dataset.non_longitudinal_features(),
    feature_list_names=dataset.data.columns,
)

# Run the AutoML system to find the best model and hyperparameters
automl.fit(dataset.X_train, dataset.y_train)

# Predictions and prediction probabilities on the held-out test set
label_predictions = automl.predict(dataset.X_test)
probability_predictions = automl.predict_proba(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, label_predictions))

# Export a reproducible script of the champion model
automl.export_script()
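The `features_group` passed to the classifier above encodes the temporal dependencies between columns. As a rough illustration, the sketch below builds such a structure by hand, assuming it is a list of lists of column indices in which each inner list holds the waves of one longitudinal attribute in chronological order. That layout, the `_wave_N` column naming pattern, and the helper `build_features_group` are assumptions made for this example, so check the Sklong documentation for the exact format expected.

```python
import re
from collections import defaultdict


def build_features_group(columns):
    """Group column indices by attribute name, ordered by wave number.

    Returns (features_group, non_longitudinal), where features_group is a list
    of index lists (one per longitudinal attribute) and non_longitudinal holds
    the indices of columns that do not match the `<name>_wave_<N>` pattern.
    """
    pattern = re.compile(r"^(?P<name>.+)_wave_(?P<wave>\d+)$")
    waves = defaultdict(list)
    non_longitudinal = []
    for index, column in enumerate(columns):
        match = pattern.match(column)
        if match:
            waves[match["name"]].append((int(match["wave"]), index))
        else:
            non_longitudinal.append(index)
    # Sort each attribute's entries by wave number, then keep only the indices.
    features_group = [[i for _, i in sorted(entries)] for entries in waves.values()]
    return features_group, non_longitudinal


# Hypothetical column names, for illustration only.
columns = ["age", "smoke_wave_1", "bmi_wave_1", "smoke_wave_2", "bmi_wave_2", "sex"]
features_group, non_longitudinal = build_features_group(columns)
print(features_group)    # [[1, 3], [2, 4]]
print(non_longitudinal)  # [0, 5]
```

Here `smoke` and `bmi` each span two waves, so each gets one inner list, while `age` and `sex` are treated as non-longitudinal features.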
A paper has been submitted to a conference. In the meantime, to cite the repository, use the "How to cite?" button in the top right corner of the repository, or open the following citation file: CITATION.cff.