Cellij (pronounced "zillīj", derived from Zellij: a style of mosaic tilework made from individually hand-chiseled tile pieces) is a versatile framework for rapidly building and training a wide range of factor analysis models on multi-omics data. Cellij builds upon a Bayesian factor analysis skeleton that is designed to be customisable at all levels, from likelihoods and optimisation procedures to sparsity-inducing priors.
Cellij is designed for rapid prototyping of custom factor analysis models, allowing users to efficiently define new models in an iterative fashion. The following code snippet shows how to set up and train a model with a predefined sparsity prior.
import cellij

# Load the example CLL dataset as a MuData object
mdata = cellij.Importer().load_CLL()

# 1. We create a new Factor Analysis model
model = cellij.FactorModel(n_factors=10)

# 2. We add a MuData object to the model
model.add_data(mdata)

# 3. We can add some options if we wish,
#    here a Horseshoe sparsity prior on the weights of each view
model.set_model_options(
    weight_priors={
        "drugs": "Horseshoe",
        "methylation": "Horseshoe",
        "mrna": "Horseshoe",
    },
)

# 4. We train the model
model.fit(epochs=10000)
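For intuition about what a sparsity-inducing prior such as the "Horseshoe" option does, the sketch below draws weights from a textbook horseshoe prior (normal weights whose scale is a half-Cauchy local scale times a global scale) and compares them with standard normal draws. It uses only the Python standard library and is an illustration, not Cellij's implementation: the function names, the `tau` parameter, and the thresholds are hypothetical, and Cellij's own parameterisation may differ.

```python
import math
import random


def sample_horseshoe(n, tau=1.0, seed=0):
    """Draw n weights from a textbook horseshoe prior (illustrative only).

    w_j ~ Normal(0, (tau * lambda_j)^2),  lambda_j ~ HalfCauchy(0, 1)
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n):
        # Half-Cauchy draw via the inverse CDF of a standard Cauchy
        local_scale = abs(math.tan(math.pi * (rng.random() - 0.5)))
        weights.append(rng.gauss(0.0, tau * local_scale))
    return weights


def sample_gaussian(n, seed=0):
    """Draw n weights from a standard normal prior for comparison."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def fraction(values, predicate):
    """Share of values for which predicate(value) is true."""
    return sum(1 for v in values if predicate(v)) / len(values)


horseshoe = sample_horseshoe(50_000)
gaussian = sample_gaussian(50_000)

for name, draws in [("horseshoe", horseshoe), ("gaussian", gaussian)]:
    near_zero = fraction(draws, lambda w: abs(w) < 0.1)
    large = fraction(draws, lambda w: abs(w) > 3.0)
    print(f"{name:>9}: |w| < 0.1 -> {near_zero:.3f}, |w| > 3 -> {large:.3f}")
```

The horseshoe draws put more mass both very close to zero and far out in the tails than the Gaussian draws. This is why the prior shrinks most weights towards zero while leaving a few large ones mostly untouched.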
For basic tutorials on real-world data, please have a look at our notebook repository.
Cellij is a batteries-included framework:
Please refer to the documentation. In particular, the
You need to have Python 3.8 or newer installed on your system. If you don't have Python installed, we recommend installing Mambaforge.
There are several alternative options to install cellij:
pip install git+https://github.com/bioFAM/cellij.git@main
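After installing, a quick sanity check can confirm that the interpreter meets the stated minimum version and that the package can be found. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the import name `cellij` is taken from the usage snippet in this README.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def cellij_is_installed():
    """Return True if a top-level module named 'cellij' can be located."""
    return importlib.util.find_spec("cellij") is not None


print(f"Python >= 3.8: {python_is_supported()}")
print(f"cellij found:  {cellij_is_installed()}")
```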
See the changelog.
We appreciate all contributions. If you have found a bug, feel free to submit a fix without any further discussion.
If you intend to introduce new features, utility functions, or extensions to the core, please open an issue first to start a discussion. This allows us to align the proposed changes with our current development direction. A pull request submitted without prior discussion may be rejected if it does not fit the intended direction of the core, which you may not be aware of.
Cellij has a BSD-style license, as found in the LICENSE file.
If you use Cellij, please consider citing:
@proceedings{rohbeckcellij,
  author = {Rohbeck, Martin and Qoku, Arber and Treis, Tim and Theis, Fabian J and Velten, Britta and Buettner, Florian and Stegle, Oliver},
  title = {Cellij: A Modular Factor Model Framework for Interpretable and Accelerated Multi-Omics Data Integration},
  series = {ICML Workshop on Computational Biology},
  year = {2023},
  url = {https://icml-compbio.github.io/2023/papers/WCBICML2023_paper124.pdf}
}