arnor-sigurdsson / EIR

A toolkit for training deep learning models on genotype, tabular, sequence, image, array and binary data.
https://eir.readthedocs.io/
GNU Affero General Public License v3.0


Supervised modelling, sequence generation, image generation and array output on genotype, tabular, sequence, image, array, and binary input data.

WARNING: This project is in the alpha phase. Expect backwards-incompatible changes and API changes.

Table of Contents

  1. Install
  2. Usage
  3. Use Cases
  4. Features
  5. Supported Inputs and Outputs
  6. Related Projects
  7. Citation
  8. Acknowledgements

Install

Installing EIR via pip

pip install eir-dl

Important: The latest version of EIR supports Python 3.12. Using an older version of Python will install an outdated version of EIR, which will likely be incompatible with the current documentation and might contain bugs. Please ensure that you are installing EIR in a Python 3.12 environment.
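To confirm the environment before installing, a small check along the following lines can help. This is only a sketch: the names `REQUIRED` and `python_matches` are hypothetical and not part of EIR, and the exact-match rule (major.minor equal to 3.12) is an assumption drawn from the note above, not a documented constraint of the package.

```python
import sys

# Assumption: the required version comes from the note above (Python 3.12).
REQUIRED = (3, 12)


def python_matches(required=REQUIRED, version_info=None):
    """Return True if the interpreter's major.minor version equals `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_matches():
        print(f"Python {current}: OK to run 'pip install eir-dl'")
    else:
        print(f"Python {current}: pip may resolve to an outdated EIR release")
```

Run it with the same interpreter that pip will use, so the check reflects the environment EIR is actually installed into.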

Installing EIR via Container Engine

Here's an example with Docker:

docker build -t eir:latest https://raw.githubusercontent.com/arnor-sigurdsson/EIR/master/Dockerfile
docker run -d --name eir_container eir:latest
docker exec -it eir_container bash

Usage

Please refer to the Documentation for examples and information.

Use Cases

EIR allows for training and evaluating various deep-learning models directly from the command line. This can be useful for:

If you are an ML/DL researcher developing new models, etc., it might not fit your use case. However, it might provide a quick baseline for comparison to the cool stuff you are developing, and there is some degree of customization possible.

Features

Supported Inputs and Outputs

Modality   Input   Output
Genotype   x       †
Tabular    x       x
Sequence   x       x
Image      x       x
Array      x       x
Binary     x

† While not directly supported, genotypes can be treated as arrays. For an example, see the MNIST Digit Generation tutorial.

Related Projects

Citation

If you use EIR in a scientific publication, we would appreciate it if you could use one of the following citations:

@article{10.1093/nar/gkad373,
    author    = {Sigurdsson, Arn{\'o}r I and Louloudis, Ioannis and Banasik, Karina and Westergaard, David and Winther, Ole and Lund, Ole and Ostrowski, Sisse Rye and Erikstrup, Christian and Pedersen, Ole Birger Vesterager and Nyegaard, Mette and DBDS Genomic Consortium and Brunak, S{\o}ren and Vilhj{\'a}lmsson, Bjarni J and Rasmussen, Simon},
    title     = {{Deep integrative models for large-scale human genomics}},
    journal   = {Nucleic Acids Research},
    month     = {05},
    year      = {2023}
}

@article{sigurdsson2022improved,
    author    = {Sigurdsson, Arnor Ingi and Ravn, Kirstine and Winther, Ole and Lund, Ole and Brunak, S{\o}ren and Vilhjalmsson, Bjarni J and Rasmussen, Simon},
    title     = {Improved prediction of blood biomarkers using deep learning},
    journal   = {medRxiv},
    pages     = {2022--10},
    year      = {2022},
    publisher = {Cold Spring Harbor Laboratory Press}
}

Acknowledgements

Massive thanks to everyone publishing and developing the packages this project directly and indirectly depends on.