andersen-lab / Freyja

Depth-weighted De-Mixing
BSD 2-Clause "Simplified" License

Freyja


Detailed documentation, including installation, usage, and examples, can be found here.

Freyja is a tool to recover relative lineage abundances from mixed SARS-CoV-2 samples from a sequencing dataset (BAM aligned to the Hu-1 reference). The method uses lineage-determining mutational "barcodes" derived from the UShER global phylogenetic tree as a basis set to solve the constrained (unit sum, non-negative) de-mixing problem.

Freyja is intended as a post-processing step after primer trimming and variant calling in iVar (Grubaugh and Gangavarapu et al., 2019). From measurements of SNV frequency and sequencing depth at each position in the genome, Freyja returns an estimate of the true lineage abundances in the sample.
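To make the constrained de-mixing idea concrete, here is a minimal sketch in plain Python. It is not Freyja's implementation: it swaps in a depth-weighted least-squares objective solved by projected gradient descent, and it assumes a log2(depth + 1) weighting purely for illustration. The names `demix`, `project_to_simplex`, `BARCODES`, `FREQS`, and `DEPTHS` are hypothetical, and the toy barcode matrix is made up rather than taken from the UShER tree.

```python
import math


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold from the largest valid k
    return [max(x - theta, 0.0) for x in v]


def demix(barcodes, freqs, depths, n_iter=2000):
    """Estimate lineage abundances by weighted least squares on the simplex.

    barcodes[i][j] is 1 if lineage i carries the mutation at site j, else 0.
    freqs[j] is the observed SNV frequency and depths[j] the read depth.
    """
    n_lineages = len(barcodes)
    n_sites = len(freqs)
    # Assumed weighting: deeper sites count more, with diminishing returns.
    weights = [math.log2(d + 1.0) for d in depths]
    x = [1.0 / n_lineages] * n_lineages  # start from a uniform mixture
    # Upper bound on the gradient's Lipschitz constant (trace bound).
    lipschitz = 2.0 * sum(
        weights[j] * barcodes[i][j] ** 2
        for i in range(n_lineages)
        for j in range(n_sites)
    )
    if lipschitz == 0.0:
        return x  # no informative sites; fall back to the uniform mixture
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        # Residual between predicted and observed frequency at each site.
        resid = [
            sum(barcodes[i][j] * x[i] for i in range(n_lineages)) - freqs[j]
            for j in range(n_sites)
        ]
        grad = [
            2.0 * sum(weights[j] * barcodes[i][j] * resid[j] for j in range(n_sites))
            for i in range(n_lineages)
        ]
        # Gradient step, then project back onto the unit-sum, non-negative set.
        x = project_to_simplex([x[i] - step * grad[i] for i in range(n_lineages)])
    return x


# Toy data: three hypothetical lineages over five sites.
BARCODES = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1],
]
FREQS = [0.6, 0.9, 0.3, 0.1, 0.4]  # noise-free mix of 60% / 30% / 10%
DEPTHS = [1000, 800, 50, 1200, 900]

if __name__ == "__main__":
    for name, share in zip(["lineage_1", "lineage_2", "lineage_3"], demix(BARCODES, FREQS, DEPTHS)):
        print(f"{name}: {share:.3f}")
```

Because the toy frequencies are an exact mixture, the estimate should land very close to the 60/30/10 split used to build them. With real data the frequencies are noisy and the depth weights decide how much each site is trusted.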

To ensure reproducibility of results, we provide old (timestamped) barcodes and metadata in the separate Freyja-data repository. The barcode version can be checked using the freyja demix --version command.

NOTE: Freyja barcodes are now stored in compressed feather format, as the initial csv barcode file became too large. Specific lineage definitions are now provided here.

Installation via conda

Freyja is written entirely in Python 3, but requires preprocessing by tools like iVar and samtools mpileup to generate the required input data. We recommend using Python 3.10 to take advantage of the Clarabel solver, but Freyja has been tested on Python versions 3.7 through 3.10. First, create an environment for freyja

conda create -n freyja-env

then add the following channels

conda config --add channels bioconda
conda config --add channels conda-forge

and then install freyja

conda install freyja