pagnani / ArDCA.jl

Autoregressive networks for protein
MIT License

ArDCA


Autoregressive protein model learning through generalized logistic regression in Julia.

Overview

The authors of this code are Jeanne Trinquier, Guido Uguzzoni, Andrea Pagnani, Francesco Zamponi, and Martin Weigt.

See also the Wikipedia article on Direct Coupling Analysis for a general overview of the technique.

The code is written in Julia.

Install

This is a registered package. To install it, press ] in the Julia REPL to enter package mode, then run

pkg> add ArDCA 
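For scripted or non-interactive setups, the same installation can be done through Julia's standard Pkg API instead of the REPL package mode. The snippet below is a minimal sketch that assumes network access to the General registry; it uses only `Pkg.add` and a plain `using` statement.

```julia
# Non-interactive equivalent of `pkg> add ArDCA`.
using Pkg

# Download and install the registered package into the active environment.
Pkg.add("ArDCA")

# Load the package to confirm that the installation succeeded.
using ArDCA
```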

Notebooks

There are two Jupyter notebooks (one for Python, one for Julia) that show how to use the package.

tutorial.ipynb covers the Julia version; arDCA_sklearn.ipynb covers the Python version.

Data

Data for five protein families (PF00014, PF00072, PF00076, PF00595, PF13354) are contained in the companion ArDCAData package.

For didactic purposes, the PF00014 dataset is also included locally in the data folder.

Requirements

The minimum Julia version required to run this code is 1.5. To run it in parallel using Julia's multithreading infrastructure, start Julia with

$> julia -t numcores # numcores can be as large as your available number of threads
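To check that the session actually picked up the requested threads, the thread count can be inspected from within Julia. The sketch below uses only the standard `Base.Threads` module; the function `threaded_squares` is a hypothetical toy workload introduced for illustration and is not part of ArDCA.

```julia
using Base.Threads

# Number of threads this Julia session was started with (set by `-t`).
println("Running with ", nthreads(), " thread(s)")

# Hypothetical toy workload (not part of ArDCA): square each index,
# letting @threads split the loop iterations across the available threads.
function threaded_squares(n::Integer)
    out = zeros(Int, n)
    @threads for i in 1:n
        out[i] = i^2  # each iteration writes to its own slot, so there is no data race
    end
    return out
end

squares = threaded_squares(10)
```

If `nthreads()` reports 1, Julia was started without the `-t` option; setting the `JULIA_NUM_THREADS` environment variable before launching Julia is an alternative way to request more threads.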

Documentation

Development version

License

This project is covered under the MIT License.