ngless-toolkit / ngless

NGLess: NGS with less work
https://ngless.embl.de
Other
142 stars 24 forks source link
bioinformatics bioinformatics-pipeline bwa fastq fastq-format genomics haskell haskell-language metagenomics next-generation-sequencing ngs samtools science

NGLess: NGS Processing with Less Work

NGLess logo Ngless is a domain-specific language for NGS (next-generation sequencing data) processing.

Build & test MIT licensed Install with Bioconda Install with Bioconda Citation for NGLess

For questions and discussions, please use the NGLess mailing list.

If you are using NGLess, please cite:

NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language by Luis Pedro Coelho, Renato Alves, Paulo Monteiro, Jaime Huerta-Cepas, Ana Teresa Freitas, Peer Bork, Microbiome (2019) https://doi.org/10.1186/s40168-019-0684-8

NGLess cartoon

Example

ngless "1.5"
input = fastq(['ctrl1.fq','ctrl2.fq','stim1.fq','stim2.fq'])
input = preprocess(input) using |read|:
    read = read[5:]
    read = substrim(read, min_quality=26)
    if len(read) < 31:
        discard

mapped = map(input,
                reference='hg19')
write(count(mapped, features=['gene']),
        ofile='gene_counts.csv',
        format={csv})

For more information, check the docs. We also have a YouTube tutorial on how to use NGLess and SemiBin together (but you can learn to use NGLess independently of SemiBin).

Installing

See the install documentation for more information.

Bioconda

The recommended way to install NGLess is through bioconda:

conda install -c bioconda ngless 

Docker

Alternatively, a docker container with NGLess is available at docker hub:

docker run -v $PWD:/workdir -w /workdir -it nglesstoolkit/ngless:1.5.0 ngless --version

Adapt the mount flags (-v) as needed.

Linux

You can download a statically linked version of NGless 1.5.0

This should work across a wide range of Linux versions (please report any issues you encounter):

curl -L -O https://github.com/ngless-toolkit/ngless/releases/download/v1.5.0/NGLess-v1.5.0-Linux-static-full
chmod +x NGLess-v1.5.0-Linux-static-full
./NGLess-v1.5.0-Linux-static-full

This downloaded file bundles bwa, samtools and megahit (also statically linked).

From Source

Installing/compiling from source is also possible. Clone https://github.com/ngless-toolkit/ngless

Dependencies

The simplest way to get an environment with all the dependencies is to use conda:

conda create -n ngless
conda activate ngless
conda config --add channels conda-forge
conda install stack cairo bzip2 gmp zlib perl wget xz pkg-config make

You should have gcc installed (or another C-compiler).

The following sequence of commands should download and build the software

git clone https://github.com/ngless-toolkit/ngless
cd ngless
stack setup
make

To install, you can use the following command (replace <PREFIX> with the directory where you wish to install, default is /usr/local):

make make

Running Sample Test Scripts on Local Machine

For developers who have successfully compiled and installed NGless, running the test scripts in the tests folder would be the next line of action to have the output of sample test cases.

cd tests

Once in the tests directory, select any of the test folders to run NGless.

For example, here we would run the regression-fqgz test:

cd regression-fqgz
ngless ungzip.ngl

After running this script open the newly generated folder ungzip.ngl.output_ngless and view the template in the index.html file.

For developers who have done this much more datasets for testing purposes can be referenced and used by reading these documentation links: Human Gut Metagenomics Functional & Taxonomic Profiling Ocean Metagenomics Functional Profiling Ocean Metagenomics Assembly and Gene Prediction

More information

Authors