Manatee

Manatee version 1.3

What is Manatee?

Manatee is a tool for detection, quantification, and analysis of small ncRNAs 
from next-generation sequencing data.

DEPENDENCIES

perl
Set::IntervalTree: perl package
SAMtools: needs to be installed and added to your PATH
Bowtie: executable file included in Manatee package, no installation required
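
The dependencies above can be checked before a first run. The following is a minimal sketch, not part of Manatee: the helper names are hypothetical, and it assumes that `perl` and `samtools` are the executable names on your PATH.

```python
import shutil
import subprocess


def missing_executables(names=("perl", "samtools")):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def perl_module_available(module="Set::IntervalTree"):
    """Return True if `perl -M<module> -e 1` exits successfully."""
    try:
        result = subprocess.run(
            ["perl", f"-M{module}", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # perl itself is not installed or not on PATH
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Missing executables:", missing_executables())
    print("Set::IntervalTree available:", perl_module_available())
```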

INSTALLATION (Unix/Linux)

Install the required dependencies and run the main Manatee script as described in the usage sections.

Set::IntervalTree

cpan

install Set::IntervalTree

PACKAGE FILES

The following components are included in the Manatee package.

bowtie-1.0.1       % directory with bowtie aligner

config             % configuration file

Manatee            % Perl core program for sRNA analysis

README.md          % this file

USAGE with configuration file

Syntax:

manatee -config <file> -i <file> -o <dir>

-config	Path to configuration file.
-i	Path to pre-processed FASTQ or FASTA file. Valid formats: .fa, .fasta, .fastq, .fq, .fa.gz, .fasta.gz, .fastq.gz, .fq.gz.
-o	Path to directory where the output will be stored.
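
The list of valid formats for -i can be checked before launching a run. This is a minimal sketch with a hypothetical helper name; it assumes the comparison is case-sensitive, which the option description does not state.

```python
from pathlib import Path

# Extensions listed in the description of the -i option.
VALID_SUFFIXES = (
    ".fa", ".fasta", ".fastq", ".fq",
    ".fa.gz", ".fasta.gz", ".fastq.gz", ".fq.gz",
)


def has_valid_input_suffix(path):
    """Return True if the file name ends with one of the listed extensions."""
    # Assumption: matching is case-sensitive.
    return Path(path).name.endswith(VALID_SUFFIXES)
```

For example, `has_valid_input_suffix("sample.fastq.gz")` is true, while a BAM file would be rejected.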

USAGE with input parameters

Syntax:

manatee [OPTIONS] -i <file> -o <dir> -index <ebwt> -genome <file> -annotation <file>

-i	Path to pre-processed FASTQ or FASTA file. Valid formats: .fa, .fasta, .fastq, .fq, .fa.gz, .fasta.gz, .fastq.gz, .fq.gz.
-o	Path to directory where the output will be stored.
-index	Path and basename of the genome Bowtie index to be searched. The basename is the name of any of the index files up to but not including the final .1.ebwt/.rev.1.ebwt/etc.
-genome	Path to genome FA or FASTA file.
-annotation	Path to non-coding annotation file. The file should contain the following tab-separated elements: chromosome, strand, start locus, end locus, biotype, transcript id, transcript name.
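
To illustrate the annotation layout described for -annotation, here is a minimal sketch of a line parser. It is not Manatee's own parser: the record and function names are hypothetical, and it assumes the seven columns appear in the order listed and that start and end are integers.

```python
from typing import NamedTuple


class AnnotationRecord(NamedTuple):
    chromosome: str
    strand: str
    start: int
    end: int
    biotype: str
    transcript_id: str
    transcript_name: str


def parse_annotation_line(line):
    """Split one tab-separated annotation line into its seven fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated fields, got {len(fields)}")
    chrom, strand, start, end, biotype, tid, tname = fields
    # Assumption: coordinates are plain integers; int() raises ValueError otherwise.
    return AnnotationRecord(chrom, strand, int(start), int(end), biotype, tid, tname)
```

Running every line of an annotation file through such a check catches malformed rows (for example, space-separated columns) before a long alignment run.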

OPTIONS

-t_index	Path and basename of the transcriptome Bowtie index to be searched. The basename is the name of any of the index files up to but not including the final .1.ebwt/.rev.1.ebwt/etc. If left blank (for example, when no transcriptome index exists yet), Manatee generates a transcriptome index from the provided non-coding annotation and stores it in the transcripts directory.
-cores	Number of alignment cores (default: -cores 1).
-collapse	Collapse reads with the same genomic sequences. This setting significantly reduces the execution time. Possible values: yes/no (default: -collapse yes).
-mismatches	Maximum number of mismatches in genomic alignments (default: -mismatches 1).
-m	Maximum number of multimapping loci; passed as -m to Bowtie. The mapping algorithm is applied only to reads whose number of multimapped loci is less than or equal to m. Reads whose multimapped loci exceed m are aligned against the transcriptome (default: -m 50).
-s	Strand-specific mode of the algorithm (default: -s yes).
-cd	Minimum number of unannotated read abundances per cluster (default: -cd 5).
-cdi	Clusters of unannotated reads will be merged if the distance between them is less than or equal to cdi (default: -cdi 50).
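
The effect of -cdi and -cd can be pictured with a small sketch. This is an illustration only, not Manatee's implementation: it assumes clusters lie on one chromosome and strand, that distance is measured from the end of one cluster to the start of the next, and that the -cd threshold is applied after merging. The function names are hypothetical.

```python
def merge_clusters(clusters, cdi=50):
    """Merge (start, end, reads) clusters whose gap is at most cdi."""
    merged = []
    for start, end, reads in sorted(clusters):
        # Assumption: distance = start of this cluster - end of the previous one.
        if merged and start - merged[-1][1] <= cdi:
            prev_start, prev_end, prev_reads = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_reads + reads)
        else:
            merged.append((start, end, reads))
    return merged


def filter_clusters(clusters, cd=5):
    """Keep clusters supported by at least cd reads."""
    return [cluster for cluster in clusters if cluster[2] >= cd]


clusters = [(100, 120, 3), (150, 170, 4), (300, 320, 2)]
kept = filter_clusters(merge_clusters(clusters, cdi=50), cd=5)
```

In this example the first two clusters are 30 bases apart and are merged into one cluster with 7 reads; the third cluster is 130 bases away, stays separate, and is dropped because it has fewer than 5 reads.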

OUTPUT

A successful run will produce the following three output files in the output directory:

<inputName>_Manatee_counts.tsv

<inputName>_Manatee_clusters.tsv

<inputName>_Manatee_isomirs.tsv

Depending on the input, <inputName>_Manatee_clusters.tsv might not be generated.
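
A run can be checked for completeness by looking for these files. The sketch below uses hypothetical helper names and takes the input name as an argument, since how Manatee derives it from the input file is not specified here. The clusters file is treated as optional.

```python
from pathlib import Path


def expected_outputs(out_dir, input_name):
    """Map each output type to its expected path in the output directory."""
    base = Path(out_dir)
    return {
        "counts": base / f"{input_name}_Manatee_counts.tsv",
        "clusters": base / f"{input_name}_Manatee_clusters.tsv",
        "isomirs": base / f"{input_name}_Manatee_isomirs.tsv",
    }


def missing_required(out_dir, input_name):
    """Return the required output types that are absent (clusters is optional)."""
    outputs = expected_outputs(out_dir, input_name)
    return [kind for kind in ("counts", "isomirs") if not outputs[kind].is_file()]
```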

ADDITIONAL COMMENTS

Input data should be trimmed for adapters and barcodes before running Manatee. Reads that are too short or have low sequencing quality should also be discarded from the input.
An example of an annotation file in GTF format compatible with Manatee is included in the 'annotation' branch.
Genome and transcriptome Bowtie index files should be built using Bowtie 1. Bowtie 1 is included in the Manatee package.
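
Bowtie 1 indexes are built with bowtie-build, which takes a reference FASTA file and an index basename. The sketch below (hypothetical helper names) assembles that command and recovers a basename from an index file name, as expected by -index and -t_index. It assumes bowtie-build is reachable under that name; adjust the path if you use the copy shipped in the package.

```python
import re

# Bowtie 1 index files end in .1.ebwt to .4.ebwt, .rev.1.ebwt or .rev.2.ebwt
# (.ebwtl for large indexes).
_EBWT_SUFFIX = re.compile(r"(\.rev\.[12]|\.[1-4])\.ebwtl?$")


def index_basename(index_file):
    """Strip the Bowtie 1 index suffix from an index file name."""
    return _EBWT_SUFFIX.sub("", index_file)


def bowtie_build_command(fasta, basename, bowtie_build="bowtie-build"):
    """Return the argument list for: bowtie-build <reference_in> <ebwt_base>."""
    # Assumption: `bowtie_build` names an executable on PATH or a full path to it.
    return [bowtie_build, fasta, basename]
```

The returned list can be passed to `subprocess.run(..., check=True)` to build the index.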

FUNDING

The "ELIXIR-GR: Managing and Analysing Life Sciences Data (MIS: 5002780)" project is co-financed by Greece and the European Union - European Regional Development Fund.
