1. Prototype? What do you mean?
3. Compilation and installation.
4. Running mmp
mmp
is a proper mapper, we use it in the lab and you are more than
welcome to use it as well. When we say ‘prototype’, we mean that
our purpose was not really to secure a stable user base, but rather to
show that faithful mapping is achievable.
Below are the main things to consider before using mmp
:
1. It works only with single-end reads.
2. The whole read is aligned, there is no trimming.
3. There is no way to change the sensitivity.
Otherwise, mmp
is pretty good. The most important feature is that it
is approximately faithful, so you should be able to trust the mapping
quality. You will notice that some reads have a mapping quality over 150.
It is not a bug, we mean it: the location of those reads is extremely
likely to be correct.
Finally, mmp
is quite fast but it uses a little more memory than
many mainstream mappers.
You can find more information about mmp
in the
bioRxiv preprint
that describes the mapping strategy.
To install mmp
, open a terminal and clone this git repository:
git clone https://github.com/gui11aume/mmp
The files should be downloaded in a directory named mmp
. Go to this new
directory and use make
to compile:
cd mmp
make
This will create a binary file mmp
.
mmp
mmp
runs on Linux.
You first have to index a genome file before you can map reads in it. If
the genome of interest is saved in a file called genome.fasta
, you
index it with the following command:
./mmp --index genome.fasta
This creates a bunch of files that all start with genome.fasta.
. This
is the index.
Once the genome is indexed, you can start mapping reads in it. For this
you must know the approximate error rate of the sequencer. By default,
mmp
assumes that it is 1%.
If your reads are in a file called reads.fastq
, you map them with the
following command:
./mmp genome.fasta reads.fastq
If you want to use more than one core to speed up the mapping, you can
use the option -t
. For instance, if you want to use 4 cores, you would
use the following command:
./mmp -t 4 genome.fasta reads.fastq
This produces a sam file printed on stdout
. You can change the error
rate with the option -e
. For instance, if you know that it is 2%,
then the command should be:
./mmp -e 0.02 genome.fasta reads.fastq
mmp is also compatible with gzipped input files (requires zcat
in your $PATH
):
./mmp -e 0.02 genome.fasta reads.fastq.gz