Overview:
LoMA is a localized assembly tool for long reads.
It starts by interpreting the all-to-all alignment of reads from a region of interest. After filtering, LoMA lays out the long reads and divides the layout into multiple blocks, building a partial consensus sequence for each block. The partial consensus sequences are then concatenated into a single consensus sequence.
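The block-wise idea described above can be illustrated with a toy sketch. This is not LoMA's implementation: it assumes the reads are already aligned to equal length (with `-` for gaps), uses a simple majority vote per column, and uses fixed-width blocks. The function names and the `block_size` parameter are hypothetical and exist only for illustration.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Majority-vote consensus over equal-length, pre-aligned reads ('-' = gap)."""
    consensus = []
    for column in zip(*aligned_reads):
        base, _ = Counter(column).most_common(1)[0]
        if base != "-":  # drop columns where a gap wins the vote
            consensus.append(base)
    return "".join(consensus)


def blockwise_consensus(aligned_reads, block_size):
    """Split the layout into blocks, build a partial consensus per block, concatenate."""
    length = len(aligned_reads[0])
    partials = []
    for start in range(0, length, block_size):
        block = [read[start:start + block_size] for read in aligned_reads]
        partials.append(column_consensus(block))
    return "".join(partials)


# Toy layout: three pre-aligned reads covering the same region.
layout = [
    "ACGTAC-TTG",
    "ACGTACGTTG",
    "ACCTACGTTG",
]
result = blockwise_consensus(layout, block_size=4)
print(result)
```

Working on blocks keeps each consensus step small, which is the motivation for dividing the layout before concatenating the partial results.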
Note:
The current version is optimized for ONT reads although it can also be used for PacBio data.
The following instructions assume a Linux-like OS. Windows users are advised to run the commands under WSL.
LoMA requires minimap2 (Heng Li) and MAFFT (Katoh et al.); please install both beforehand.
Typically, LoMA takes 10-1000 reads.
Input:
fastq file
Output:
fasta file (.cs)
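As a quick sanity check before running LoMA, the number of reads in a fastq file can be compared against the typical range of 10-1000 reads. The sketch below assumes unwrapped fastq records (exactly four lines per read); the function names are hypothetical, and the range is only the typical one stated above, not a hard limit.

```python
import io


def count_fastq_reads(handle):
    """Count records in a fastq stream, assuming four lines per record."""
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("fastq line count is not a multiple of 4")
    return n_lines // 4


def within_typical_range(n_reads, low=10, high=1000):
    """Return True if the read count lies in the typical 10-1000 range."""
    return low <= n_reads <= high


# In-memory example; for a real file use: with open(path) as handle: ...
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n")
n_reads = count_fastq_reads(example)
print(n_reads, within_typical_range(n_reads))
```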
Users can download all source files from "Releases". Please download the latest version v1.1.3.
Then decompress the archive and move into the loma directory.
$ cd loma
(case 1) You can run LoMA after executing SETUP.sh, which sets up the path.
$ sh SETUP.sh
Now you are ready to use the tool. To reconstruct localized genomic regions, run LoMA as follows:
$ loma -I <INPUT> -O <OUTPUT> -H <minimap2> -K <mafft>
(case 2) You can also run LoMA without SETUP.sh by invoking the script directly with sh.
$ sh loma -I <INPUT> -O <OUTPUT> -H <minimap2> -K <mafft>
INPUT is a user-specified directory that should contain the fastq file(s) from the localized regions. Please make sure that INPUT is an absolute path.
OUTPUT is also a user-specified directory, in which three new directories are created: CONSENSUS, dir1, and dir2. The final consensus sequences (CSs) are written to the CONSENSUS directory in fasta format with the .cs extension. Please make sure that OUTPUT is an absolute path as well.
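Once a run has finished, the .cs files in the CONSENSUS directory can be read like any fasta file. The following is a minimal sketch; the helper names, the file name `region1.cs`, and the header line in the mock file are hypothetical and only serve to make the example self-contained.

```python
from pathlib import Path
import tempfile


def read_fasta(path):
    """Parse a fasta file into a {header: sequence} dictionary."""
    records = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = []
            elif name is not None:
                records[name].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


def collect_consensus(output_dir):
    """Read every .cs file found in OUTPUT/CONSENSUS."""
    consensus_dir = Path(output_dir) / "CONSENSUS"
    return {cs.name: read_fasta(cs) for cs in sorted(consensus_dir.glob("*.cs"))}


# Demonstration on a mock OUTPUT directory (not real LoMA output).
with tempfile.TemporaryDirectory() as tmp:
    mock_dir = Path(tmp) / "CONSENSUS"
    mock_dir.mkdir()
    (mock_dir / "region1.cs").write_text(">region1\nACGT\nTTGA\n")
    found = collect_consensus(tmp)
print(found)
```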
The -H and -K options can be omitted if minimap2 and MAFFT are reachable on your path.
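When LoMA is launched from a script, the absolute-path requirement and the optional -H and -K arguments can be handled in one place. The sketch below only builds the argument list and does not run anything. It assumes `loma` is callable as in case 1; the function name and the `/opt/tools/...` locations are placeholders, and the value actually expected for `<minimap2>` and `<mafft>` should be taken from LoMA itself.

```python
import os
import shutil


def build_loma_command(input_dir, output_dir, minimap2=None, mafft=None):
    """Build a LoMA argument list with absolute INPUT and OUTPUT paths."""
    command = ["loma", "-I", os.path.abspath(input_dir),
               "-O", os.path.abspath(output_dir)]
    for flag, tool, explicit in (("-H", "minimap2", minimap2),
                                 ("-K", "mafft", mafft)):
        if explicit is not None:
            # An explicit location was given, so pass it through.
            command += [flag, explicit]
        elif shutil.which(tool) is None:
            # No explicit location and the tool is not reachable.
            raise FileNotFoundError(
                f"{tool} not found on PATH; pass its location via {flag}")
    return command


command = build_loma_command("reads", "results",
                             minimap2="/opt/tools/minimap2",
                             mafft="/opt/tools/mafft")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the paths point at real data.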
LoMA runs with the following parameters:
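-I: <INPUT> Input directory containing the fastq file(s). Must be an absolute path.
-O: <OUTPUT> Output directory. Must be an absolute path.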
-b:
-s:
-h:
-d:
-l: <ont/pb> Data. Nanopore (ONT) or PacBio. (default=ont)
-c:
-r:
-m:
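-H: <minimap2> Path to minimap2. Not necessary if minimap2 is reachable.
-K: <mafft> Path to MAFFT. Not necessary if MAFFT is reachable.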
Requirements:
python >= 3.8
minimap2 >= ver.2.0
MAFFT >= ver.7
numpy (python library)
matplotlib (python library)
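The Python-side requirements can be checked before a run with a few lines of standard-library code. This is a minimal sketch; the function name is hypothetical, and it checks only the Python version and whether the libraries can be found, not their versions or the minimap2 and MAFFT versions.

```python
import importlib.util
import sys


def check_requirements(min_python=(3, 8), modules=("numpy", "matplotlib")):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"python >= {wanted} is required")
    for module in modules:
        if importlib.util.find_spec(module) is None:
            problems.append(f"missing Python library: {module}")
    return problems


for problem in check_requirements():
    print(problem)
```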
Ikemoto, K., Fujimoto, H. & Fujimoto, A. Localized assembly for long reads enables genome-wide analysis of repetitive regions at single-base resolution in human genomes. Hum Genomics 17, 21 (2023). https://doi.org/10.1186/s40246-023-00467-7