Tongdongq/darwin-gpu

This repository contains a GPU implementation of Darwin [1][2], a hardware-friendly DNA aligner. Darwin consists of two parts, D-SOFT and GACT, which correspond to the seed (filtration) and extend stages of a typical seed-and-extend method. D-SOFT (Diagonal-band based Seed Overlapping based Filtration Technique) filters the search space by counting the non-overlapping bases of matching k-mers within a band of diagonals. GACT (Genomic Alignment using Constant Traceback memory) can align reads of arbitrary length using constant memory for the compute-intensive step.
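The D-SOFT filtering idea described above can be illustrated with a minimal Python sketch. It is not the code in this repository: the function name `dsoft_candidates`, the parameters `k`, `band_width` and `threshold`, and the dictionary-based index are assumptions chosen for illustration. For every k-mer hit it adds only the bases that do not overlap the previous hit in the same diagonal band, and it reports the bands whose count reaches the threshold.

```python
from collections import defaultdict


def dsoft_candidates(reference, query, k=4, band_width=8, threshold=6):
    """Return {band: non-overlapping base count} for bands reaching threshold.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    # Index every k-mer of the reference by its start position.
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)

    bp_count = defaultdict(int)  # non-overlapping bases counted per band
    last_hit = {}                # query position of the last hit per band

    # Walk the query left to right so hits in a band arrive in query order.
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], ()):
            band = (i - j) // band_width  # band of diagonals containing i - j
            prev = last_hit.get(band)
            # Bases already counted by the previous hit in this band.
            overlap = 0 if prev is None else max(0, prev + k - j)
            bp_count[band] += k - overlap
            last_hit[band] = j

    return {band: n for band, n in bp_count.items() if n >= threshold}
```

For example, `dsoft_candidates("GGGGGACGTGCAGGGGG", "ACGTGCA")` finds four overlapping 4-mer hits on one diagonal but counts each query base only once, so the band is reported with a count of 7 rather than 16.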

This implementation can run on the CPU only or use the GPU-accelerated version. To choose between individual optimizations, go back to commit e472745e. Compile for the CPU with './z_compile.sh', or with './z_compile.sh GPU' for the GPU version. Other compile options are 'TIME', which measures the CPU and GPU time spent in GACT for the GPU version, and 'NOSCORE', which removes the score calculation; in that case all overlaps are reported with a score of 0.

To allow a more flexible substitution matrix, put back the 'gact_sub_mat' variable in darwin.cpp.
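As a rough illustration of how GACT keeps traceback memory constant, and of where substitution scores enter the computation, here is a simplified Python sketch. It is not the implementation in darwin.cpp: the names `align_tile` and `gact_align`, the `tile_size` and `overlap` parameters, and the fixed match/mismatch/gap scores are assumptions, and each tile uses plain Needleman-Wunsch scoring rather than Darwin's exact scheme. Tiles are aligned from the end of the sequences toward the start, and only the first `tile_size - overlap` bases of each tile's traceback are committed, so the DP matrix never exceeds one tile.

```python
def align_tile(ref_tile, qry_tile, max_steps, match=1, mismatch=-1, gap=-1):
    """Align one tile and trace back from its bottom-right corner.

    Stops once max_steps bases of either sequence are consumed
    (max_steps=None means trace back to the tile's top-left corner).
    Returns (ref_used, qry_used, ops), with ops in reverse order.
    """
    n, m = len(ref_tile), len(qry_tile)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # A substitution matrix lookup would replace this fixed scoring.
        return match if ref_tile[i - 1] == qry_tile[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    i, j, ops = n, m, []
    while (i > 0 or j > 0) and (
            max_steps is None or (n - i < max_steps and m - j < max_steps)):
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            ops.append("M")  # match or mismatch
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            ops.append("D")  # reference base with no query base
            i -= 1
        else:
            ops.append("I")  # query base with no reference base
            j -= 1
    return n - i, m - j, ops


def gact_align(ref, qry, tile_size=8, overlap=2):
    """Tile-by-tile alignment; memory per tile is O(tile_size ** 2)."""
    i_curr, j_curr, ops = len(ref), len(qry), []
    while i_curr > 0 or j_curr > 0:
        i_start = max(0, i_curr - tile_size)
        j_start = max(0, j_curr - tile_size)
        last_tile = i_start == 0 and j_start == 0
        max_steps = None if last_tile else tile_size - overlap
        ref_used, qry_used, tile_ops = align_tile(
            ref[i_start:i_curr], qry[j_start:j_curr], max_steps)
        ops.extend(tile_ops)
        i_curr -= ref_used
        j_curr -= qry_used
    return "".join(reversed(ops))
```

The uncommitted `overlap` region near each tile's top-left corner is recomputed by the next tile, which is what lets a small tile approximate the alignment of the full sequences.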

Usage: ./darwin .fasta .fasta [CPU_THREADS NUM_BLOCKS THREADS_PER_BLOCK]

Reference and reads files should be the same if used for de novo alignment.
CPU_THREADS is the number of CPU threads.
NUM_BLOCKS is the number of GPU blocks each CPU thread launches; it is ignored when only using the CPU.
THREADS_PER_BLOCK is the number of GPU threads per GPU block; it is ignored when only using the CPU.
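To get a feel for how the three run parameters combine, the following Python sketch computes the totals implied by the descriptions above. The helper name `launch_size` is hypothetical, and the calculation assumes every CPU thread launches its full set of blocks; it says nothing about how many of those threads are resident on the GPU at once.

```python
def launch_size(cpu_threads, num_blocks, threads_per_block):
    """Return (total GPU blocks, total GPU threads) for a run configuration."""
    # Each CPU thread launches num_blocks GPU blocks.
    total_blocks = cpu_threads * num_blocks
    # Each GPU block contains threads_per_block GPU threads.
    total_threads = total_blocks * threads_per_block
    return total_blocks, total_threads


print(launch_size(8, 32, 64))  # prints (256, 16384)
```

With the arguments 8 32 64, the 8 CPU threads launch 256 GPU blocks in total, which hold 16384 GPU threads.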

For 50 MB of PacBio human data, taken from the 54x dataset, the run configuration 8 32 64 (CPU_THREADS NUM_BLOCKS THREADS_PER_BLOCK) was found to be the best. The included reads.fasta is a 10x E. coli dataset generated by PBSIM. Each read's name contains its origin in the genome and its read length; these are used by the measure_sensitivity_PBSIM.py script.

The Makefile assumes Compute Capability 3.5.

Typical run:
./z_compile.sh GPU
./run.sh 8 32 64
cat darwin.*.out | sort | uniq > out.darwin
./measure_sensitivity_PBSIM.py

[1] Darwin: A Hardware-acceleration Framework for Genomic Sequence Alignment https://www.biorxiv.org/content/early/2017/01/24/092171

[2] Darwin: A Genomics Co-processor Provides up to 15,000X Acceleration on Long Read Assembly https://dl.acm.org/citation.cfm?id=3173193
