PAVFinder is a Python package that detects structural variants from de novo assemblies (e.g. those produced by RNA-Bloom, ABySS, or Trans-ABySS). It can analyse both genome and transcriptome assemblies:
pavfinder genome
pavfinder fusion
pavfinder splice
PAVFinder infers variants from non-contiguous (split or gapped) contig sequence alignments to the reference genome. Assemblies can be aligned to the reference genome (c2g alignment) using bwa mem (genome) or gmap (transcriptome). Read support for events can be gathered by aligning reads to the assembly using bwa mem (r2c alignment).
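To make "split or gapped" concrete, here is a minimal sketch of how candidate events could be read off the CIGAR string of a contig-to-genome alignment. It is an illustration only and not PAVFinder's actual implementation: the function names (`parse_cigar`, `gapped_events`, `clipped_ends`) and the size thresholds are hypothetical, and only standard SAM CIGAR semantics are assumed.

```python
import re
from typing import List, Tuple

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a SAM CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def gapped_events(ref_start: int, cigar: str, min_size: int = 10):
    """Report large insertions/deletions inside a single (gapped) alignment.

    ref_start is the 1-based leftmost reference position (SAM POS).
    Returns (type, ref_pos, size) tuples; for a deletion ref_pos is the first
    deleted reference base, for an insertion it is the reference base that
    follows the inserted sequence. min_size is a hypothetical threshold.
    """
    events = []
    ref_pos = ref_start
    for length, op in parse_cigar(cigar):
        if op == "D" and length >= min_size:
            events.append(("DEL", ref_pos, length))
        elif op == "I" and length >= min_size:
            events.append(("INS", ref_pos, length))
        if op in "MDN=X":  # operations that consume reference bases
            ref_pos += length
    return events


def clipped_ends(cigar: str, min_clip: int = 20) -> Tuple[bool, bool]:
    """Flag large soft/hard clips at either end, a hint of a split alignment."""
    ops = parse_cigar(cigar)
    left = ops[0][0] if ops and ops[0][1] in "SH" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] in "SH" else 0
    return left >= min_clip, right >= min_clip


# A contig aligned at position 100 with a 20 bp deletion and a small 5 bp insertion.
print(gapped_events(100, "50M20D30M5I40M"))
# A contig whose first 60 bases did not align here (possible split alignment).
print(clipped_ends("60S90M"))
```

A gapped alignment carries the event inside one alignment record, whereas a split alignment leaves a clipped end whose remaining sequence aligns elsewhere; pairing those partial alignments up is the harder part and is not attempted in this sketch.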
We provide a Targeted-Assembly-Pipeline, TAP, to facilitate transcriptome analysis on selected genes. This requires a multi-index Bloom filter of targeted gene sequences to be created beforehand. Whereas whole-transcriptome analysis with over 100 million read pairs can take more than 24 hours, a targeted analysis of several hundred genes (e.g. COSMIC) can be completed within half an hour. TAP uses Trans-ABySS for transcriptome assembly. TAP2 is the successor of TAP and uses RNA-Bloom for transcriptome assembly.
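The idea of pre-selecting reads for targeted genes with Bloom filters can be sketched as below. This is a simplified stand-in, not the multi-index Bloom filter that TAP actually requires: it keeps one plain Bloom filter per gene, and the class, function names, k-mer size, filter size, and hit threshold are all hypothetical choices for illustration.

```python
import hashlib
from typing import Dict, Iterator, Optional


class BloomFilter:
    """A plain Bloom filter over strings (toy sizes, for illustration only)."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq (upper-cased)."""
    seq = seq.upper()
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filters(genes: Dict[str, str], k: int = 11) -> Dict[str, BloomFilter]:
    """Build one Bloom filter per target gene from its k-mers."""
    filters = {}
    for name, seq in genes.items():
        bf = BloomFilter()
        for km in kmers(seq, k):
            bf.add(km)
        filters[name] = bf
    return filters


def assign_read(read: str, filters: Dict[str, BloomFilter],
                k: int = 11, min_hits: int = 3) -> Optional[str]:
    """Return the gene whose filter matches the most k-mers of the read, if any."""
    best_gene, best_hits = None, 0
    for gene, bf in filters.items():
        hits = sum(1 for km in kmers(read, k) if km in bf)
        if hits > best_hits:
            best_gene, best_hits = gene, hits
    return best_gene if best_hits >= min_hits else None


# Made-up sequences standing in for target genes.
genes = {
    "geneA": "ACGTTGCATGCCGATAGCTAGCTAGGATCCGATCGATTAGC",
    "geneB": "TTTTGGGGCCCCAAAATTTGGGCCCAAATTGGCCAATGCAAGT",
}
filters = build_filters(genes)
print(assign_read(genes["geneA"][5:30], filters))
```

Only reads assigned to a target gene would then go on to assembly, which is why a targeted run can be so much faster than assembling the whole transcriptome. Bloom filters can return false positives, so a hit threshold such as `min_hits` is a common safeguard.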
We also provide a pipeline for gene fusion detection in RNA-seq data, Fusion-Bloom, which couples PAVFinder with RNA-Bloom. We demonstrated that it has higher sensitivity and specificity than most state-of-the-art fusion callers.
See INSTALL.md
See USAGE.md
See pavfinder/test for a small dataset to test our transcriptome (TAP, TAP2, and Fusion-Bloom) and genome workflows.
Readman Chiu, Ka Ming Nip, Justin Chu and Inanc Birol. TAP: a targeted clinical genomics pipeline for detecting transcript variants using RNA-seq data. BMC Med Genomics (2018) 11:79 https://doi.org/10.1186/s12920-018-0402-6
Readman Chiu, Ka Ming Nip, Inanc Birol. Fusion-Bloom: fusion detection in assembled transcriptomes. Bioinformatics (2019) btz902 https://doi.org/10.1093/bioinformatics/btz902