mcvickerlab / WASP2

Allele-specific pipeline for unbiased read mapping (WIP), QTL discovery (WIP), and allelic-imbalance analysis
WASP2: Allele-specific pipeline for unbiased read mapping and allelic-imbalance analysis



The recommended installation is through conda, using the provided environment file:

conda env create -f environment.yml


Allelic Imbalance Analysis

The analysis pipeline currently consists of two tools: Count and Analysis.


Count Tool

Process allele-specific read counts per SNP. Sample names can be provided to filter out non-heterozygous SNPs. Genes and ATAC-seq peaks can also be provided to restrict counting to SNPs that overlap regions of interest. Providing both samples and regions is highly recommended for allelic-imbalance analysis.


python WASP2/src/counting count-variants [BAM] [VCF] {OPTIONS}
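
To illustrate the filtering described above, here is a minimal Python sketch that keeps only heterozygous SNPs for one sample and tallies reference and alternate bases at each of them. It is not the WASP2 implementation: the function names and the toy `snps` and `pileup` data are hypothetical, and genotypes are assumed to be diploid VCF-style `GT` strings.

```python
def is_heterozygous(gt: str) -> bool:
    """Return True for a diploid GT string such as '0/1' or '1|0'."""
    alleles = gt.replace("|", "/").split("/")
    # Missing calls ('.') are never treated as heterozygous.
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def count_alleles(bases: str, ref: str, alt: str) -> tuple:
    """Tally the read bases observed at one SNP as (ref, alt, other)."""
    ref_n = sum(1 for b in bases if b == ref)
    alt_n = sum(1 for b in bases if b == alt)
    return ref_n, alt_n, len(bases) - ref_n - alt_n


# Hypothetical toy input: SNP records and the read bases covering each position.
snps = [
    {"pos": 100, "ref": "A", "alt": "G", "gt": "0/1"},
    {"pos": 250, "ref": "C", "alt": "T", "gt": "1/1"},  # homozygous, filtered out
    {"pos": 400, "ref": "T", "alt": "C", "gt": "1|0"},
]
pileup = {100: "AAGGA", 250: "TTTT", 400: "TCN"}

counts = {
    s["pos"]: count_alleles(pileup[s["pos"]], s["ref"], s["alt"])
    for s in snps
    if is_heterozygous(s["gt"])
}
print(counts)
```

Homozygous sites carry no information about allelic imbalance, which is why filtering by sample genotype is recommended before analysis.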

Analysis Tool

Analyzes allelic imbalance per ATAC peak, given allelic count data.


python WASP2/src/analysis find-imbalance [COUNTS] {OPTIONS}
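
The idea of testing a region for allelic imbalance can be sketched with a simple exact binomial test against a 50/50 null. This is a deliberately simplified stand-in and should not be read as the statistical model used by `find-imbalance`; the function name and the `peak_counts` data are hypothetical, and the counts are assumed to be already aggregated per peak.

```python
from math import comb


def binomial_imbalance_p(ref: int, alt: int) -> float:
    """Two-sided exact binomial p-value for ref vs. alt counts under p = 0.5."""
    n = ref + alt
    if n == 0:
        return 1.0
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n,
    # so outcomes can be ranked by their integer weights comb(n, k).
    observed = comb(n, ref)
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n


# Hypothetical per-peak (ref, alt) read counts.
peak_counts = {"peak_1": (10, 0), "peak_2": (5, 5)}

for peak, (ref, alt) in peak_counts.items():
    print(peak, ref, alt, binomial_imbalance_p(ref, alt))
```

A plain binomial test ignores overdispersion in sequencing data, so it tends to be anti-conservative; it is shown here only to make the input and output of the analysis step concrete.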

Unbiased Allele-Specific Read Mapping

Mappability filtering pipeline for correcting allelic mapping biases. First, reads are mapped normally using a mapper chosen by the user (output as BAM). Then, mapped reads that overlap single nucleotide polymorphisms (SNPs) are identified. For each read that overlaps a SNP, the allele carried by the read is swapped for the other allele and the read is re-mapped. Re-mapped reads that fail to map to exactly the same location in the genome are discarded.

Step 1: Create Reads for Remapping

This step identifies reads that overlap SNPs and creates copies of those reads with swapped alleles.


python WASP2/src/mapping make-reads [BAM] [VCF] {OPTIONS}
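
The allele swap performed in this step can be pictured with the following minimal sketch. It assumes an ungapped alignment (no indels or clipping) and that `read_start` and `snp_pos` use the same coordinate system; the function name `swap_allele` is hypothetical and this is not the code used by `make-reads`.

```python
from typing import Optional


def swap_allele(seq: str, read_start: int, snp_pos: int, ref: str, alt: str) -> Optional[str]:
    """Return a copy of the read with the SNP base flipped to the other allele.

    Returns None if the SNP falls outside the read or the read base
    matches neither allele.
    """
    offset = snp_pos - read_start
    if not 0 <= offset < len(seq):
        return None
    base = seq[offset]
    if base == ref:
        new_base = alt
    elif base == alt:
        new_base = ref
    else:
        return None
    return seq[:offset] + new_base + seq[offset + 1:]


# A read starting at position 100 that carries the reference allele 'T' at position 103.
print(swap_allele("ACGTACGT", 100, 103, "T", "C"))
```

The swapped copies are what get written out for remapping in Step 2; reads that do not overlap a SNP need no remapping.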

Step 2: Remap Reads

Remap the FASTQ reads using the mapping software of your choice, with the same settings as the original mapping. For example, with bwa:


bwa mem -M "BWAIndex/genome.fa" "${prefix}_swapped_alleles_r1.fq" "${prefix}_swapped_alleles_r2.fq" | samtools view -b -h -F 4 - | samtools sort -o "${prefix}_remapped.bam" -
samtools index "${prefix}_remapped.bam"

Step 3: Filter Reads that Fail to Remap

Identify and remove reads that failed to remap to the same position. This creates an allelically unbiased BAM file.


python WASP2/src/mapping filter-remapped "${prefix}_remapped.bam" --json "${prefix}_wasp_data_files.json"

Alternatively, the to-remap and keep BAM files can be passed explicitly instead of the JSON file:


python WASP2/src/mapping filter-remapped "${prefix}_remapped.bam" "${prefix}_to_remap.bam" "${prefix}_keep.bam"
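
The filtering rule can be summarised with a small sketch: a read survives only if every swapped copy that was created for it remaps to exactly the original location. The data structures and the name `reads_to_keep` are hypothetical, and real alignments would be read from BAM files rather than from Python tuples.

```python
from collections import Counter


def reads_to_keep(original, remapped, n_copies):
    """Return the names of reads whose swapped copies all remapped in place.

    original: {read_name: (chrom, pos)} from the first mapping
    remapped: iterable of (read_name, chrom, pos) for the swapped copies
    n_copies: {read_name: number of swapped copies that were generated}
    """
    in_place = Counter()
    moved = set()
    for name, chrom, pos in remapped:
        if original.get(name) == (chrom, pos):
            in_place[name] += 1
        else:
            moved.add(name)
    # Copies that failed to map at all are absent from `remapped`,
    # so the in-place count must equal the number of copies generated.
    return {
        name for name, n in n_copies.items()
        if name not in moved and in_place[name] == n
    }


original = {"r1": ("chr1", 100), "r2": ("chr1", 200), "r3": ("chr2", 50)}
n_copies = {"r1": 1, "r2": 2, "r3": 1}
remapped = [
    ("r1", "chr1", 100),  # same place: keep
    ("r2", "chr1", 200),
    ("r2", "chr1", 950),  # one copy moved: discard r2
    # r3's copy did not map at all: discard r3
]
keep = reads_to_keep(original, remapped, n_copies)
print(keep)
```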

Single-Cell Allelic Counts

Process allele-specific read counts for single-cell datasets. Counts are output as an AnnData object containing a cell x SNP count matrix.


python WASP2/src/counting count-variants-sc [BAM] [VCF] [BARCODES] {OPTIONS}
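
To make the shape of the output concrete, the sketch below builds a cell x SNP count matrix for one allele from a list of per-read observations. It uses plain Python lists rather than AnnData so that it runs without extra dependencies; the function name, barcodes, and SNP identifiers are all hypothetical.

```python
def build_count_matrix(observations, barcodes, snps, allele):
    """Build a cell x SNP matrix of read counts for one allele ('ref' or 'alt').

    observations: iterable of (barcode, snp_id, allele) tuples, one per read
    barcodes:     ordered list of cell barcodes (matrix rows)
    snps:         ordered list of SNP identifiers (matrix columns)
    """
    row = {b: i for i, b in enumerate(barcodes)}
    col = {s: j for j, s in enumerate(snps)}
    matrix = [[0] * len(snps) for _ in barcodes]
    for barcode, snp_id, observed in observations:
        # Reads from unlisted barcodes or SNPs are ignored.
        if observed == allele and barcode in row and snp_id in col:
            matrix[row[barcode]][col[snp_id]] += 1
    return matrix


barcodes = ["AAAC", "AAAG"]
snps = ["chr1:100", "chr1:400"]
observations = [
    ("AAAC", "chr1:100", "ref"),
    ("AAAC", "chr1:100", "alt"),
    ("AAAC", "chr1:100", "ref"),
    ("AAAG", "chr1:400", "alt"),
    ("TTTT", "chr1:100", "ref"),  # barcode not in the barcode list
]
ref_matrix = build_count_matrix(observations, barcodes, snps, "ref")
alt_matrix = build_count_matrix(observations, barcodes, snps, "alt")
print(ref_matrix, alt_matrix)
```

In practice such matrices are very sparse, which is one reason a dedicated container with sparse-matrix support is a natural fit for the real output.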

Single-Cell Allelic Imbalance

Estimate allele-specific chromatin accessibility using single-cell allelic counts. Allelic imbalance is estimated on a per-cell-type basis.


python WASP2/src/counting find-imbalance-sc [COUNTS] [BARCODE_MAP] {OPTIONS}
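
Estimating imbalance per cell type implies pooling the counts of all cells that share a label. A minimal sketch of that pooling step for a single SNP or region is shown below; the name `pool_by_cell_type`, the barcodes, and the cell-type labels are hypothetical, and the barcode map is assumed to be a simple barcode-to-label lookup.

```python
from collections import defaultdict


def pool_by_cell_type(cell_counts, barcode_map):
    """Sum per-cell (ref, alt) counts into per-cell-type totals.

    cell_counts: {barcode: (ref, alt)} for one SNP or region
    barcode_map: {barcode: cell_type}
    """
    pooled = defaultdict(lambda: [0, 0])
    for barcode, (ref, alt) in cell_counts.items():
        cell_type = barcode_map.get(barcode)
        if cell_type is None:
            continue  # cells without a label are skipped
        pooled[cell_type][0] += ref
        pooled[cell_type][1] += alt
    return {cell_type: tuple(total) for cell_type, total in pooled.items()}


cell_counts = {"AAAC": (3, 1), "AAAG": (2, 2), "AAAT": (0, 4), "CCCC": (1, 1)}
barcode_map = {"AAAC": "T_cell", "AAAG": "T_cell", "AAAT": "B_cell"}
pooled = pool_by_cell_type(cell_counts, barcode_map)
print(pooled)
```

Pooling is what makes the sparse per-cell counts usable: individual cells rarely have enough reads at a SNP to test on their own.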

Single-Cell Comparative Imbalance

Compare allelic imbalance between cell types or groups to identify differential imbalance.


python WASP2/src/counting compare-imbalance [COUNTS] [BARCODE_MAP] {OPTIONS}
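
As a rough illustration of what a between-group comparison asks, the sketch below applies a two-sided Fisher's exact test to the pooled ref/alt counts of two groups. Fisher's test is a simple substitute chosen for this example and is not claimed to be the method used by `compare-imbalance`; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. two cell types); columns are ref and alt counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric weights for every table with the same margins.
    weights = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)


# Group 1: 10 ref / 0 alt reads; group 2: 0 ref / 10 alt reads.
p_different = fisher_exact_two_sided(10, 0, 0, 10)
# Both groups balanced: no evidence of differential imbalance.
p_same = fisher_exact_two_sided(5, 5, 5, 5)
print(p_different, p_same)
```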

