This script was written for long-read Oxford Nanopore Technologies direct RNA/cDNA sequencing data. It uses BAM files produced after mapping with minimap2 to the reference transcriptome. It will output a summary file and plots from your aligned reads. This script was used in: https://doi.org/10.1093/nar/gkab1129.
To obtain a BAM file align your FASTQ/FASTA files to the transcriptome with minimap2.
minimap2 -ax map-ont --sam-hit-only transcriptome.fasta reads.fastq | samtools view -bh > aligned_reads.bam
If minimap2 is not run with --sam-hit-only you should remove unmapped reads prior to running BamSlam to avoid slowing it down. You can also input a BAM file output from NanoCount: https://github.com/a-slide/NanoCount.
R (tested with v4.2.2) R packages:
Download/copy the Rscript from this repository and run it from the terminal as follows:
Rscript BamSlam.R [DATA_TYPE] [BAM_FILE] [OUT_PREFIX]
Rscript BamSlam.R rna undiff1_5Y.bam undiff1
DATA_TYPE, enter either: cdna, rna
BAM_FILE, a BAM file of alignments to the transcriptome
OUT_PREF, output file prefix
The script takes approximately 5 minutes per million reads.
Histogram of read coverages (full-length reads cutoff/dashed line = 0.95):
Histogram of known transcript lengths (per distinct transcript in the data):
Histogram of known transcript lengths (per read):
Histogram density plot of known transcript length vs coverage fractions:
Bar chart of the secondary alignments:
Density plot of the read accuracies: