timbitz / Whippet.jl

Lightweight and Fast; RNA-seq quantification at the event-level
MIT License

De novo indexing using `--bam` capability #69

timbitz closed 6 years ago

timbitz commented 6 years ago

This is a fairly major pull request whose changes primarily affect Whippet's indexing capabilities. This branch lets a user supplement standard annotation files with additional sources of splice sites for Whippet to use when indexing and building CSG nodes. As a result, Whippet can produce more comprehensive CSGs in poorly annotated species by drawing unannotated splice sites (and therefore also unannotated exons) from read alignments in a pre-existing BAM file created by another program capable of novel spliced-read alignment.
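To illustrate where unannotated splice sites come from in a BAM file, here is a minimal sketch (not Whippet's actual implementation) that walks one alignment's CIGAR string and reports each `N` (skipped region) operation as an intron. The function name `splice_sites` is hypothetical, as is the output convention of printing the last exonic base before the intron and the first exonic base after it, both as 1-based reference coordinates.

```shell
# Hypothetical helper: print one "donor acceptor" pair per intron in an alignment.
# $1 = 1-based leftmost reference position (SAM POS), $2 = CIGAR string.
splice_sites() {
    awk -v pos="$1" -v cigar="$2" 'BEGIN {
        # Consume the CIGAR one <length><operation> pair at a time.
        while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
            len   = substr(cigar, 1, RLENGTH - 1) + 0
            op    = substr(cigar, RLENGTH, 1)
            cigar = substr(cigar, RLENGTH + 1)
            if (op == "N") {
                # Skipped reference region: last base of the upstream exon,
                # first base of the downstream exon.
                print pos - 1, pos + len
                pos += len
            } else if (op ~ /[MD=X]/) {
                # These operations consume reference bases.
                pos += len
            }
            # I, S, H and P do not advance the reference position.
        }
    }'
}

splice_sites 100 50M200N50M
```

In SAM-formatted output from `samtools view`, the position and CIGAR string are fields 4 and 6, so the same function could be applied to each alignment record in turn.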

Specific changes are:

Prior to merging, testing the PR should be possible with:

julia> Pkg.update()
julia> Pkg.checkout("Whippet", "de-novo-index")

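Sorted and indexed BAM files can be created with samtools as follows (the `sort` call uses the legacy output-prefix syntax of samtools 0.1.x; samtools 1.3 and later expect `samtools sort -o filename.sort.bam filename.bam` instead):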

$ samtools sort filename.bam filename.sort
$ samtools rmdup -S filename.sort.bam filename.sort.rmdup.bam
$ samtools index filename.sort.rmdup.bam
$ ls filename.sort.rmdup.bam*
filename.sort.rmdup.bam        filename.sort.rmdup.bam.bai

Then build an index as usual, adding the `--bam filename.sort.rmdup.bam` parameter. Whippet also looks for the matching index file `filename.sort.rmdup.bam.bai`, so it must sit alongside the BAM file.
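Because the index build expects both the BAM file and its `.bai` companion, a small pre-flight check can catch a missing file before a long indexing run starts. The following is only a sketch; the function name `bam_ready` is hypothetical and is not part of Whippet or samtools.

```shell
# Hypothetical pre-flight check: succeed only if the BAM file and its
# .bai index both exist and are non-empty.
bam_ready() {
    bam=$1
    if [ ! -s "$bam" ]; then
        echo "missing or empty BAM file: $bam" >&2
        return 1
    fi
    if [ ! -s "$bam.bai" ]; then
        echo "missing index: $bam.bai (try: samtools index $bam)" >&2
        return 1
    fi
    return 0
}
```

A call such as `bam_ready filename.sort.rmdup.bam || exit 1` could then be placed immediately before the index build.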

codecov[bot] commented 6 years ago

Codecov Report

Merging #69 into master will increase coverage by 4.52%. The diff coverage is 75.36%.

@@            Coverage Diff             @@
##           master      #69      +/-   ##
==========================================
+ Coverage   54.27%   58.79%   +4.52%     
==========================================
  Files          18       19       +1     
  Lines        2320     2378      +58     
==========================================
+ Hits         1259     1398     +139     
+ Misses       1061      980      -81

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/edges.jl | 72.58% <ø> (ø) | :arrow_up: |
| src/index.jl | 10.29% <0%> (-0.64%) | :arrow_down: |
| src/quant.jl | 59.32% <0%> (+2.62%) | :arrow_up: |
| src/graph.jl | 93.69% <100%> (+0.18%) | :arrow_up: |
| src/types.jl | 91.67% <100%> (+4.17%) | :arrow_up: |
| src/align.jl | 71.63% <28.57%> (-0.5%) | :arrow_down: |
| src/io.jl | 59.24% <72.5%> (+26.18%) | :arrow_up: |
| src/bam.jl | 92.31% <92.31%> (ø) | |
| src/refset.jl | 92% <94.74%> (-1.37%) | :arrow_down: |

... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update bed4f75...a7744b4.

timbitz commented 6 years ago