timbitz / Whippet.jl

Lightweight and Fast; RNA-seq quantification at the event-level
MIT License

Error while including unannotated splice-sites #151

Open ldenti opened 1 month ago

ldenti commented 1 month ago

Hi, I am trying to include unannotated splice sites in the index, but Whippet keeps crashing. I prepared the BAM file as suggested in the README (I aligned the reads with STAR).

This is the command I'm using:

julia whippet-index.jl --fasta reference.fa --gtf annotation.gtf --index whippet-index --bam alignments.bam

Here is the log:

Whippet v1.6.2 loading... 
  Activating project at `/home/denti/software/whippet`
 22.716702 seconds.
Loading GTF file: /home/denti/tmp/annotation.gtf
Loading BAM file for random-access: /home/denti/tmp/alignments.bam
ERROR: LoadError: BoundsError(BGZFStreams.Block[BGZFStreams.Block(UInt8[0x1f, 0x8b, 0x08, ..., 0x00, 0x00, 0x00], 13212492, 1, 0, CodecZlib.ZStream(Ptr{UInt8} @0x0000557c4510d09c, 0x00000000, 0x0000000000000000, Ptr{UInt8} @0x0000557c4220c180, 0x00010000, 0x0000000000000000, Ptr{UInt8} @0x0000000000000000, Ptr{Nothing} @0x0000557c432de960, Ptr{Nothing} @0x00007f8ae5038960, Ptr{Nothing} @0x00007f8ae5038970, Ptr{Nothing} @0x0000000000000000, 64, 0x0000000000000001, 0x0000000000000000))], (2,))
Stacktrace:
 [1] error(s::BoundsError)
   @ Base ./error.jl:44
 [2] process_records!(reader::XAM.BAM.Reader{IOStream}, seqname::String, range::UnitRange{Int64}, strand::Bool, exons::IntervalTrees.IntervalTree{UInt32, IntervalTrees.Interval{UInt32}}, known::Vector{Union{}}, oneknown::Bool, novelacc::Dict{UInt32, Int64}, noveldon::Dict{UInt32, Int64})
   @ Whippet /home/denti/software/whippet/src/bam.jl:22
 [3] load_gtf(fh::IOStream; txbool::Bool, suppress::Bool, usebam::Bool, bamreader::Nullable{XAM.BAM.Reader{IOStream}}, bamreads::Int64, bamoneknown::Bool)
   @ Whippet /home/denti/software/whippet/src/refset.jl:236
 [4] macro expansion
   @ /home/denti/software/whippet/src/timer.jl:5 [inlined]
 [5] main()
   @ Main /home/denti/software/whippet/bin/whippet-index.jl:85
 [6] top-level scope
   @ /home/denti/software/whippet/src/timer.jl:5
in expression starting at /home/denti/software/whippet/bin/whippet-index.jl:108

caused by: BoundsError: attempt to access 1-element Vector{BGZFStreams.Block} at index [2]
Stacktrace:
 [1] getindex
   @ ./array.jl:924 [inlined]
 [2] virtualoffset(stream::BGZFStreams.BGZFStream{IOStream})
   @ BGZFStreams ~/.snakemake/conda/a63ba2d19ff9079ced71d8a647183878_/share/julia/packages/BGZFStreams/qApsr/src/bgzfstream.jl:156
 [3] iterate(iter::XAM.BAM.OverlapIterator{IOStream}, state::XAM.BAM.OverlapIteratorState)
   @ XAM.BAM ~/.snakemake/conda/a63ba2d19ff9079ced71d8a647183878_/share/julia/packages/XAM/9nk3g/src/bam/overlap.jl:65
 [4] process_records!(reader::XAM.BAM.Reader{IOStream}, seqname::String, range::UnitRange{Int64}, strand::Bool, exons::IntervalTrees.IntervalTree{UInt32, IntervalTrees.Interval{UInt32}}, known::Vector{Union{}}, oneknown::Bool, novelacc::Dict{UInt32, Int64}, noveldon::Dict{UInt32, Int64})
   @ Whippet /home/denti/software/whippet/src/bam.jl:20
 [5] load_gtf(fh::IOStream; txbool::Bool, suppress::Bool, usebam::Bool, bamreader::Nullable{XAM.BAM.Reader{IOStream}}, bamreads::Int64, bamoneknown::Bool)
   @ Whippet /home/denti/software/whippet/src/refset.jl:236
 [6] macro expansion
   @ /home/denti/software/whippet/src/timer.jl:5 [inlined]
 [7] main()
   @ Main /home/denti/software/whippet/bin/whippet-index.jl:85
 [8] top-level scope
   @ /home/denti/software/whippet/src/timer.jl:5

Here is the full log, including the full array: whippet.log.

I cannot upload the data here because the zip is too big, but I uploaded it to Drive (here is the link).

If I remove the --bam argument, it works.
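
In case it helps narrow things down, below is a minimal sketch (not part of Whippet) of sanity checks that could be run on the BAM before indexing. It rests on an assumption that may well be wrong: that the `BoundsError` in `virtualoffset` is related to a truncated BGZF stream or a missing `.bai` index. The helper name `check_bam` and the file name are purely illustrative; the only format fact relied on is the standard 28-byte empty BGZF block that BAM writers append as an end-of-file marker.

```python
import os

# Standard 28-byte empty BGZF block used as the BAM end-of-file marker.
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)


def check_bam(path):
    """Hypothetical helper: return a list of problems found in a BAM file.

    An empty list means the basic checks passed; it does not prove the
    file is valid, only that it is not obviously truncated or unindexed.
    """
    problems = []
    size = os.path.getsize(path)

    with open(path, "rb") as fh:
        # Every BGZF block starts with the gzip magic bytes 1f 8b 08.
        if fh.read(3) != b"\x1f\x8b\x08":
            problems.append("file does not start with a gzip/BGZF header")

        # A complete BAM ends with the empty BGZF EOF block.
        tail = b""
        if size >= len(BGZF_EOF):
            fh.seek(-len(BGZF_EOF), os.SEEK_END)
            tail = fh.read(len(BGZF_EOF))
        if tail != BGZF_EOF:
            problems.append("missing BGZF EOF marker (file may be truncated)")

    # Random access needs an index: accept alignments.bam.bai or alignments.bai.
    candidates = (path + ".bai", os.path.splitext(path)[0] + ".bai")
    if not any(os.path.exists(c) for c in candidates):
        problems.append("no .bai index found next to the BAM")

    return problems


if __name__ == "__main__":
    # Assumed file name, taken from the command above.
    for problem in check_bam("alignments.bam"):
        print("WARNING:", problem)
```

If this reports nothing for the file, truncation and a missing index can probably be ruled out, and the cause is more likely in how the BAM reader iterates over the overlapping records.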

Thanks, Luca