timbitz / Whippet.jl

Lightweight and Fast; RNA-seq quantification at the event-level
MIT License

segfault using whippet-quant.jl #125

Closed tsgouros closed 2 years ago

tsgouros commented 2 years ago

When I use --mismatches 10 or --mismatches 20, I get a segfault in reads.jl. It doesn't happen at the same point of execution from run to run on the same data, but it happens on accesses to fwd_aln. I have modified reads.jl to write a fastq file of data from a specific gene (see #121), but that part seems to work fine. If I don't specify the --mismatches option, it runs fine.

You can see my modified source file at https://github.com/tsgouros/Whippet.jl/blob/count-errors/src/reads.jl The error is reported either on line 158 (original code) or sometimes on line 173 (my code).

The problem is in process_paired_reads!(). I did something similar with unpaired reads a couple of months ago and had no such problem. This is Julia 1.6.2 on Linux 4.18.0.

A traceback is below. Any ideas?

signal (11): Segmentation fault
in expression starting at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/bin/whippet-quant.jl:185
gc_mark_loop at /buildworker/worker/package_linux64/build/src/gc.c:2521
_jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3033
jl_gc_collect at /buildworker/worker/package_linux64/build/src/gc.c:3240
maybe_collect at /buildworker/worker/package_linux64/build/src/gc.c:879 [inlined]
jl_gc_pool_alloc at /buildworker/worker/package_linux64/build/src/gc.c:1203
unknown function (ip: 0x7fa958fa4683)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
#ungapped_rev_extend#47 at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/align.jl:509
ungapped_rev_extend##kw at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/align.jl:427 [inlined]
#_ungapped_align#43 at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/align.jl:219 [inlined]
_ungapped_align##kw at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/align.jl:212
#ungapped_align#61 at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/paired.jl:80
ungapped_align at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/paired.jl:44 [inlined]
#process_paired_reads!#60 at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/reads.jl:158
process_paired_reads!##kw at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/reads.jl:132
unknown function (ip: 0x7fa958fa0027)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
macro expansion at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/src/timer.jl:5 [inlined]
main at /home/tsgouros@lsmaster.lifespan.org/Whippet.jl/bin/whippet-quant.jl:143
unknown function (ip: 0x7fa958f4820c)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:115
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:204
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:435
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:670
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:877
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:825
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:929
eval at ./boot.jl:360 [inlined]
include_string at ./loading.jl:1116
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
_include at ./loading.jl:1170
include at ./Base.jl:386
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
exec_options at ./client.jl:285
_start at ./client.jl:485
jfptr__start_34281.clone_1 at /usr/bin/julia-1.6.2/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
true_main at /buildworker/worker/package_linux64/build/src/jlapi.c:560
repl_entrypoint at /buildworker/worker/package_linux64/build/src/jlapi.c:702
main at julia (unknown line)
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x4007d8)
Allocations: 4870159087 (Pool: 4870130175; Big: 28912); GC: 254

timbitz commented 2 years ago

Hey @tsgouros, I was actually a little perplexed at first. Segfaults in Julia should be nonexistent or rare; for me it usually means the cluster killed my job for using too much memory. But if you look at the code I wrote, there is an @inbounds macro, which removes normal bounds checking for the sake of efficiency. I probably felt sure that all the code in that block was going to be in bounds. You, however, are working within that block and are likely trying to access an array index, based on the mismatch count, that doesn't exist. Can you try removing the macro and re-running? My guess is you'll get an out-of-bounds error.
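
To illustrate the difference the macro makes, here is a minimal sketch. The names tally_checked!, tally_unchecked! and counts are hypothetical stand-ins, not Whippet's actual code; the assumption is only that something is being indexed by mismatch count. With bounds checking on, an out-of-range index raises a BoundsError; under @inbounds the same access is undefined behavior and may corrupt memory, which can show up later as a crash inside the garbage collector, as in the traceback above.

```julia
# Hypothetical tally indexed by mismatch count (not Whippet's real data structure).
function tally_checked!(counts::Vector{Int}, mismatches::Int)
    # Julia arrays are 1-based, so a count of k mismatches lives at index k + 1.
    # This access is bounds-checked and throws BoundsError when out of range.
    counts[mismatches + 1] += 1
    return counts
end

function tally_unchecked!(counts::Vector{Int}, mismatches::Int)
    # @inbounds elides the check. An out-of-range index here is undefined
    # behavior, so this function is deliberately never called with one below.
    @inbounds counts[mismatches + 1] += 1
    return counts
end

max_mismatches = 20
counts = zeros(Int, max_mismatches + 1)   # slots for 0..20 mismatches

tally_checked!(counts, 20)                # fine: index 21 exists

# A count of 21 needs index 22, which does not exist.
err = try
    tally_checked!(counts, 21)
    nothing
catch e
    e
end

println(err isa BoundsError)
```

As an alternative to editing the source, starting Julia with the --check-bounds=yes option forces bounds checking even inside @inbounds blocks, which should surface the same error.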

tsgouros commented 2 years ago

Your guess seems to be correct, thank you. But if I specify "--mismatches 20" on the command line, the mismatch field of the score object apparently sometimes comes back as 21. Not sure what that would imply.

timbitz commented 2 years ago

Ahh yeah, I think it implies that I set the alignment-breaking criterion as mismatches > threshold, so 21 > 20 ends the alignment and returns.
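
A rough sketch of why that stopping rule can report threshold + 1. The function count_mismatches and its arguments are hypothetical and are not taken from align.jl; the only thing assumed from the thread is the "mismatches > threshold" break condition. Because the check happens after the count is incremented, the loop exits holding the first value that exceeds the threshold.

```julia
# Hypothetical extension loop: compare two sequences position by position
# and stop as soon as the mismatch count exceeds the threshold.
function count_mismatches(query::AbstractString, reference::AbstractString; threshold::Int)
    mismatches = 0
    for (q, r) in zip(query, reference)
        if q != r
            mismatches += 1
        end
        # Break only once the count is strictly greater than the threshold,
        # so the value returned can be threshold + 1.
        mismatches > threshold && break
    end
    return mismatches
end

query     = "A"^30   # every position differs from the reference
reference = "C"^30

worst = count_mismatches(query, reference; threshold = 20)
println(worst)
```

Under that assumption, anything indexed by the returned mismatch count needs room for the values 0 through threshold + 1, or the count needs to be clamped before it is used as an index.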

tsgouros commented 2 years ago

Aha! Thanks.