COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Seg Fault in salmon quant #876

Closed alexdhill closed 1 year ago

alexdhill commented 1 year ago

salmon (bulk mode)

Describe the bug A segmentation fault occurs during a salmon quant call.

To Reproduce Steps and data to reproduce the behavior:

Expected behavior salmon quant should produce a quant.sf file.

Version Info:
### PLEASE UPGRADE SALMON ###
### A newer version of salmon with important bug fixes and improvements is available. ####
###
The newest version, available at https://github.com/COMBINE-lab/salmon/releases
contains new features, improvements, and bug fixes; please upgrade at your
earliest convenience.
###
Sign up for the salmon mailing list to hear about new versions, features and updates at:
https://oceangenomics.com/subscribe
### salmon (selective-alignment-based) v1.9.0
### [ program ] => salmon 
### [ command ] => quant 
### [ index ] => { references/salmon/sel.align.gencode.v39.ucsc.rmsk.salmon.v1.9.0.sidx/ }
### [ libType ] => { A }
### [ mates1 ] => { SRR14506785_output_forward_paired.fq.gz }
### [ mates2 ] => { SRR14506785_output_reverse_paired.fq.gz }
### [ threads ] => { 8 }
### [ validateMappings ] => { }
### [ gcBias ] => { }
### [ seqBias ] => { }
### [ recoverOrphans ] => { }
### [ rangeFactorizationBins ] => { 4 }
### [ output ] => { SRR14506785.salmon.rmsk.out }
### [ writeUnmappedNames ] => { }
Logs will be written to SRR14506785.salmon.rmsk.out/logs
[2023-09-28 04:51:02.450] [jointLog] [info] setting maxHashResizeThreads to 8
[2023-09-28 04:51:02.450] [jointLog] [info] Fragment incompatibility prior below threshold.  Incompatible fragments will be ignored.
[2023-09-28 04:51:02.450] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. Since not explicitly specified, it is being set to 0.65
[2023-09-28 04:51:02.450] [jointLog] [info] Setting consensusSlack to selective-alignment default of 0.35.
[2023-09-28 04:51:02.450] [jointLog] [info] parsing read library format
[2023-09-28 04:51:02.450] [jointLog] [info] There is 1 library.
[2023-09-28 04:51:02.450] [jointLog] [info] Loading pufferfish index
[2023-09-28 04:51:02.451] [jointLog] [info] Loading dense pufferfish index.
-----------------------------------------
| Loading contig table | Time = 31.648 s
-----------------------------------------
size = 45110164
-----------------------------------------
| Loading contig offsets | Time = 96.211 ms
-----------------------------------------
-----------------------------------------
| Loading reference lengths | Time = 9.7567 ms
-----------------------------------------
-----------------------------------------
| Loading mphf table | Time = 754.87 ms
-----------------------------------------
size = 4016010494
Number of ones: 45110163
Number of ones per inventory item: 512
Inventory entries filled: 88106
-----------------------------------------
| Loading contig boundaries | Time = 5.7049 s
-----------------------------------------
size = 4016010494
-----------------------------------------
| Loading sequence | Time = 554.02 ms
-----------------------------------------
size = 2662705604
-----------------------------------------
| Loading positions | Time = 6.1033 s
-----------------------------------------
size = 5024146461
-----------------------------------------
| Loading reference sequence | Time = 658.08 ms
-----------------------------------------
-----------------------------------------
| Loading reference accumulative lengths | Time = 18.506 ms
-----------------------------------------
[2023-09-28 04:51:48.011] [jointLog] [info] done
[2023-09-28 04:51:48.061] [jointLog] [info] Index contained 5352508 targets
[2023-09-28 04:52:00.269] [jointLog] [info] Number of decoys : 182
[2023-09-28 04:52:00.269] [jointLog] [info] First decoy index : 5155176 

[2023-09-28 04:52:03.534] [jointLog] [info] Automatically detected most likely library type as ISR
processed 26000000 fragments
hits: 42435888, hits per frag:  1.63223
/.../work2/c3/593743a22569a97e1d10b2a200b713/.command.sh: line 4:    38 Segmentation fault      (core dumped) /usr/local/bin/salmon quant -i references/salmon/*ucsc.rmsk.salmon*/ --libType A -1 SRR14506785_output_forward_paired.fq.gz -2 SRR14506785_output_reverse_paired.fq.gz -p 8 --validateMappings --gcBias --seqBias --recoverOrphans --rangeFactorizationBins 4 --output SRR14506785.salmon.rmsk.out --writeUnmappedNames

Additional context This is a very recent issue. It seems to be resolved in version 1.10, but I have no insight into where the problem may be occurring or why it is so inconsistent.

rob-p commented 1 year ago

Thanks for reporting this, @alexdhill. There was a bug addressed in version 1.10 (the first bug in the release notes here) that could be related to this. If you do encounter this in any samples under 1.10, please let us know; in that case, keeping track of the offending sample would be the most useful way to dig into it further.

Thanks! Rob
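
One way to keep track of offending samples, as suggested above, is to wrap each run and record the ones that die from a segmentation fault. This is only a minimal sketch and not part of salmon: `is_segfault`, `run_and_record`, and the `failed_samples.txt` file name are hypothetical, and the exit-status convention (a negative signal number from Python's `subprocess`, or 128 plus the signal number when reported through a shell) assumes a POSIX system.

```python
import signal
import subprocess
from pathlib import Path


def is_segfault(returncode: int) -> bool:
    """Return True if an exit status indicates death by SIGSEGV.

    subprocess reports a signal-killed child as -signum; a shell
    wrapper reports it as 128 + signum (139 where SIGSEGV is 11).
    """
    segv = int(signal.SIGSEGV)
    return returncode in (-segv, 128 + segv)


def run_and_record(cmd, sample, failed_log="failed_samples.txt"):
    """Run cmd (a list of arguments); append sample to failed_log on a segfault.

    Returns the command's exit status. The log file name is a
    hypothetical default chosen for this sketch.
    """
    returncode = subprocess.run(cmd).returncode
    if is_segfault(returncode):
        with Path(failed_log).open("a") as fh:
            fh.write(f"{sample}\t{returncode}\n")
    return returncode
```

Calling `run_and_record` with the `salmon quant` argument list from the report and the sample name `SRR14506785` would leave one line per crashed sample in the log file, which makes inconsistent failures easier to compare across runs.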

alexdhill commented 1 year ago

Good to hear. I checked the release log, but I wasn't able to confirm whether I was using the bugged conda build, since we are using Docker biocontainers (build v1.9.0--h7e5ed60_1). I'll upgrade our pipeline and, if the problem seems to be resolved after a new run of the same data, close the issue.

Best, Alex
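
When the salmon binary comes from a container image, one way to confirm which release is in use is to parse the version string and compare it numerically. The following is a minimal sketch under stated assumptions: `parse_version` and `predates_fix` are hypothetical helper names, the input is assumed to contain a dotted major.minor.patch version (as in the `v1.9.0--h7e5ed60_1` build tag or the `v1.9.0` log banner above), and the 1.10.0 threshold is read from the "version 1.10" mentioned in this thread.

```python
import re


def parse_version(text):
    """Extract the first major.minor.patch triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def predates_fix(text, fixed_in=(1, 10, 0)):
    """Return True if the version in text is older than fixed_in.

    fixed_in defaults to 1.10.0, an assumption based on this thread.
    """
    return parse_version(text) < fixed_in
```

Comparing integer tuples rather than strings matters here, because as plain strings "1.10.0" sorts before "1.9.0".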