brentp / smoove

structural variant calling and genotyping with existing tools, but, smoothly.
Apache License 2.0

smoove call running out of memory (65 G) with small WGS cohort #162

Open robertwhbaldwin opened 2 years ago

robertwhbaldwin commented 2 years ago

Hello,

I'm trying to run smoove call on a cohort of 10-25 WGS samples sequenced to 20X coverage. The species is diploid and the genome size is ~2 Gb.

I have 65 GB of memory. Memory usage is 2-4 GB at the start, then grows quickly until the process is eventually killed. I've included the log below, ending with the "Killed" message.

I don't know why this is happening and thought that someone might be able to identify the problem.

I'm rerunning it now with a subset of ~5 samples to see if this works. Does running smoove on each sample separately produce different output from joint calling over all samples simultaneously?

Thank you. - Robert

/home/robert/tools/smoove call -x --name RHF --fasta /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa -p 8 --genotype /data/BAMS/northern/*.bam
[smoove] 2021/07/15 07:43:46 starting with version 0.2.7
[smoove] 2021/07/15 07:43:48 calculating bam stats for 10 bams
[smoove] 2021/07/15 07:46:49 done calculating bam stats
[smoove]: ([E]lumpy-filter) 2021/07/15 07:53:50 [lumpy_filter] extracted splits and discordants from 172392782 total aligned reads
[smoove]:2021/07/15 07:54:36 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:3m19.388178s system-time:14.901339s
[smoove]:([E]lumpy-filter) 2021/07/15 07:54:49 [lumpy_filter] extracted splits and discordants from 176415567 total aligned reads
[smoove]:2021/07/15 07:55:07 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:3m33.242629s system-time:15.519828s
[smoove]:([E]lumpy-filter) 2021/07/15 07:55:38 [lumpy_filter] extracted splits and discordants from 192361201 total aligned reads
[smoove]:2021/07/15 07:56:18 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:3m53.041825s system-time:17.127914s
[smoove]:([E]lumpy-filter) 2021/07/15 07:56:59 [lumpy_filter] extracted splits and discordants from 189846067 total aligned reads
[smoove]:2021/07/15 07:57:30 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:3m49.645732s system-time:17.792674s
[smoove]:([E]lumpy-filter) 2021/07/15 08:00:39 [lumpy_filter] extracted splits and discordants from 179446169 total aligned reads
[smoove]:2021/07/15 08:00:41 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:3m20.532419s system-time:15.603252s
[smoove]:([E]lumpy-filter) 2021/07/15 08:02:07 [lumpy_filter] extracted splits and discordants from 130316442 total aligned reads
[smoove]:([E]lumpy-filter) 2021/07/15 08:02:26 [lumpy_filter] extracted splits and discordants from 139819568 total aligned reads
[smoove]:2021/07/15 08:03:55 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:5m41.094005s system-time:39.022714s
[smoove]:2021/07/15 08:03:57 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:5m23.992311s system-time:37.995755s
[smoove]:([E]lumpy-filter) 2021/07/15 08:04:42 [lumpy_filter] extracted splits and discordants from 171057162 total aligned reads
[smoove]:([E]lumpy-filter) 2021/07/15 08:04:48 [lumpy_filter] extracted splits and discordants from 156353946 total aligned reads
[smoove]:2021/07/15 08:06:17 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:6m54.871156s system-time:49.157644s
[smoove]:2021/07/15 08:06:23 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:6m18.103409s system-time:45.441519s
[smoove]:([E]lumpy-filter) 2021/07/15 08:06:24 [lumpy_filter] extracted splits and discordants from 153337745 total aligned reads
[smoove]:2021/07/15 08:06:37 finished process: lumpy-filter (set -eu; lumpy_filter -f /home/robert/assembly/GCF_014851395.1_ASM1485139v1_genomic.fa /data/BAMS/no) in user-time:5m43.032757s system-time:42.752931s
[smoove] 2021/07/15 08:08:02 removed 3031945 alignments out of 4222370 (71.81%) with low mapq, depth > 1000, or from excluded chroms from RHF05338.disc.bam in 85 seconds
[smoove] 2021/07/15 08:08:02 removed 194472 alignments out of 4222370 (4.61%) that were bad interchromosomals or flanked-splitters from RHF05338.disc.bam
[smoove] 2021/07/15 08:08:09 kept 34487 putative orphans
[smoove] 2021/07/15 08:08:09 removed 24040 discordant orphans in 3 seconds
[smoove] 2021/07/15 08:08:12 removed 697556 singletons and isolated interchromosomals of 995953 reads (70.04%) from RHF05338.disc.bam in 9 seconds
[smoove] 2021/07/15 08:08:12 298397 reads (7.07%) of the original 4222370 remain from RHF05338.disc.bam
[smoove] 2021/07/15 08:08:13 removed 3529103 alignments out of 4856216 (72.67%) with low mapq, depth > 1000, or from excluded chroms from RHF05342.disc.bam in 95 seconds
[smoove] 2021/07/15 08:08:13 removed 222032 alignments out of 4856216 (4.57%) that were bad interchromosomals or flanked-splitters from RHF05342.disc.bam
[smoove] 2021/07/15 08:08:13 removed 2900126 alignments out of 4093323 (70.85%) with low mapq, depth > 1000, or from excluded chroms from RHF05347.disc.bam in 96 seconds
[smoove] 2021/07/15 08:08:13 removed 201299 alignments out of 4093323 (4.92%) that were bad interchromosomals or flanked-splitters from RHF05347.disc.bam
[smoove] 2021/07/15 08:08:15 removed 3335111 alignments out of 4664424 (71.50%) with low mapq, depth > 1000, or from excluded chroms from RHF05358.disc.bam in 98 seconds
[smoove] 2021/07/15 08:08:15 removed 217337 alignments out of 4664424 (4.66%) that were bad interchromosomals or flanked-splitters from RHF05358.disc.bam
[smoove] 2021/07/15 08:08:18 kept 32732 putative orphans
[smoove] 2021/07/15 08:08:18 removed 27903 discordant orphans in 3 seconds
[smoove] 2021/07/15 08:08:19 kept 37088 putative orphans
[smoove] 2021/07/15 08:08:19 removed 32878 discordant orphans in 3 seconds
[smoove] 2021/07/15 08:08:21 removed 690221 singletons and isolated interchromosomals of 991898 reads (69.59%) from RHF05347.disc.bam in 8 seconds
[smoove] 2021/07/15 08:08:21 301677 reads (7.37%) of the original 4093323 remain from RHF05347.disc.bam
[smoove] 2021/07/15 08:08:21 kept 37887 putative orphans
[smoove] 2021/07/15 08:08:21 removed 26274 discordant orphans in 3 seconds
[smoove] 2021/07/15 08:08:22 removed 772487 singletons and isolated interchromosomals of 1105081 reads (69.90%) from RHF05342.disc.bam in 9 seconds
[smoove] 2021/07/15 08:08:22 332594 reads (6.85%) of the original 4856216 remain from RHF05342.disc.bam
[smoove] 2021/07/15 08:08:24 removed 780588 singletons and isolated interchromosomals of 1111976 reads (70.20%) from RHF05358.disc.bam in 9 seconds
[smoove] 2021/07/15 08:08:24 331388 reads (7.10%) of the original 4664424 remain from RHF05358.disc.bam
[smoove] 2021/07/15 08:08:42 removed 1355068 alignments out of 1998885 (67.79%) with low mapq, depth > 1000, or from excluded chroms from RHF05301.split.bam in 20 seconds
[smoove] 2021/07/15 08:08:42 removed 150502 alignments out of 1998885 (7.53%) that were bad interchromosomals or flanked-splitters from RHF05301.split.bam
[smoove] 2021/07/15 08:08:43 removed 1207362 alignments out of 1794274 (67.29%) with low mapq, depth > 1000, or from excluded chroms from RHF05302.split.bam in 19 seconds
[smoove] 2021/07/15 08:08:43 removed 137231 alignments out of 1794274 (7.65%) that were bad interchromosomals or flanked-splitters from RHF05302.split.bam
[smoove] 2021/07/15 08:08:48 removed 3203001 alignments out of 4412465 (72.59%) with low mapq, depth > 1000, or from excluded chroms from RHF05397.disc.bam in 35 seconds
[smoove] 2021/07/15 08:08:48 removed 203693 alignments out of 4412465 (4.62%) that were bad interchromosomals or flanked-splitters from RHF05397.disc.bam
[smoove] 2021/07/15 08:08:52 kept 34684 putative orphans
[smoove] 2021/07/15 08:08:52 removed 24746 discordant orphans in 2 seconds
[smoove] 2021/07/15 08:08:55 removed 704585 singletons and isolated interchromosomals of 1005771 reads (70.05%) from RHF05397.disc.bam in 7 seconds
[smoove] 2021/07/15 08:08:55 301186 reads (6.83%) of the original 4412465 remain from RHF05397.disc.bam
[smoove] 2021/07/15 08:09:17 removed 1428862 alignments out of 2106456 (67.83%) with low mapq, depth > 1000, or from excluded chroms from RHF05338.split.bam in 22 seconds
[smoove] 2021/07/15 08:09:17 removed 185289 alignments out of 2106456 (8.80%) that were bad interchromosomals or flanked-splitters from RHF05338.split.bam
[smoove] 2021/07/15 08:09:28 kept 28017 putative orphans
[smoove] 2021/07/15 08:09:28 removed 10363 split orphans in 6 seconds
[smoove] 2021/07/15 08:09:30 removed 208083 singletons of 492305 reads (42.27%) from RHF05338.split.bam in 12 seconds
[smoove] 2021/07/15 08:09:30 284222 reads (13.49%) of the original 2106456 remain from RHF05338.split.bam
[smoove] 2021/07/15 08:09:57 removed 1677377 alignments out of 2472764 (67.83%) with low mapq, depth > 1000, or from excluded chroms from RHF05342.split.bam in 27 seconds
[smoove] 2021/07/15 08:09:57 removed 211424 alignments out of 2472764 (8.55%) that were bad interchromosomals or flanked-splitters from RHF05342.split.bam
[smoove] 2021/07/15 08:10:11 kept 35357 putative orphans
[smoove] 2021/07/15 08:10:11 removed 14547 split orphans in 8 seconds
[smoove] 2021/07/15 08:10:13 removed 244133 singletons of 583963 reads (41.81%) from RHF05342.split.bam in 16 seconds
[smoove] 2021/07/15 08:10:13 339830 reads (13.74%) of the original 2472764 remain from RHF05342.split.bam
[smoove] 2021/07/15 08:10:31 kept 164385 putative orphans
[smoove] 2021/07/15 08:10:31 removed 668 split orphans in 94 seconds
[smoove] 2021/07/15 08:10:33 removed 2601 singletons of 449681 reads (0.58%) from RHF05302.split.bam in 110 seconds
[smoove] 2021/07/15 08:10:33 447080 reads (24.92%) of the original 1794274 remain from RHF05302.split.bam
[smoove] 2021/07/15 08:10:36 kept 168436 putative orphans
[smoove] 2021/07/15 08:10:36 removed 762 split orphans in 99 seconds
[smoove] 2021/07/15 08:10:38 removed 2879 singletons of 493315 reads (0.58%) from RHF05301.split.bam in 116 seconds
[smoove] 2021/07/15 08:10:38 490436 reads (24.54%) of the original 1998885 remain from RHF05301.split.bam
[smoove] 2021/07/15 08:10:38 removed 1393084 alignments out of 2069913 (67.30%) with low mapq, depth > 1000, or from excluded chroms from RHF05347.split.bam in 24 seconds
[smoove] 2021/07/15 08:10:38 removed 181545 alignments out of 2069913 (8.77%) that were bad interchromosomals or flanked-splitters from RHF05347.split.bam
[smoove] 2021/07/15 08:10:50 kept 30517 putative orphans
[smoove] 2021/07/15 08:10:50 removed 12489 split orphans in 7 seconds
[smoove] 2021/07/15 08:10:52 removed 212202 singletons of 495284 reads (42.84%) from RHF05347.split.bam in 14 seconds
[smoove] 2021/07/15 08:10:52 283082 reads (13.68%) of the original 2069913 remain from RHF05347.split.bam
[smoove] 2021/07/15 08:10:55 removed 1205923 alignments out of 1797653 (67.08%) with low mapq, depth > 1000, or from excluded chroms from RHF05350.split.bam in 21 seconds
[smoove] 2021/07/15 08:10:55 removed 137216 alignments out of 1797653 (7.63%) that were bad interchromosomals or flanked-splitters from RHF05350.split.bam
[smoove] 2021/07/15 08:11:02 removed 1687161 alignments out of 2485258 (67.89%) with low mapq, depth > 1000, or from excluded chroms from RHF05357.split.bam in 23 seconds
[smoove] 2021/07/15 08:11:02 removed 182539 alignments out of 2485258 (7.34%) that were bad interchromosomals or flanked-splitters from RHF05357.split.bam
[smoove] 2021/07/15 08:11:17 removed 1547185 alignments out of 2273613 (68.05%) with low mapq, depth > 1000, or from excluded chroms from RHF05358.split.bam in 24 seconds
[smoove] 2021/07/15 08:11:17 removed 199322 alignments out of 2273613 (8.77%) that were bad interchromosomals or flanked-splitters from RHF05358.split.bam
[smoove] 2021/07/15 08:11:27 kept 29793 putative orphans
[smoove] 2021/07/15 08:11:27 removed 10617 split orphans in 5 seconds
[smoove] 2021/07/15 08:11:29 removed 226579 singletons of 527106 reads (42.99%) from RHF05358.split.bam in 12 seconds
[smoove] 2021/07/15 08:11:29 300527 reads (13.22%) of the original 2273613 remain from RHF05358.split.bam
[smoove] 2021/07/15 08:11:53 removed 1444973 alignments out of 2126656 (67.95%) with low mapq, depth > 1000, or from excluded chroms from RHF05397.split.bam in 23 seconds
[smoove] 2021/07/15 08:11:53 removed 186961 alignments out of 2126656 (8.79%) that were bad interchromosomals or flanked-splitters from RHF05397.split.bam
[smoove] 2021/07/15 08:12:03 kept 28504 putative orphans
[smoove] 2021/07/15 08:12:03 removed 10645 split orphans in 6 seconds
[smoove] 2021/07/15 08:12:06 removed 210393 singletons of 494722 reads (42.53%) from RHF05397.split.bam in 13 seconds
[smoove] 2021/07/15 08:12:06 284329 reads (13.37%) of the original 2126656 remain from RHF05397.split.bam
[smoove] 2021/07/15 08:12:27 removed 1374656 alignments out of 2045653 (67.20%) with low mapq, depth > 1000, or from excluded chroms from RHF05398.split.bam in 21 seconds
[smoove] 2021/07/15 08:12:27 removed 157085 alignments out of 2045653 (7.68%) that were bad interchromosomals or flanked-splitters from RHF05398.split.bam
[smoove] 2021/07/15 08:12:35 kept 163490 putative orphans
[smoove] 2021/07/15 08:12:35 removed 793 split orphans in 84 seconds
[smoove] 2021/07/15 08:12:37 removed 2840 singletons of 454514 reads (0.62%) from RHF05350.split.bam in 102 seconds
[smoove] 2021/07/15 08:12:37 451674 reads (25.13%) of the original 1797653 remain from RHF05350.split.bam
[smoove] 2021/07/15 08:13:57 kept 219280 putative orphans
[smoove] 2021/07/15 08:13:57 removed 875 split orphans in 150 seconds
[smoove] 2021/07/15 08:14:00 removed 3462 singletons of 615558 reads (0.56%) from RHF05357.split.bam in 178 seconds
[smoove] 2021/07/15 08:14:00 612096 reads (24.63%) of the original 2485258 remain from RHF05357.split.bam
[smoove] 2021/07/15 08:14:19 kept 174179 putative orphans
[smoove] 2021/07/15 08:14:19 removed 755 split orphans in 95 seconds
[smoove] 2021/07/15 08:14:20 removed 3012 singletons of 513912 reads (0.59%) from RHF05398.split.bam in 114 seconds
[smoove] 2021/07/15 08:14:20 510900 reads (24.97%) of the original 2045653 remain from RHF05398.split.bam
[smoove] 2021/07/15 08:15:44 removed 30539654 alignments out of 130316442 (23.43%) with low mapq, depth > 1000, or from excluded chroms from RHF05350.disc.bam in 547 seconds
[smoove] 2021/07/15 08:15:44 removed 629811 alignments out of 130316442 (0.48%) that were bad interchromosomals or flanked-splitters from RHF05350.disc.bam
[smoove] 2021/07/15 08:16:07 removed 37055508 alignments out of 156353946 (23.70%) with low mapq, depth > 1000, or from excluded chroms from RHF05301.disc.bam in 570 seconds
[smoove] 2021/07/15 08:16:07 removed 637914 alignments out of 156353946 (0.41%) that were bad interchromosomals or flanked-splitters from RHF05301.disc.bam
[smoove] 2021/07/15 08:16:14 removed 32425137 alignments out of 139819568 (23.19%) with low mapq, depth > 1000, or from excluded chroms from RHF05302.disc.bam in 576 seconds
[smoove] 2021/07/15 08:16:14 removed 578778 alignments out of 139819568 (0.41%) that were bad interchromosomals or flanked-splitters from RHF05302.disc.bam
[smoove] 2021/07/15 08:16:38 removed 40399593 alignments out of 171057162 (23.62%) with low mapq, depth > 1000, or from excluded chroms from RHF05357.disc.bam in 601 seconds
[smoove] 2021/07/15 08:16:38 removed 732772 alignments out of 171057162 (0.43%) that were bad interchromosomals or flanked-splitters from RHF05357.disc.bam
[smoove] 2021/07/15 08:17:11 removed 35358941 alignments out of 153337745 (23.06%) with low mapq, depth > 1000, or from excluded chroms from RHF05398.disc.bam in 530 seconds
[smoove] 2021/07/15 08:17:11 removed 619711 alignments out of 153337745 (0.40%) that were bad interchromosomals or flanked-splitters from RHF05398.disc.bam
Killed

brentp commented 2 years ago

Hi, I would use the population calling instructions, or you can try running with smoove call -F, which doesn't do the extra filtering that sometimes uses a lot of memory.
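For readers unfamiliar with the population calling route, here is a rough sketch of what a per-sample workflow might look like. It is not taken from this thread: the subcommands and flags are my recollection of the smoove README and should be checked against `smoove <subcommand> -h`. The output directory names, the `reference.fa` default, and the `run`, `call_sample`, `merge_sites`, `genotype_sample`, and `paste_cohort` helpers are hypothetical. By default the script only prints the commands (`DRY_RUN=1`).

```shell
#!/usr/bin/env bash
# Sketch of per-sample ("population") calling with smoove.
# ASSUMPTION: flags are recalled from the smoove README; verify with `smoove <subcommand> -h`.
set -eu

SMOOVE=${SMOOVE:-smoove}
FASTA=${FASTA:-reference.fa}   # hypothetical reference path
COHORT=${COHORT:-cohort}       # hypothetical cohort name
DRY_RUN=${DRY_RUN:-1}          # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: call each sample on its own, so only one BAM is processed per job.
call_sample() {  # usage: call_sample <bam>
  sample=$(basename "$1" .bam)
  run "$SMOOVE" call --outdir results-smoove/ --name "$sample" \
    --fasta "$FASTA" -p 1 --genotype "$1"
}

# Step 2: merge the per-sample calls into a single set of sites.
merge_sites() {
  run "$SMOOVE" merge --name merged -f "$FASTA" --outdir ./ \
    results-smoove/*.genotyped.vcf.gz
}

# Step 3: genotype each sample at the merged sites.
# (As I recall, the README also passes -d here, which assumes duphold is installed.)
genotype_sample() {  # usage: genotype_sample <bam>
  sample=$(basename "$1" .bam)
  run "$SMOOVE" genotype -x -p 1 --name "$sample-joint" \
    --outdir results-genotyped/ --fasta "$FASTA" \
    --vcf merged.sites.vcf.gz "$1"
}

# Step 4: paste the per-sample genotyped VCFs into one cohort VCF.
paste_cohort() {
  run "$SMOOVE" paste --name "$COHORT" results-genotyped/*.vcf.gz
}

# Run all four steps over the BAMs given on the command line, if any.
if [ "$#" -gt 0 ]; then
  for bam in "$@"; do call_sample "$bam"; done
  merge_sites
  for bam in "$@"; do genotype_sample "$bam"; done
  paste_cohort
fi
```

Saved under a hypothetical name such as `smoove_population.sh`, it could be previewed with `bash smoove_population.sh /data/BAMS/northern/*.bam` and executed with `DRY_RUN=0`. The other option mentioned above is simpler: rerun the original joint command with `-F` added.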