brentp / smoove

structural variant calling and genotyping with existing tools, but, smoothly.
Apache License 2.0

Smoove not working !!!!!! Failed to read from standard input: unknown file type panic: write |1: broken pipe #200

Open mansi800 opened 1 year ago

mansi800 commented 1 year ago

@brentp Smoove is not working even though my BAM file is large enough. Command used:

smoove call --outdir test_mansi/SRR6745447_output/smoove_output/ --name SRR6745447 --fasta part_II/references/hg38.fa -p 1 --genotype part_II/output/fq2bam_no_indels/SRR6745447_sorted.bam

The size of the BAM is 72 GB (WGS BAM).

smoove --help output:

smoove version: 0.2.8

smoove calls several programs. Those with 'Y' are found on your $PATH. Only those with '*' are required.

[Y] bgzip [ sort -> (compress) -> index ]
[Y] gsort [(sort) -> compress -> index ]
[Y] tabix [ sort -> compress -> (index)]
[Y] lumpy
[Y] lumpy_filter
[Y] samtools
[Y] svtyper
[Y] mosdepth [extra filtering of split and discordant files for better scaling]

[ ] duphold [(optional) annotate calls with depth changes]
[ ] svtools [only needed for large cohorts].

Available sub-commands are below. Each can be run with -h for additional help.

call : call lumpy (and optionally svtyper)
merge : merge and sort (using svtools) calls from multiple samples
genotype : parallelize svtyper on an input VCF
paste : square final calls from multiple samples (each with same number of variants)
plot-counts : plot counts of split, discordant reads before, after smoove filtering
annotate : annotate a VCF with gene and quality of SV call
hipstr : run hipSTR in parallel
duphold : run duphold in parallel (this can be done by adding a flag to call or genotype)

Error log:

[smoove] 2022/07/06 19:24:11 starting with version 0.2.8
[smoove] 2022/07/06 19:24:11 calculating bam stats for 1 bams
[smoove] 2022/07/06 19:24:20 done calculating bam stats
[smoove] 2022/07/06 19:24:20 removed 0 alignments out of 3232 (0.00%) with low mapq, depth > 1000, or from excluded chroms from sample.disc.bam in 1 seconds
[smoove] 2022/07/06 19:24:20 removed 0 alignments out of 3232 (0.00%) that were bad interchromosomals or flanked-splitters from sample.disc.bam
[smoove] 2022/07/06 19:24:20 kept 0 putative orphans
[smoove] 2022/07/06 19:24:20 removed 0 discordant orphans in 0 seconds
[smoove] 2022/07/06 19:24:20 removed 0 singletons and isolated interchromosomals of 3232 reads (0.00%) from sample.disc.bam in 0 seconds
[smoove] 2022/07/06 19:24:20 3232 reads (100.00%) of the original 3232 remain from sample.disc.bam
[smoove] 2022/07/06 19:24:21 removed 0 alignments out of 6258 (0.00%) with low mapq, depth > 1000, or from excluded chroms from sample.split.bam in 1 seconds
[smoove] 2022/07/06 19:24:21 removed 0 alignments out of 6258 (0.00%) that were bad interchromosomals or flanked-splitters from sample.split.bam
[smoove] 2022/07/06 19:24:21 kept 555 putative orphans
[smoove] 2022/07/06 19:24:21 removed 22 split orphans in 0 seconds
[smoove] 2022/07/06 19:24:21 removed 0 singletons of 6258 reads (0.00%) from sample.split.bam in 0 seconds
[smoove] 2022/07/06 19:24:21 6258 reads (100.00%) of the original 6258 remain from sample.split.bam
[smoove] 2022/07/06 19:24:21 starting lumpy
[smoove] 2022/07/06 19:24:21 wrote lumpy command to /mnt/ST160/part_II/test_mansi/SRR6745447_output/smoove_output//SRR6745447-lumpy-cmd.sh
[smoove] 2022/07/06 19:24:21 writing sorted, indexed file to /mnt/ST160/part_II/test_mansi/SRR6745447_output/smoove_output/SRR6745447-smoove.genotyped.vcf.gz
[smoove] 2022/07/06 19:24:21 excluding variants with all unknown or homozygous reference genotypes
[smoove] 2022/07/06 19:24:21 chr22 1000000 chr22 2000000 chr22 4000000 chr22 8000000 chr22 16000000
[smoove] 2022/07/06 19:24:21 Failed to read from standard input: unknown file type
panic: write |1: broken pipe

goroutine 20112 [running]:
github.com/brentp/smoove/svtyper.check(...)
/home/brentp/src/smoove/svtyper/svtyper.go:33
github.com/brentp/smoove/svtyper.Svtyper.func2(0xc0007a8a20, 0xc0002317d0, 0x1, 0x1, 0x7ffc34697ca7, 0x40, 0xc000027830, 0x2c, 0xc001464000, 0xc001464008, ...)
/home/brentp/src/smoove/svtyper/svtyper.go:189 +0xa85
created by github.com/brentp/smoove/svtyper.Svtyper
/home/brentp/src/smoove/svtyper/svtyper.go:165 +0xb87

Please help me out as soon as possible. Thanks in advance.

brentp commented 1 year ago

Hi, your entire bam file has only 3232 discordant reads and 6258 split reads. That is far too few. Your bam file will not work with smoove. Either it's low coverage, or targeted sequencing, or there is some other problem.
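The check described above (reading the discordant and split read counts off the smoove log) can be sketched in a few lines of Python. This is only an illustration: the function names `remaining_reads` and `suspicious` and the `MIN_READS` threshold are hypothetical and not part of smoove; the regular expression is written against the summary lines quoted in this thread and may not match other smoove versions.

```python
import re

# Matches smoove summary lines as they appear in this thread, e.g.
# "[smoove] 2022/07/06 19:24:20 3232 reads (100.00%) of the original 3232 remain from sample.disc.bam"
SUMMARY = re.compile(
    r"(\d+) reads \([^)]*\) of the original (\d+) remain from (\S+)\.(disc|split)\.bam"
)

# Hypothetical cutoff for illustration only; smoove itself defines no such threshold.
MIN_READS = 100_000


def remaining_reads(log_text):
    """Return {(sample, kind): remaining_count} for every summary line found."""
    counts = {}
    for match in SUMMARY.finditer(log_text):
        remaining, _original, sample, kind = match.groups()
        counts[(sample, kind)] = int(remaining)
    return counts


def suspicious(counts, minimum=MIN_READS):
    """List the (sample, kind) entries whose remaining read count is below `minimum`."""
    return sorted(key for key, n in counts.items() if n < minimum)


# Two summary lines copied from the first log in this thread.
example_log = (
    "[smoove] 2022/07/06 19:24:20 3232 reads (100.00%) of the original 3232 "
    "remain from sample.disc.bam\n"
    "[smoove] 2022/07/06 19:24:21 6258 reads (100.00%) of the original 6258 "
    "remain from sample.split.bam\n"
)

if __name__ == "__main__":
    for sample, kind in suspicious(remaining_reads(example_log)):
        print(f"{sample}: very few {kind} reads, check coverage and alignment")
```

Run on a saved log, this flags samples whose `.disc.bam` or `.split.bam` ends up nearly empty before lumpy and svtyper are blamed for the crash.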

karlkashofer commented 1 year ago

I have a similar problem:

` karl@sx249:~$ docker logs -f 50ecd495d6a6
[smoove] 2023/01/22 19:21:42 starting with version 0.2.8
[smoove] 2023/01/22 19:21:42 calculating bam stats for 1 bams
[smoove] 2023/01/22 19:21:51 done calculating bam stats
[smoove]: 2023/01/22 19:24:48 finished process: lumpy-filter (set -eu; lumpy_filter -f /cromwell-executions/Agilent_Exome_Single/26b18afb-a1be-4c1c-90e8-cbe3c1aaf) in user-time:4m48.784s system-time:41.264s
[smoove] 2023/01/22 19:24:48 removed 0 alignments out of 0 (NaN%) with low mapq, depth > 1000, or from excluded chroms from D2757-7.disc.bam in 0 seconds
[smoove] 2023/01/22 19:24:48 removed 0 alignments out of 0 (NaN%) that were bad interchromosomals or flanked-splitters from D2757-7.disc.bam
[smoove] 2023/01/22 19:24:48 kept 0 putative orphans
[smoove] 2023/01/22 19:24:48 removed 0 discordant orphans in 0 seconds
[smoove] 2023/01/22 19:24:48 removed 0 singletons and isolated interchromosomals of 0 reads (NaN%) from D2757-7.disc.bam in 0 seconds
[smoove] 2023/01/22 19:24:48 0 reads (NaN%) of the original 0 remain from D2757-7.disc.bam
[smoove] 2023/01/22 19:26:00 removed 92380 alignments out of 1726870 (5.35%) with low mapq, depth > 1000, or from excluded chroms from D2757-7.split.bam in 72 seconds
[smoove] 2023/01/22 19:26:00 removed 52793 alignments out of 1726870 (3.06%) that were bad interchromosomals or flanked-splitters from D2757-7.split.bam
[smoove] 2023/01/22 20:23:57 kept 540516 putative orphans
[smoove] 2023/01/22 20:23:57 removed 669 split orphans in 2759 seconds
[smoove] 2023/01/22 20:24:12 removed 930793 singletons of 1581697 reads (58.85%) from D2757-7.split.bam in 3491 seconds
[smoove] 2023/01/22 20:24:12 650904 reads (37.69%) of the original 1726870 remain from D2757-7.split.bam
[smoove] 2023/01/22 20:24:13 starting lumpy
[smoove] 2023/01/22 20:24:13 wrote lumpy command to ./structural-variants/smoove/D2757-7-lumpy-cmd.sh
[smoove] 2023/01/22 20:24:13 writing sorted, indexed file to structural-variants/smoove/D2757-7-smoove.genotyped.vcf.gz
[smoove] 2023/01/22 20:24:13 excluding variants with all unknown or homozygous reference genotypes
[smoove] 2023/01/22 20:24:13 > gsort version 0.1.4
[smoove] 2023/01/22 20:24:13 chr1 1000000
[smoove] 2023/01/22 20:24:15 chr2 1000000
[smoove] 2023/01/22 20:24:15
[smoove] 2023/01/22 20:24:16 chr3 1000000
[smoove] 2023/01/22 20:24:18 chr4 1000000
[smoove] 2023/01/22 20:24:18
[smoove] 2023/01/22 20:24:19 chr5 1000000
[smoove] 2023/01/22 20:24:20 chr6 1000000
[smoove] 2023/01/22 20:24:21 chr7 1000000
[smoove] 2023/01/22 20:24:22 chr8 1000000
[smoove] 2023/01/22 20:24:23 chr9 1000000
[smoove] 2023/01/22 20:24:23 chr10 1000000
[smoove] 2023/01/22 20:24:23
[smoove] 2023/01/22 20:24:24 chr11 1000000
[smoove] 2023/01/22 20:24:26 chr12 1000000
[smoove] 2023/01/22 20:24:27 chr13 1000000 chr13 2000000 chr13 4000000
[smoove] 2023/01/22 20:24:27 chr13 8000000 chr13 16000000 chr13 32000000
[smoove] 2023/01/22 20:24:27 chr14 1000000 chr14 2000000 chr14 4000000
[smoove] 2023/01/22 20:24:27 chr14 8000000 chr14 16000000 chr14 32000000
[smoove] 2023/01/22 20:24:28 chr15 1000000 chr15 2000000
[smoove] 2023/01/22 20:24:28 chr15 4000000 chr15 8000000 chr15 16000000 chr15 32000000
[smoove] 2023/01/22 20:24:29 chr16 1000000
[smoove] 2023/01/22 20:24:29
[smoove] 2023/01/22 20:24:29 chr17 1000000
[smoove] 2023/01/22 20:24:30 chr18 1000000
[smoove] 2023/01/22 20:24:31 chr19 1000000
[smoove] 2023/01/22 20:24:32 chr20 1000000
[smoove] 2023/01/22 20:24:32 chr21 1000000
[smoove] 2023/01/22 20:24:32 chr21 2000000 chr21 4000000
[smoove] 2023/01/22 20:24:32 chr21 8000000
[smoove] 2023/01/22 20:24:32
[smoove] 2023/01/22 20:24:32 chr22 1000000
[smoove] 2023/01/22 20:24:32 chr22 2000000 chr22 4000000
[smoove] 2023/01/22 20:24:32 chr22 8000000
[smoove] 2023/01/22 20:24:32
[smoove] 2023/01/22 20:24:32 chr22 16000000
[smoove] 2023/01/22 20:24:32
[smoove] 2023/01/22 20:24:33 chrX 1000000
[smoove] 2023/01/22 20:24:33 chrY 1000000 chrY 2000000 chrY
[smoove] 2023/01/22 20:24:33 4000000
[smoove] 2023/01/22 20:24:34 chrM 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270706v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270708v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270709v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270710v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270711v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270712v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr1_KI270714v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr2_KI270716v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr3_GL000221v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr4_GL000008v2_random 1000000
[smoove] 2023/01/22 20:24:34 chr9_KI270719v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr9_KI270720v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr14_GL000009v2_random 1000000
[smoove] 2023/01/22 20:24:34 chr14_GL000225v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr14_KI270722v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr14_GL000194v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr14_KI270724v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr14_KI270725v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr15_KI270727v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr16_KI270728v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr16_KI270728v1_random 2000000
[smoove] 2023/01/22 20:24:34 chr17_GL000205v2_random 1000000
[smoove] 2023/01/22 20:24:34 chr17_KI270729v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr17_KI270730v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr22_KI270733v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr22_KI270735v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr22_KI270736v1_random 1000000
[smoove] 2023/01/22 20:24:34 chr22_KI270738v1_random 1000000
[smoove] 2023/01/22 20:24:34 chrY_KI270740v1_random 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270442v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270466v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270467v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270435v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270438v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270512v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270519v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270538v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270589v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270591v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000195v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000219v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000220v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000224v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270743v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270744v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270746v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270749v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270750v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270751v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270754v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270756v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_KI270757v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000214v1 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000216v2 1000000
[smoove] 2023/01/22 20:24:34 chrUn_GL000218v1 1000000
[smoove] 2023/01/22 20:24:34 panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x546d43]

goroutine 1 [running]:
[smoove] 2023/01/22 20:24:34 bufio.NewReaderSize(...)
/home/brentp/go/src/bufio/bufio.go:49
bufio.NewReader(...)
/home/brentp/go/src/bufio/bufio.go:62
github.com/brentp/gsort.Sort(0x7ad380, 0x0, 0x7ad3a0, 0xc000159440, 0xc000159400, 0xaf0, 0x0, 0x0, 0x0)
/home/brentp/go/go/src/github.com/brentp/gsort/gsort.go:116 +0x63
main.main()
[smoove] 2023/01/22 20:24:34 /home/brentp/go/go/src/github.com/brentp/gsort/cmd/gsort/gsort.go:373 +0x513
[smoove] 2023/01/22 20:24:34 Failed to read from standard input: unknown file type
[smoove] 2023/01/22 20:24:34 Failed to read from standard input: unknown file type
panic: exit status 255

goroutine 1 [running]:
github.com/brentp/smoove/svtyper.check(...)
/home/brentp/src/smoove/svtyper/svtyper.go:33
github.com/brentp/smoove/svtyper.Svtyper(0xc14cc0, 0xc0047de0a8, 0x7ffe2c8eca32, 0xe3, 0xc0001977d0, 0x1, 0x1, 0x7ffe2c8ec9fe, 0x1c, 0x7ffe2c8eca22, ...)
/home/brentp/src/smoove/svtyper/svtyper.go:226 +0x190a
github.com/brentp/smoove/lumpy.Main()
/home/brentp/src/smoove/lumpy/lumpy.go:363 +0x4ab
main.main()
/home/brentp/src/smoove/cmd/smoove/smoove.go:121 +0x1c4
karl@sx249:~$

`

This is Illumina WGS data (200x) from a human tumor. Any idea why this fails?

brentp commented 1 year ago

@karlkashofer note this:

[smoove] 2023/01/22 19:24:48 removed 0 alignments out of 0 (NaN%) with low mapq, depth > 1000, or from excluded chroms from D2757-7.disc.bam in 0 seconds
[smoove] 2023/01/22 19:24:48 removed 0 alignments out of 0 (NaN%) that were bad interchromosomals or flanked-splitters from D2757-7.disc.bam
[smoove] 2023/01/22 19:24:48 kept 0 putative orphans
[smoove] 2023/01/22 19:24:48 removed 0 discordant orphans in 0 seconds
[smoove] 2023/01/22 19:24:48 removed 0 singletons and isolated interchromosomals of 0 reads (NaN%) from D2757-7.disc.bam in 0 seconds
[smoove] 2023/01/22 19:24:48 0 reads (NaN%) of the original 0 remain from D2757-7.disc.bam

So you don't have any discordant reads. Make sure to align with bwa mem to use smoove.

karlkashofer commented 1 year ago

Hello!

Thanks a lot for taking the time to get to the root of this! This is very weird; I would not expect 0 discordant reads from a whole exome with 200 million reads. The reads were aligned with bwa-mem2. Could that be a problem? https://github.com/bwa-mem2/bwa-mem2

This is the mapping command line:

bwa-mem2 mem -t 16 -C -M -R '@RG\tID:D2757-7\tSM:D2757-7\tLB:D2757-7\tPL:ILLUMINA' $ref ./Read1.fq.gz ./Read2.fq.gz | samtools view -@4 -h -u -f2 -U unmapped - | samtools sort -T . -@4 --write-index --reference $ref -o D2757-7.bam##idx##D2757-7.bam.bai -

I do split out unmapped reads, but I don't see how that could be responsible.

Thanks for your help, cheers, KK

karlkashofer commented 1 year ago

I found the reason for my problem.

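I had filtered out all the interesting reads with the samtools -f2 flag, which keeps only reads flagged as "proper read pairs". Apparently discordant reads are not "proper read pairs" by samtools' definition, so none of them made it into the BAM. Now I have discordant reads and smoove works.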

[smoove]: 2023/01/25 23:14:35 finished process: lumpy-filter (set -eu; lumpy_filter -f /cromwell-executions/Agilent_Exome_Single/e9bfd86e-ec90-4ed0-87e2-76c975abd) in user-time:6m39.524s system-time:47.496s
[smoove] 2023/01/25 23:16:19 removed 332227 alignments out of 2844825 (11.68%) with low mapq, depth > 1000, or from excluded chroms from D2757-7.split.bam in 104 seconds
[smoove] 2023/01/25 23:16:19 removed 97505 alignments out of 2844825 (3.43%) that were bad interchromosomals or flanked-splitters from D2757-7.split.bam
[smoove] 2023/01/25 23:16:38 removed 548674 alignments out of 3474391 (15.79%) with low mapq, depth > 1000, or from excluded chroms from D2757-7.disc.bam in 123 seconds
[smoove] 2023/01/25 23:16:38 removed 156855 alignments out of 3474391 (4.51%) that were bad interchromosomals or flanked-splitters from D2757-7.disc.bam
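To illustrate why the -f2 filter removed every discordant read, here is a minimal Python sketch of how `samtools view -f` and `-F` select reads by SAM flag bits. The bit values come from the SAM specification; the helper names `kept_by_f` and `kept_by_F` and the example flag values are made up for illustration and are not taken from the BAM in this thread. The `-F 4` variant at the end is only an assumption about what "split out unmapped reads" was meant to do; the thread does not say which command was finally used.

```python
# SAM flag bits (values from the SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40


def kept_by_f(flag, required):
    """Mimic `samtools view -f required`: keep a read only if all required bits are set."""
    return (flag & required) == required


def kept_by_F(flag, excluded):
    """Mimic `samtools view -F excluded`: keep a read only if no excluded bit is set."""
    return (flag & excluded) == 0


# Illustrative flags: a properly paired read, a discordant read (mapped and
# paired, but without the proper-pair bit), and an unmapped read.
proper = PAIRED | PROPER_PAIR | MATE_REVERSE | FIRST_IN_PAIR   # 99
discordant = PAIRED | MATE_REVERSE | FIRST_IN_PAIR             # 97
unmapped = PAIRED | UNMAPPED | FIRST_IN_PAIR                   # 69

if __name__ == "__main__":
    for name, flag in [("proper", proper), ("discordant", discordant), ("unmapped", unmapped)]:
        print(name, flag,
              "-f 2 keeps:", kept_by_f(flag, PROPER_PAIR),
              "-F 4 keeps:", kept_by_F(flag, UNMAPPED))
```

With `-f 2`, the discordant read is dropped along with the unmapped one, which is consistent with the empty `.disc.bam` in the earlier log. Excluding only the unmapped bit would keep it.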

Sorry for the noise, all the best, K