Open anoronh4 opened 2 years ago
Hope all is well. Just wondering if you need any further information/files from me to answer this question. By the way I'm using the quay.io/wtsicgp/brass:v6.3.4
container. As per the specification, VCF files should contain 1-based coordinates: https://samtools.github.io/hts-specs/VCFv4.3.pdf
just wondering if there was any update on this.
i have run into this issue again with a different sample:
$ zgrep -v "^#" /path/to/brass/sample.annot.vcf.gz | awk -F"\t" '$2 == 0'
3 0 5631_2 0 [3:80865736[0 5 . SVTYPE=BND;MATEID=5631_1;TSRDS=A00227:541:HM73TDSX3:2:1321:30101:27132,A00227:541:HM73TDSX3:3:1137:14832:13025,A00227:541:HM73TDSX3:3:1163:23972:14669,A00227:541:HM73TDSX3:4:2455:30120:12540,A00333:514:HLYMTDSX3:2:1409:8630:9455;BKDIST=80865737;SVCLASS=inversion RC:PS 0:0:5
3 0 5695_2 0 [3:80870254[0 8 . SVTYPE=BND;MATEID=5695_1;TSRDS=A00227:541:HM73TDSX3:1:1328:22155:15562,A00227:541:HM73TDSX3:1:2469:27588:14857,A00227:541:HM73TDSX3:2:1608:9751:25708,A00227:541:HM73TDSX3:2:1647:23728:25739,A00227:541:HM73TDSX3:2:2305:20654:23077,A00227:541:HM73TDSX3:3:2203:7726:14904,A00333:514:HLYMTDSX3:2:1433:9932:16313,A00333:514:HLYMTDSX3:2:2636:32958:14575;BKDIST=80870255;SVCLASS=inversion RC:PS 0:0 0:8
3 0 6174_2 0 0]3:80911681] 4 . SVTYPE=BND;MATEID=6174_1;TSRDS=A00227:541:HM73TDSX3:1:1477:31159:22138,A00227:541:HM73TDSX3:1:2540:17987:9596,A00227:541:HM73TDSX3:4:1518:21640:27101,A00227:541:HM73TDSX3:4:2214:22019:20870;BKDIST=80911682;SVCLASS=inversion RC:PS 0:0 0:4
This time, the issue appears 3 times in the same output vcf. Does your team plan to look into this issue at all?
interestingly, when i searched the brm.bam file for the first read mentioned in those vcf lines, it does not reconcile with the vcf:
$ samtools view /path/to/brass/tmpBrass/sample.brm.bam | grep A00227:541:HM73TDSX3:2:1321:30101:27132
A00227:541:HM73TDSX3:2:1321:30101:27132 145 3 80865728 60 39S112M = 80866798 960 AGAGCGATCAGGCAAGAGAAGAAATGAAGGGCATTCAAGCAGGAATACCTCAACCCCTCCAACACAGCAGGCACAGCAGCTTCCCAAATTCGAGGGGACGGAGAACAAAGCTGGGGGCCTGTTACCAGTCCCCAAGAGTTAGAGTCCACAG 1<:=5<+<=;<==;;=;=;;=8;77=;;==:=:=<=;;==;==;:(:=><=98=88>9=8,;>;>1:>1>>>;><>>=?@?>???==<?:7?=??8?<7>?=;<;,<<<>?=>>98;>?==:<;>=;<<====;2=<<<::<::::<9;98 SA:Z:3,80866832,-,45M106S,60,0; MC:Z:79M72S MD:Z:112 PG:Z:MarkDuplicateRG:Z:TA00227@541@HM73TDSX3@2 NM:i:0 AS:i:112 XS:i:33
A00227:541:HM73TDSX3:2:1321:30101:27132 97 3 80866798 60 79M72S = 80865728 -960 ACTCCTATTCAACACAGTACTAGAAGTCCTGGCCAGAGCGATCAGGCAAGAGAAGAAATGAAGGGCATTCAAGCAGGAATACCTCAACCCCTCCAACACAGCAGGCACAGCAGCTTCCCAAATTCGAGGGGACGGAGAACAAAGCTGGGGG ;<<??688:=;=<<<<?;<=<<?=>?;>><>>>>=?=?>6=;>=?>?>:@>@>?@>:?<?>?@???><=?>?@?>@?>?<==?=?=?>???=??=?=>=>@?>@??>>>@?>@?:=???>?8;;>6>?>>>==6>=?=8==*??><??>=; SA:Z:3,80865728,+,73S78M,60,0; MC:Z:39S112M MD:Z:79 PG:Z:MarkDuplicates RG:Z:TA00227@541@HM73TDSX3@2 NM:i:0 AS:i:79 XS:i:31
I am having similar issue, any update BRASS team? thanks!
Would need some small test input to try to reproduce.
thanks for your response. i will need some time to come up with a small test input and make sure it can reproduce the issue. is there a way to run brass on a bam that has been filtered by region?
in the meantime, i have done a little more work to isolate the problem, finding the source of the invalid coordinate in the .r4
intermediate file:
$ cat brass/tumor_vs_normal.r4 | awk -F"\t" '$12 == 0'
3 80865824 80865850 3 80867060 80867071 5631 5 - + 80865736 0 80865736 (0) 80866775 (0) 0 (0) 0 (0) A00227:541:HM73TDSX3:2:2333:25129:13479,A00227:541:HM73TDSX3:3:2636:13286:8218,A00227:541:HM73TDSX3:4:2252:20066:33207,A00333:514:HLYMTDSX3:2:1106:31195:31657,A00333:514:HLYMTDSX3:2:1116:29740:36933,A00333:514:HLYMTDSX3:2:1409:8630:9455,A00333:514:HLYMTDSX3:2:1657:3875:9392,A00333:514:HLYMTDSX3:2:2232:2311:15452 8 1 0 0
3 80869882 80870017 3 80870790 80870983 5695 8 - + 80870254 0 80870254 (2) 80870111 (2) 0 (0) 0 (0) A00227:541:HM73TDSX3:2:2464:15980:14700,A00227:541:HM73TDSX3:3:1411:26874:24455,A00227:541:HM73TDSX3:3:2366:11333:22811,A00227:541:HM73TDSX3:3:2640:26503:3552,A00227:541:HM73TDSX3:4:1336:12292:19366,A00227:541:HM73TDSX3:4:1650:16188:34068,A00227:541:HM73TDSX3:4:1659:11234:11303,A00227:541:HM73TDSX3:4:2158:5367:6715,A00227:541:HM73TDSX3:4:2515:3821:20572,A00227:541:HM73TDSX3:4:2515:6451:20556,A00227:541:HM73TDSX3:4:2643:20383:3443,A00333:514:HLYMTDSX3:2:1506:30427:16047,A00333:514:HLYMTDSX3:2:1616:18059:12790,A00333:514:HLYMTDSX3:2:1646:27154:33364 14 2 0 0
3 80911970 80911978 3 80913732 80913783 6174 4 + - 80911681 0 80911624 (2) 80911681 (2) 0 (0) 0 (0) A00227:541:HM73TDSX3:1:2404:15302:31093,A00227:541:HM73TDSX3:3:1428:22417:22091 2 0
The 12th column of this file represents the "high_end_bkpt
" and is calculated using the following lines:
https://github.com/cancerit/BRASS/blob/7883d8cd3db614be2b06e32c288e7db344d1f2b0/perl/bin/get_abs_bkpts_from_clipped_reads.pl#L509-L514
seems to me that this should not evaluate to 0, or should get lead to the variant being filtered out?
This column is passed on to .r5
and .r6
, and then gets added to .groups.clean.bedpe
with the following lines:
https://github.com/cancerit/BRASS/blob/7883d8cd3db614be2b06e32c288e7db344d1f2b0/perl/bin/collate_rg_regions.pl#L72-L73
Does this help identify the source of the problem?
Hi,
To tell you the truth I haven't previously worked with BRASS. However since I didn't see anyone else picking up the issue I decided to give it a try. That is why asked for the test otherwise it might take me too long to figure out.
From: anoronh4 @.> Sent: Tuesday, October 4, 2022 9:10 PM To: cancerit/BRASS @.> Cc: Raul Alcantara @.>; Comment @.> Subject: Re: [cancerit/BRASS] getting POS of 0 (Issue #110)
thanks for your response. i will need some time to come up with a small test input and make sure it can reproduce the issue. is there a way to run brass on a bam that has been filtered by region?
in the meantime, i have done a little more work to isolate the problem, finding an issue in the .r4 intermediate file:
$ cat brass/tumor_vs_normal.r4 | awk -F"\t" '$12 == 0' 3 80865824 80865850 3 80867060 80867071 5631 5 - + 80865736 0 80865736 (0) 80866775 (0) 0 (0) 0 (0) A00227:541:HM73TDSX3:2:2333:25129:13479,A00227:541:HM73TDSX3:3:2636:13286:8218,A00227:541:HM73TDSX3:4:2252:20066:33207,A00333:514:HLYMTDSX3:2:1106:31195:31657,A00333:514:HLYMTDSX3:2:1116:29740:36933,A00333:514:HLYMTDSX3:2:1409:8630:9455,A00333:514:HLYMTDSX3:2:1657:3875:9392,A00333:514:HLYMTDSX3:2:2232:2311:15452 8 1 0 0 3 80869882 80870017 3 80870790 80870983 5695 8 - + 80870254 0 80870254 (2) 80870111 (2) 0 (0) 0 (0) A00227:541:HM73TDSX3:2:2464:15980:14700,A00227:541:HM73TDSX3:3:1411:26874:24455,A00227:541:HM73TDSX3:3:2366:11333:22811,A00227:541:HM73TDSX3:3:2640:26503:3552,A00227:541:HM73TDSX3:4:1336:12292:19366,A00227:541:HM73TDSX3:4:1650:16188:34068,A00227:541:HM73TDSX3:4:1659:11234:11303,A00227:541:HM73TDSX3:4:2158:5367:6715,A00227:541:HM73TDSX3:4:2515:3821:20572,A00227:541:HM73TDSX3:4:2515:6451:20556,A00227:541:HM73TDSX3:4:2643:20383:3443,A00333:514:HLYMTDSX3:2:1506:30427:16047,A00333:514:HLYMTDSX3:2:1616:18059:12790,A00333:514:HLYMTDSX3:2:1646:27154:33364 14 2 0 0 3 80911970 80911978 3 80913732 80913783 6174 4 + - 80911681 0 80911624 (2) 80911681 (2) 0 (0) 0 (0) A00227:541:HM73TDSX3:1:2404:15302:31093,A00227:541:HM73TDSX3:3:1428:22417:22091 2 0
The 12th column of this file represents the "high_end_bkpt" and is calculated using the following lines: https://github.com/cancerit/BRASS/blob/7883d8cd3db614be2b06e32c288e7db344d1f2b0/perl/bin/get_abs_bkpts_from_clipped_reads.pl#L509-L514 . seems to me that this should not evaluate to 0, or should get lead to the variant being filtered out?
This value is passed to .r5 and .r6, and then gets added to .groups.clean.bedpe with the following lines: https://github.com/cancerit/BRASS/blob/7883d8cd3db614be2b06e32c288e7db344d1f2b0/perl/bin/collate_rg_regions.pl#L72-L73
Does this help identify the source of the problem?
— Reply to this email directly, view it on GitHubhttps://github.com/cancerit/BRASS/issues/110#issuecomment-1267522008, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEAAUCERBILLI7XLJTUGBKDWBSFJXANCNFSM5WC27X2A. You are receiving this because you commented.Message ID: @.***>
I ran brass on several samples and in one sample i got the following record:
position 0 looks pretty unusual to me and i was not able to validate the call with any of 3 other callers (delly, manta, svaba). this is causing issues for me because certain downstream tools will not handle a position of 0 (
pysam
for example).