arq5x / bedtools2

bedtools - the swiss army knife for genome arithmetic
MIT License

bedtools intersect #122

Closed juliafe closed 9 years ago

juliafe commented 10 years ago

bedtools intersect gives me different results when I use VCF files instead of BED files, even though the two sets of files contain identical records. Why is that? Is this a bug?

nkindlon commented 10 years ago

Hi Julia, this is Neil Kindlon in the Quinlan lab. I'm the developer who worked on recent versions of intersect. Can you please paste an example of the differences you're seeing, the command line you are using, and the version you're running? Also, can you send the data files you're using to me at nek3d@virginia.edu? Thanks!

juliafe commented 10 years ago

Hi Neil,

Thank you for your quick response.

Here are the command lines:

/bedtools-2.21.0/bin/bedtools intersect -a vcfFileA.vcf -b vcfFileB.vcf -f 0.8 -r -wao > overlapVCF.txt

/bedtools-2.21.0/bin/bedtools intersect -a bedFileA.bed -b bedFileB.bed -f 0.8 -r -wao > overlapBED.txt

As for the differences, here is an example:

BED file comparison:

chr1_scaffold_2 40168 40423 DEL00000012 chr1_scaffold_2 40119 40389 DEL00000010 221

VCF file comparison:

chr1_scaffold_2 40168 DEL00000012 N . LowQual IMPRECISE;CIEND=-1,1;CIPOS=-1,1;SVTYPE=DEL;SVMETHOD=EMBL.DELLYv0.5.5;CHR2=chr1_scaffold_2;END=40423;SVLEN=255;CT=3to5;PE=2;MAPQ=60 GT:GL:GQ:FT:RC:DR:DV:RR:RV 1/1:-24,-1.20412,0:12:LowQual:16:0:4:0:0 . -1 . . . -1 . . . . 0
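For reference, the 221 in the last column of the BED result, and the effect of `-f 0.8 -r`, can be checked by hand. The following is a minimal sketch in Python, not bedtools code. The function name `reciprocal_overlap` is made up for illustration, and the sketch assumes both intervals sit on the same chromosome and use half-open coordinates.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end, min_frac):
    """Return (overlap_bp, passes) for two half-open intervals.

    passes is True only if the overlap covers at least min_frac of
    BOTH intervals, which is the idea behind combining -f with -r.
    """
    overlap = max(0, min(a_end, b_end) - max(a_start, b_start))
    a_len = a_end - a_start
    b_len = b_end - b_start
    passes = (
        overlap > 0
        and overlap >= min_frac * a_len
        and overlap >= min_frac * b_len
    )
    return overlap, passes


# Coordinates taken from the BED comparison line in this comment.
overlap, passes = reciprocal_overlap(40168, 40423, 40119, 40389, 0.8)
print(overlap, passes)  # 221 True
```

The overlap is 221 bp, which is 221/255 (about 0.87) of DEL00000012 and 221/270 (about 0.82) of DEL00000010, so both sides clear the 0.8 threshold. The VCF run reports 0 for the same pair of deletions.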

Thank you!

Julia

arq5x commented 10 years ago

Hi Julia and Neil,

This is the result of a long-standing limitation in the way bedtools interprets structural variants in a VCF file. The end coordinate for SVs in a VCF is not computed properly, so intersections with other files can be off (for SVs only). We need to fix this ASAP. Sorry for the problem.
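One way to sidestep the problem is to convert the SV records to BED yourself, taking the end coordinate from the `END` tag in the INFO column, and then intersect the BED files. A rough sketch in Python follows. It is not what bedtools does internally, the function name `sv_vcf_line_to_bed` is hypothetical, and the sample record is a shortened version of the one posted above with an assumed ALT value of `<DEL>`.

```python
def sv_vcf_line_to_bed(line):
    """Convert one tab-delimited VCF data line to (chrom, start, end, name).

    Assumption: when INFO carries END=, that value is the last base of
    the variant (1-based, inclusive), so it can be used directly as the
    half-open BED end. Without END, fall back to the length of REF.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, name, ref = fields[0], int(fields[1]), fields[2], fields[3]

    # Parse INFO into a dict; flag entries such as IMPRECISE get "".
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value

    start = pos - 1  # VCF POS is 1-based, BED start is 0-based
    end = int(info["END"]) if "END" in info else start + len(ref)
    return chrom, start, end, name


# Shortened, illustrative record; ALT "<DEL>" is an assumption.
record = "\t".join([
    "chr1_scaffold_2", "40168", "DEL00000012", "N", "<DEL>", ".",
    "LowQual", "IMPRECISE;SVTYPE=DEL;END=40423;SVLEN=255",
])
print("\t".join(str(x) for x in sv_vcf_line_to_bed(record)))
```

Note that this uses the usual POS - 1 conversion, so the start comes out as 40167, whereas the BED line posted above uses 40168. Whichever convention you pick, apply it to both files.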

juliafe commented 10 years ago

Hi Aaron,

Thank you very much for the response and the explanation.

Julia

arq5x commented 9 years ago

Fixed in commit 0b8612d36da.