zhaoming159753 / bedtools

Automatically exported from code.google.com/p/bedtools

flank tool generates intervals outside chromosome boundaries #133

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
# Here is how to reproduce the error.
# The aim is to get the 1000 bp downstream of each gene with "bedtools flank". To expose the bug, awk is used to pick out the intervals whose length is not 1000 bp.

$ flankBed -i genes5b60.bed -g zm5b60.genome -r 1000 -l 0 -s | awk '($3-$2)!=1000'
7   176764628   353529390   GRMZM2G124745   0   +
Pt  139807  280191  GRMZM5G818111   0   +
Pt  140048  280432  GRMZM5G866064   0   +
Pt  140361  280745  GRMZM5G855343   0   +

# If a gene is close to the end of a chromosome, it is normal for its flank region to be shorter than 1000 bp. However, these 4 intervals extend beyond the chromosome boundary!
# Here is the genome file (only the relevant chromosomes are shown):

$ cat zm5b60.genome
7   176764762
Pt  140384
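
# A more direct check than filtering on length is to compare each flank end against the chromosome size. The following is only a minimal sketch, assuming whitespace-separated BED and genome files like the ones shown here; out_of_bounds is a hypothetical helper name, not part of bedtools.

```shell
# Hypothetical helper (not part of bedtools): read BED records on stdin and
# print those whose end coordinate exceeds the chromosome size listed in
# the genome file given as $1 (columns: chromosome, size).
out_of_bounds() {
    awk 'NR == FNR { size[$1] = $2 + 0; next }    # first input: genome sizes
         ($1 in size) && ($3 + 0 > size[$1])      # second input: BED on stdin
        ' "$1" -
}
```

# Possible usage with the files from this report:
# $ flankBed -i genes5b60.bed -g zm5b60.genome -r 1000 -l 0 -s | out_of_bounds zm5b60.genome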

# Now, let's check the original gene intervals:
$ grep -f <(flankBed -i genes5b60.bed -g zm5b60.genome -r 1000 -l 0 -s | awk '($3-$2)!=1000' | cut -f4) genes5b60.bed
7   176761204   176764628   GRMZM2G124745   0   +
Pt  138323  139807  GRMZM5G818111   0   +
Pt  139824  140048  GRMZM5G866064   0   +
Pt  140068  140361  GRMZM5G855343   0   +

# Those genes lie within the chromosome boundaries, but the flank tool generates intervals that extend outside the chromosome.
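
# For comparison, the flank I would expect can be sketched as below. This is an illustration under my own assumptions, not the bedtools implementation: expected_right_flank is a hypothetical helper, it only handles "+"-strand records, and it simply clamps the flank end to the chromosome size.

```shell
# Hypothetical helper (not the bedtools implementation): for "+"-strand BED
# records on stdin, print the flank of up to N bp that starts at the gene
# end, clamped to the chromosome size from the genome file.
#   $1 = genome file (chromosome, size)   $2 = flank length N in bp
expected_right_flank() {
    awk -v n="$2" '
        NR == FNR { size[$1] = $2 + 0; next }     # first input: genome sizes
        $6 == "+" && ($1 in size) {
            s = $3 + 0                            # flank starts at the gene end
            e = s + n                             # unclamped flank end
            if (e > size[$1]) e = size[$1]        # clamp to the chromosome end
            if (s < e)                            # skip empty flanks
                printf "%s\t%d\t%d\t%s\t%s\t%s\n", $1, s, e, $4, $5, $6
        }' "$1" -
}
```

# For the four genes listed here, this would give flanks ending at 176764762 (chromosome 7) and 140384 (Pt). Note that each end coordinate reported by flankBed equals the gene end plus the chromosome size, e.g. 176764628 + 176764762 = 353529390 and 139807 + 140384 = 280191.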

# A similar problem can be reproduced with another BED file. Using the human genome gene intervals, two genes produced downstream regions outside the chromosome boundary. This time it happens only on chrM.

$ flankBed -i hg19_all_genes.bed -g hg19.genome -r 1000 -l 0 -s | awk '($3-$2)!=1000'
chrM    15887   32458   ENSG00000198727 1   +
chrM    15953   32524   ENSG00000210195 1   +

# The BED and genome files are attached so that you can reproduce and troubleshoot this problem.

What is the expected output? What do you see instead?
Flank regions should stay within the chromosome boundaries. Instead, some flank intervals end beyond the end of the chromosome.

What version of the product are you using? On what operating system?
Version: v2.16.2
Operating system : Linux (64bit)

Please provide any additional information below.

Original issue reported on code.google.com by alperyil...@gmail.com on 16 Aug 2012 at 4:02

Attachments:

GoogleCodeExporter commented 8 years ago
Could you try this with the latest version in the Github repository?  A few 
bugs have been fixed in flankbed since 2.16.2.

Original comment by aaronqui...@gmail.com on 16 Aug 2012 at 7:58

GoogleCodeExporter commented 8 years ago
I downloaded the GitHub version (v2.16.2-ed1e1af) and it does not contain the bug described above; the output is now as expected. Thanks.

Original comment by alperyil...@gmail.com on 17 Aug 2012 at 1:02

GoogleCodeExporter commented 8 years ago

Original comment by aaronqui...@gmail.com on 30 Oct 2012 at 6:04