ukaraoz / ea-utils

Automatically exported from code.google.com/p/ea-utils

issue with gtf2bed #16

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Hi Erik,

Thank you so much for writing the handy gtf2bed script; it saved me a few
hours today (well, kinda!). I think I have stumbled across an error in
your script. Using the attached .gtf as input (from gencodeV16), your script
outputs:

chr1    23337326        23342343        ENST00000566855.1       0       -       23337393        23340540        0       3       268,163,79,     0,3049,4938,

whereas the proper output should be

chr1    23337326        23342343        ENST00000566855.1       0       -       23337393        23342266        0       3       268,163,79,     0,3049,4938,
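
As a quick sanity check on the two records above, here is a minimal Python sketch. It is not part of gtf2bed, and the function names are made up for illustration. It assumes standard BED12 semantics (0-based, half-open coordinates, with thickEnd in the eighth column) and tests whether thickEnd falls inside one of the exon blocks:

```python
def bed12_blocks(line):
    """Return absolute (start, end) pairs for the blocks of one BED12 record."""
    f = line.split()
    chrom_start = int(f[1])
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    starts = [int(x) for x in f[11].rstrip(",").split(",")]
    return [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]


def thick_end_in_block(line):
    """True if thickEnd (an exclusive end coordinate) lies within some block."""
    thick_end = int(line.split()[7])
    return any(start < thick_end <= end for start, end in bed12_blocks(line))


# The two records quoted in this report.
reported = ("chr1 23337326 23342343 ENST00000566855.1 0 - "
            "23337393 23340540 0 3 268,163,79, 0,3049,4938,")
expected = ("chr1 23337326 23342343 ENST00000566855.1 0 - "
            "23337393 23342266 0 3 268,163,79, 0,3049,4938,")

print(thick_end_in_block(reported))  # False
print(thick_end_in_block(expected))  # True
```

The reported thickEnd lands two bases past the end of the second block, in the intron, while the expected value lies inside the third block.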

I suspect this is an issue with the start codon annotation spanning a
splice site. Unfortunately, I am not a Perl aficionado and could not debug it
myself. I'm also way too lazy to set up an account to comment on the ea-utils
wiki (sorry!).

In any case, I thought I should toss you an email as a heads up.

Kind regards,

Martin Smith

Original issue reported on code.google.com by earone...@gmail.com on 1 May 2013 at 6:06


GoogleCodeExporter commented 8 years ago
It is a start codon spanning a splice site.  We should fix this soon.

Original comment by earone...@gmail.com on 28 May 2013 at 5:08

GoogleCodeExporter commented 8 years ago
There's a bug in the GTF.  The feature names should be start_codon and
stop_codon, not written with a space in the name.  Some lines of the example
also use runs of spaces instead of tabs.

Adjust the start_codon and stop_codon lines so that they encompass the codon 
properly and this will go away.
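
The two problems described above (a feature name containing a space, and spaces where tabs belong) can be caught before conversion. Below is a minimal Python sketch, not part of ea-utils. The function name `lint_gtf_line` and the feature set are assumptions for illustration, with the set modelled on the feature types GENCODE uses:

```python
# Assumed set of valid feature names, based on GENCODE GTF files.
EXPECTED_FEATURES = {
    "gene", "transcript", "exon", "CDS", "UTR",
    "start_codon", "stop_codon", "Selenocysteine",
}


def lint_gtf_line(line):
    """Return a list of problems found in one GTF line (empty if none)."""
    problems = []
    if line.startswith("#"):
        return problems  # comment/header lines are not checked
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        # Spaces used instead of tabs typically show up here.
        problems.append(
            f"expected 9 tab-separated fields, found {len(fields)}")
        return problems
    if fields[2] not in EXPECTED_FEATURES:
        problems.append(f"unexpected feature name: {fields[2]!r}")
    return problems
```

Running each line of the input through such a check before gtf2bed would flag "start codon" as an unexpected feature name and report any space-separated line as having the wrong field count.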

Of course, the program could take the CDS records into account, and that
would fix it too.  But I think it's fine to require valid start/stop codons.
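
For reference, the CDS-based alternative could look roughly like the Python sketch below. This is not the gtf2bed implementation, and `thick_bounds` is a hypothetical name. It assumes GTF coordinates are 1-based inclusive and BED coordinates are 0-based half-open, and it takes the extremes over all coding segments of a transcript, so a codon split across a splice site cannot truncate the range:

```python
def thick_bounds(segments):
    """Compute BED (thickStart, thickEnd) from GTF coding segments.

    segments: non-empty iterable of (start, end) pairs in GTF coordinates
    (1-based, inclusive), e.g. all CDS, start_codon and stop_codon records
    of one transcript. Whether stop_codon records are passed in is a
    convention choice left to the caller.
    """
    starts, ends = zip(*segments)
    # Convert the 1-based inclusive start to a 0-based start; the inclusive
    # end already equals the half-open BED end.
    return min(starts) - 1, max(ends)


# Made-up coordinates: a codon split as one base (200) plus two bases (301-302).
print(thick_bounds([(101, 200), (301, 302)]))  # (100, 302)
```

Because only the minimum start and maximum end are used, the result does not depend on strand or on the order of the records.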

See example here http://www.gencodegenes.org/gencodeformat.html

Original comment by earone...@gmail.com on 12 Aug 2013 at 1:37