circularRNA_full.txt - confusing exonCount/exonSize/exonOffsets

smahaffey commented 5 years ago

I'm merging results between samples using the circularRNA_full.txt file output by denovo. I'm trying to compare the full exon structure, not just the start and end coordinates, to ensure a perfect match between samples. It seems like the exon length column [10], which is a comma-separated list, must occasionally output numbers that themselves contain commas when they are >=1000. But that explanation doesn't make sense for row 3 below: either there are too few exon lengths, or one exon has a length of 851,000,000,000 bp. Row 4, on the other hand, is an example of exonSizes that exceed 1000 bp and don't contain commas.

Here are some example rows from the file that I can't decipher beyond that explanation. The last row, however, shows that some rows look as you would expect even with exon sizes >=1000, so I'm not sure how to interpret this. I've cut off the remaining columns to simplify the example, and I've attached a full file here: Brain.BNLx.2.full.txt

chr	start	end	name	score	strand	thickStart	thickEnd	itemRgb	exonCount	exonSizes	exonOffsets
1	2012643	2017574	circular_RNA/1	0	-	2012643	2012643	0,0,0	2	*1,501,188*	*0,4743*
1	16514096	16530121	circular_RNA/2	0	+	16514096	16514096	0,0,0	2	*6,031,121*	*0,15904*
1	29319343	29369105	circular_RNA/2	1	+	29319343	29319343	0,0,0	8	*801,081,141,261,851,000,000,000*	*0,2673,5119,6200,7675,10096,12006,49614*
1	16480949	16503517	circular_RNA/2	0	+	16480949	16480949	0,0,0	10	*170,2034,96,186,153,3267,230,3709,131,141*	*0,604,2855,6138,7632,8836,13701,15411,20947,22427*

Thank you for any help you can provide on how to interpret these values.

kepbod commented 5 years ago

I just came back from vacation, sorry for the late response. Did the spurious data appear when you opened the file in Excel or similar software? This looks like an Excel conversion problem. In the txt file, the numbers are correct. Please double-check.

1   29319343    29369105    circular_RNA/2  1   +   29319343    29319343    0,0,0   8   80,108,114,126,185,1010,169,148 0,2673,5119,6200,7675,10096,12006,49614 2   circRNA CUFF.4319.3 CUFF.4319.3 4,5,6,7,8,9,10,11   1:29295801-29319343|1:29369105-29377159
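
To illustrate the conversion problem, here is a minimal sketch of how a spreadsheet could turn the correct text field into the odd value reported above. It assumes the spreadsheet reads the comma-separated list as one large number with thousands separators and keeps only about 15 significant digits. The function name `excel_like_mangle` is hypothetical, and this is not a model of when Excel decides to convert a cell, only of what the result looks like once it does.

```python
def excel_like_mangle(field: str, sig_digits: int = 15) -> str:
    """Mimic a spreadsheet reading a comma-separated list as a single number.

    Assumption: commas are treated as thousands separators and only the
    first `sig_digits` digits are kept (the rest become zeros).
    """
    digits = field.replace(",", "")
    if not digits.isdigit():
        return field  # leave anything non-numeric untouched
    kept = digits[:sig_digits]
    padded = kept + "0" * (len(digits) - len(kept))
    # Re-insert thousands separators, as a number-formatted cell would.
    return f"{int(padded):,}"


correct = "80,108,114,126,185,1010,169,148"
print(excel_like_mangle(correct))
# 801,081,141,261,851,000,000,000
```

The output matches the exonSizes value in the third example row, which is consistent with the file itself being correct and only the spreadsheet view being altered.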

smahaffey commented 5 years ago

I'm so sorry, yes, you are right. It looks correct as text. I was just trying to get it into columns to make it easier to look at while writing a script to parse and merge files across samples, but something happened when I opened it in Excel. Thank you for your reply. I'm sorry for missing that and bothering you with it.
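
For anyone writing a similar merge script, here is a minimal sketch of reading a row as plain text and turning it into an exon structure that can be compared across samples. It assumes the file is tab-separated with the column order shown in the header above (0-based: chr 0, start 1, end 2, strand 5, exonCount 9, exonSizes 10, exonOffsets 11). The names `ExonStructure` and `parse_exon_structure` are hypothetical. The check that the last exon ends at `end` holds for the rows shown in this thread and is treated here as an assumption about the format.

```python
from typing import NamedTuple, Tuple


class ExonStructure(NamedTuple):
    chrom: str
    strand: str
    exons: Tuple[Tuple[int, int], ...]  # absolute (start, end) of each exon


def parse_exon_structure(line: str) -> ExonStructure:
    """Parse one tab-separated row into a hashable exon structure."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end, strand = fields[0], int(fields[1]), int(fields[2]), fields[5]
    count = int(fields[9])
    sizes = [int(s) for s in fields[10].split(",") if s]
    offsets = [int(o) for o in fields[11].split(",") if o]
    if not (len(sizes) == len(offsets) == count):
        raise ValueError(f"exonCount {count} does not match sizes/offsets: {line!r}")
    exons = tuple((start + off, start + off + size)
                  for size, off in zip(sizes, offsets))
    # Assumed sanity check: the last exon should end exactly at `end`.
    if exons[-1][1] != end:
        raise ValueError(f"exons do not span start..end: {line!r}")
    return ExonStructure(chrom, strand, exons)


row = "\t".join([
    "1", "29319343", "29369105", "circular_RNA/2", "1", "+",
    "29319343", "29319343", "0,0,0", "8",
    "80,108,114,126,185,1010,169,148",
    "0,2673,5119,6200,7675,10096,12006,49614",
])
structure = parse_exon_structure(row)
print(len(structure.exons), structure.exons[0], structure.exons[-1])
```

Because `ExonStructure` is a tuple, structures from different samples can be put in sets or used as dictionary keys, so an exact match requires every exon boundary to agree. A row whose exonSizes were altered by a spreadsheet would fail one of the two checks instead of being merged silently.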

YangLab / CIRCexplorer2

circularRNA_full.txt - confusing exonCount/exonSize/exonOffsets #33