Closed by vikasbansal 10 years ago
Can you give me the ID of the exon, please?
I think I gave you the BAM file for the MYH7 gene. In the output file it should be "NM_080728". I used grep "NM_080728" output.tsv.
P.S.: These two columns have the same column sum.
I don't get it. Can you show me the problem, please?
exon.start exon.end exon.index5_3 exon.count_prev_and_next exon.count_prev_and_curr exon.count_curr_and_next exon.count_curr_only exon.count_others
55589525 55589647 1/41 0 0 994 553 17
55590214 55590348 2/41 0 994 17056 25526 5
55590462 55590557 3/41 2018 15037 139 129 6
55591525 55591800 4/41 2 2156 872 44690 0
55591908 55592033 5/41 721 153 6353 14070 0
55592139 55592342 6/41 260 6814 117 15962 0
55592595 55592903 7/41 258 119 229 83111 0
55593479 55593603 8/41 258 229 925 467 0
55593749 55593914 9/41 260 923 16625 9936 0
55594071 55594254 10/41 258 16627 2024 21436 0
55594716 55594912 11/41 258 2024 45 1653 0
55597420 55597538 12/41 259 44 15 322 0
55597704 55597830 13/41 259 15 1343 291 0
55598010 55598399 14/41 259 1343 67 7318 0
55599158 55599248 15/41 259 67 63 117 63
55601003 55601148 16/41 259 63 696 2111 0
55602079 55602255 17/41 279 676 125 524 65
55602380 55602622 18/41 279 125 51 7243 65
55603388 55603643 19/41 279 51 25 713 65
55603891 55604027 20/41 279 25 25 106 65
55604298 55604421 21/41 279 25 2032 5827 65
55604562 55604679 22/41 279 2032 92 1478 65
55605297 55605384 23/41 279 92 17 154 65
55605706 55605773 24/41 279 17 16 102 65
55606058 55606367 25/41 21 274 9 168 65
55607142 55607312 26/41 21 9 3191 10416 65
55607612 55607761 27/41 21 3191 3 10441 65
55607881 55607999 28/41 21 3 11 72 65
55608092 55608230 29/41 21 11 7 1412 65
55608815 55608918 30/41 22 6 45 507 77
55609142 55609240 31/41 22 45 67 603 71
55609579 55609642 32/41 0 89 35 708 0
55609753 55609845 33/41 0 35 46 313 0
55609929 55610037 34/41 0 46 154 623 0
55610553 55610580 35/41 0 8 448 392 146
55610675 55610831 36/41 0 594 92 5519 0
55611077 55611220 37/41 0 92 31 2618 0
55611510 55611718 38/41 0 31 43 293 0
55612141 55612196 39/41 0 43 51 535 0
55612425 55612462 40/41 0 48 19 524 3
55613348 55613386 41/41 0 22 0 137 0
I ran it on the whole BAM file. Here is an example:
chr10 3134304 3134909 . 1/13 NM_172546 . NM_172546 0 0 12 20 0
chr10 3185734 3185897 . 2/13 NM_172546 . NM_172546 0 12 3 33 0
chr10 3192056 3192258 . 3/13 NM_172546 . NM_172546 0 3 4 33 0
chr10 3193590 3193677 . 4/13 NM_172546 . NM_172546 0 4 13 8 0
chr10 3203539 3203580 . 5/13 NM_172546 . NM_172546 0 13 11 0 0
chr10 3206103 3206222 . 6/13 NM_172546 . NM_172546 0 11 22 50 0
chr10 3208127 3208186 . 7/13 NM_172546 . NM_172546 0 22 18 2 0
chr10 3211425 3211493 . 8/13 NM_172546 . NM_172546 0 18 42 14 0
chr10 3211873 3212019 . 9/13 NM_172546 . NM_172546 0 42 22 17 0
chr10 3217394 3217518 . 10/13 NM_172546 . NM_172546 0 22 45 24 0
chr10 3219698 3219906 . 11/13 NM_172546 . NM_172546 0 45 3 105 0
chr10 3220357 3220446 . 12/13 NM_172546 . NM_172546 0 3 15 22 0
chr10 3225955 3227479 . 13/13 NM_172546 . NM_172546 0 15 0 376 0
chr10 3308332 3309383 . 1/6 NM_001039652 . NM_001039652 0 0 0 2 0
chr10 3323356 3323399 . 1/8 NM_001033391 . NM_001033391 0 0 0 0 0
chr10 3332435 3332500 . 2/6 NM_001039652 . NM_001039652 0 0 0 0 0
chr10 3349965 3350011 . 2/8 NM_001033391 . NM_001033391 0 0 0 0 0
chr10 3366076 3366294 . 1/9 NM_001170800 . NM_001170800 0 0 0 0 0
chr10 3366076 3366294 . 1/7 NM_001170802 . NM_001170802 0 0 0 0 0
chr10 3366076 3366294 . 1/7 NM_001170801 . NM_001170801 0 0 0 0 0
chr10 3366837 3366925 . 3/6 NM_001039652 . NM_001039652 0 0 0 0 0
chr10 3366858 3367027 . 2/9 NM_001170800 . NM_001170800 0 0 0 0 0
chr10 3366858 3367027 . 2/7 NM_001170802 . NM_001170802 0 0 0 0 0
chr10 3366858 3367027 . 2/7 NM_001170801 . NM_001170801 0 0 0 0 0
chr10 3390409 3390482 . 3/8 NM_001033391 . NM_001033391 0 0 0 0 0
chr10 3390409 3390482 . 3/9 NM_001170800 . NM_001170800 0 0 0 0 0
chr10 3390409 3390482 . 3/7 NM_001170802 . NM_001170802 0 0 0 0 0
chr10 3390409 3390482 . 3/7 NM_001170801 . NM_001170801 0 0 0 0 0
chr10 3391800 3391871 . 4/8 NM_001033391 . NM_001033391 0 0 0 0 0
chr10 3391800 3391871 . 4/9 NM_001170800 . NM_001170800 0 0 0 0 0
chr10 3391800 3391871 . 4/7 NM_001170802 . NM_001170802 0 0 0 0 0
chr10 3391800 3391871 . 4/7 NM_001170801 . NM_001170801 0 0 0 0 0
chr10 3409382 3409440 . 5/8 NM_001033391 . NM_001033391 0 0 2 0 0
chr10 3409382 3409440 . 5/9 NM_001170800 . NM_001170800 0 0 2 0 0
chr10 3409382 3409440 . 5/7 NM_001170802 . NM_001170802 0 0 2 0 0
chr10 3409382 3409440 . 5/7 NM_001170801 . NM_001170801 0 0 2 0 0
chr10 3411266 3411354 . 6/8 NM_001033391 . NM_001033391 0 2 0 0 0
chr10 3411266 3411354 . 6/9 NM_001170800 . NM_001170800 0 2 0 0 0
chr10 3411266 3411354 . 6/7 NM_001170802 . NM_001170802 0 2 0 0 0
chr10 3411266 3411366 . 6/7 NM_001170801 . NM_001170801 0 2 0 0 0
Also, the example you showed:
55594071 55594254 10/41 258 16627 2024 21436 0
55594716 55594912 11/41 258 2024 45 1653 0
55597420 55597538 12/41 259 44 15 322 0
55597704 55597830 13/41 259 15 1343 291 0
55598010 55598399 14/41 259 1343 67 7318 0
55599158 55599248 15/41 259 67 63 117 63
55601003 55601148 16/41 259 63 696 2111 0
55602079 55602255 17/41 279 676 125 524 65
55602380 55602622 18/41 279 125 51 7243 65
55603388 55603643 19/41 279 51 25 713 65
55603891 55604027 20/41 279 25 25 106 65
55604298 55604421 21/41 279 25 2032 5827 65
55604562 55604679 22/41 279 2032 92 1478 65
55605297 55605384 23/41 279 92 17 154 65
55605706 55605773 24/41 279 17 16 102 65
Notice, for example, the values 1343, 67, 63.
EDIT: It seems that "exon.count_prev_and_curr" has the same value as "exon.count_curr_and_next" in the previous row.
Isn't it normal that "exon.count_prev_and_curr" for an exon has roughly the same value as "exon.count_curr_and_next" on the previous line? Both count reads spanning the same pair of exons.
Oh yes, you are right. But then the question is: why are they different in some cases?
EDIT: Do you count a junction read if part of it lies completely within an annotated exon?
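One way to find the rows where the two columns disagree is to compare each row's "exon.count_curr_and_next" with the next row's "exon.count_prev_and_curr". The snippet below is only a sketch: it assumes the eight-column, whitespace-separated layout of the MYH7 output above, and the function names are made up for illustration.

```python
# Rows copied from the MYH7 output above (exons 10/41 to 17/41).
ROWS = """\
55594071 55594254 10/41 258 16627 2024 21436 0
55594716 55594912 11/41 258 2024 45 1653 0
55597420 55597538 12/41 259 44 15 322 0
55597704 55597830 13/41 259 15 1343 291 0
55598010 55598399 14/41 259 1343 67 7318 0
55599158 55599248 15/41 259 67 63 117 63
55601003 55601148 16/41 259 63 696 2111 0
55602079 55602255 17/41 279 676 125 524 65
"""


def parse_rows(text):
    """Parse rows in header order: start, end, index, prev_and_next,
    prev_and_curr, curr_and_next, curr_only, others."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # skip blank or malformed lines
        rows.append({
            "index": fields[2],
            "prev_and_curr": int(fields[4]),
            "curr_and_next": int(fields[5]),
        })
    return rows


def find_shift_mismatches(rows):
    """Return pairs of consecutive exons whose shared junction count differs."""
    mismatches = []
    for prev, curr in zip(rows, rows[1:]):
        if prev["curr_and_next"] != curr["prev_and_curr"]:
            mismatches.append((prev["index"], curr["index"],
                               prev["curr_and_next"], curr["prev_and_curr"]))
    return mismatches


MISMATCHES = find_shift_mismatches(parse_rows(ROWS))
for left, right, a, b in MISMATCHES:
    print(f"{left} -> {right}: curr_and_next={a}, prev_and_curr={b}")
```

On these rows it reports the pairs 11/41 -> 12/41 (45 vs 44) and 16/41 -> 17/41 (696 vs 676), which are the reads worth inspecting in a genome browser.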
No time to check this now. I suggest you convert a given region to BED using bamToBed -bed12
and open it as a custom track in the UCSC Genome Browser to view the reads.
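If you would rather inspect the BED12 output in a script than in the browser, the blocks and junctions of each read can be recovered from the standard BED12 fields (blockCount, blockSizes, blockStarts). This is a minimal sketch; the example line is hypothetical and the function names are made up for illustration.

```python
def bed12_blocks(line):
    """Return the aligned blocks of one BED12 record as (start, end) pairs.

    Coordinates are 0-based, half-open, as in the BED format.
    """
    fields = line.rstrip("\n").split("\t")
    chrom_start = int(fields[1])
    # blockSizes / blockStarts may carry a trailing comma, so drop empty items.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    return [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]


def bed12_junctions(line):
    """Return the gaps between consecutive blocks, i.e. the spliced-out spans."""
    blocks = bed12_blocks(line)
    return [(left[1], right[0]) for left, right in zip(blocks, blocks[1:])]


# Hypothetical spliced read: two 50 bp blocks separated by a 400 bp gap.
EXAMPLE = "chr1\t1000\t1500\tread1\t255\t+\t1000\t1500\t0,0,0\t2\t50,50,\t0,450,"

print(bed12_blocks(EXAMPLE))
print(bed12_junctions(EXAMPLE))
```

Comparing the block coordinates with the exon start and end columns shows whether a junction read ends inside an annotated exon or outside it.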
OK, thanks a lot. Just one question: for "exon.count_prev_and_next" (reads supporting skipping of the exon), do you consider only the immediately previous (upstream) and next (downstream) exons?
All downstream and upstream exons are considered: see https://github.com/lindenb/jvarkit/blob/6294613b8eff3419427ddff13a37a84fcffaba21/src/main/java/com/github/lindenb/jvarkit/tools/biostar/Biostar103303.java#L349 and https://github.com/lindenb/jvarkit/blob/6294613b8eff3419427ddff13a37a84fcffaba21/src/main/java/com/github/lindenb/jvarkit/tools/biostar/Biostar103303.java#L356
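To make the "all upstream and downstream exons" rule concrete, here is a simplified illustration in Python. It is not the tool's Java code: the function names, the exon coordinates, and the read are all hypothetical, and it assumes a read supports skipping when it touches any exon before and any exon after the current one without touching the current one.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end


def supports_skipping(read_blocks, exons, i):
    """Assumed rule: the read touches some exon before i and some exon
    after i (not necessarily the immediate neighbours) but not exon i."""
    touched = {
        k
        for k, (e_start, e_end) in enumerate(exons)
        for (b_start, b_end) in read_blocks
        if overlaps(b_start, b_end, e_start, e_end)
    }
    if i in touched:
        return False
    return any(k < i for k in touched) and any(k > i for k in touched)


# Hypothetical transcript with four exons, and one spliced read
# whose two blocks fall in the first and the last exon.
EXONS = [(100, 200), (300, 400), (500, 600), (700, 800)]
READ = [(150, 200), (700, 750)]

SKIPPED = [i for i in range(len(EXONS)) if supports_skipping(READ, EXONS, i)]
print(SKIPPED)  # [1, 2]
```

Under this rule a single read that jumps over several exons is counted for each exon it skips, which would explain why "exon.count_prev_and_next" often stays constant across a run of consecutive exons.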
Dear Pierre,
I have noticed that these two columns give the same values, shifted by one row. For example, "exon.count_prev_and_curr" has the values 0,12,3,4,3,11,22 and "exon.count_curr_and_next" has the values 12,3,4,3,11,22,0.
Best wishes, Vikas