STAR-Fusion / STAR-Fusion

STAR-Fusion codebase
BSD 3-Clause "New" or "Revised" License

Query about transcripts output in the coding effect results file #222

Closed pwanj closed 4 years ago

pwanj commented 4 years ago

Hello,

I have a fusion involving a gene that has multiple transcripts. I used the '--examine_coding_effect' parameter to get more information about the detected fusion. I see that the output lists only one transcript per gene.

Example from the STAR-Fusion wiki page:

FusionName BCR--ABL1

... CDS_LEFT_ID ENST00000305877.8 CDS_LEFT_RANGE 1-2782 CDS_RIGHT_ID ENST00000318560.5 CDS_RIGHT_RANGE 80-3393

What is the significance of the transcripts reported in the results file? Does it mean that only these transcripts are part of the putative gene fusion, and that we should ignore the other transcripts of each gene?

Please let me know if I am misinterpreting the results file. I would appreciate your help.

Thanks!

brianjohnhaas commented 4 years ago

Hi,

When examining the coding effect, it simply tries to find a pair of isoforms that is consistent with the breakpoint. It tries all candidate isoform pairs and prioritizes those that yield an in-frame coding region. There's no guarantee that the reported pair represents the true fusion isoforms; it is just a candidate model based on the fusion breakpoint that was identified. To get a better view of which fusion isoforms are best supported by the RNA-seq data, you could run STAR-Fusion with the --FusionInspector validate mode and also enable Trinity reconstruction to reconstruct fusion transcripts de novo from the RNA-seq reads.
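
To make the "try all isoform pairs, prefer in-frame ones" idea concrete, here is a minimal Python sketch. It is not STAR-Fusion's actual code: the class names, the `is_in_frame` and `rank_isoform_pairs` helpers, and the two `*_ALT_HYPOTHETICAL` transcripts are made up for illustration. It also assumes that CDS_LEFT_RANGE and CDS_RIGHT_RANGE are 1-based coordinates within each transcript's coding sequence.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class LeftCandidate:
    transcript_id: str
    cds_end: int  # last CDS base (1-based) kept from the 5' partner


@dataclass(frozen=True)
class RightCandidate:
    transcript_id: str
    cds_start: int  # first CDS base (1-based) kept from the 3' partner


def is_in_frame(left: LeftCandidate, right: RightCandidate) -> bool:
    """True if the 3' partner resumes in the codon phase where the 5' partner stops."""
    # The left side contributes cds_end coding bases; the right side has
    # cds_start - 1 coding bases before the junction in its own transcript.
    return left.cds_end % 3 == (right.cds_start - 1) % 3


def rank_isoform_pairs(lefts, rights):
    """Enumerate every isoform pair and list the in-frame pairs first."""
    pairs = [(l, r, is_in_frame(l, r)) for l, r in product(lefts, rights)]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(pairs, key=lambda pair: not pair[2])


# BCR--ABL1 values from the wiki example, plus two hypothetical alternatives.
lefts = [
    LeftCandidate("ENST00000305877.8", 2782),
    LeftCandidate("LEFT_ALT_HYPOTHETICAL", 2783),
]
rights = [
    RightCandidate("ENST00000318560.5", 80),
    RightCandidate("RIGHT_ALT_HYPOTHETICAL", 81),
]

ranked = rank_isoform_pairs(lefts, rights)
best = ranked[0]

for left, right, in_frame in ranked:
    label = "in-frame" if in_frame else "frameshift"
    print(left.transcript_id, right.transcript_id, label)
```

Note that more than one pair can come out in-frame in this sketch, which is why the reported pair should be read as one plausible model of the fusion transcript and not as evidence against the gene's other isoforms.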

--

Brian J. Haas The Broad Institute http://broadinstitute.org/~bhaas http://broad.mit.edu/~bhaas

pwanj commented 4 years ago

I see. Thank you for explaining this to me. I appreciate it.

Thanks!