STAR-Fusion / STAR-Fusion

STAR-Fusion codebase
BSD 3-Clause "New" or "Revised" License

How can I extract the fusion gene transcript sequence using the information in the star-fusion.fusion_candidates.preliminary file? #221

Open Caizhengwen123 opened 4 years ago

Caizhengwen123 commented 4 years ago

Hello, I want to extract the fusion gene transcript sequence using the two gene names and the breakpoint from the star-fusion.fusion_candidates.preliminary file. I tried the FusionInspector/util/fusion_pair_to_mini_genome_join.pl script for this, but it reported these errors:

Error, didn't extract required sequence from /.../ctat_genome_lib_build_dir/ref_genome.fa, NC_001526.2, -135, 3813, instead got seq of length 0 at /STAR-Fusion/FusionInspector/util/fusion_pair_to_mini_genome_join.pl line 453
.......
Error, didn't extract required sequence from /.../ctat_genome_lib_build_dir/ref_genome.fa, NC_001526.2, -917, 1559, instead got seq of length 0 at /STAR-Fusion/FusionInspector/util/fusion_pair_to_mini_genome_join.pl line 453.

I then opened fusion_pair_to_mini_genome_join.pl to look for a fix and found that the $chr variable is only ever given the value NC_001526.2 when extracting the sequence, but I don't know how to fix it. Could you spare a few moments to help me solve this problem? Thanks a lot! Zhengwen CAI
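Both error messages request a region whose start coordinate is negative (-135 and -917). One plausible reading, which the thread does not confirm, is that the region was extended past the start of a short contig. The sketch below only illustrates the general idea of clamping a requested interval to the contig bounds before extracting it; it is not the logic used in fusion_pair_to_mini_genome_join.pl, and the function names and the demo contig are hypothetical.

```python
def clamp_interval(start, end, contig_length):
    """Clamp a 1-based inclusive interval to [1, contig_length].

    Returns (start, end), or None if the interval lies entirely off the contig.
    """
    if start > end:
        start, end = end, start
    clamped_start = max(1, start)
    clamped_end = min(contig_length, end)
    if clamped_start > clamped_end:
        return None
    return clamped_start, clamped_end


def extract(sequence, start, end):
    """Return the 1-based inclusive slice of `sequence`, clamped to its bounds."""
    interval = clamp_interval(start, end, len(sequence))
    if interval is None:
        return ""
    lo, hi = interval
    return sequence[lo - 1:hi]


if __name__ == "__main__":
    # Hypothetical 8000 bp contig standing in for a short reference sequence.
    contig = "ACGT" * 2000
    # The negative start from the first error message is clamped to position 1.
    print(clamp_interval(-135, 3813, len(contig)))
    print(len(extract(contig, -135, 3813)))
```

Whether clamping is the right behaviour for a given pipeline is a separate question; silently shortening a flank may or may not be acceptable downstream.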

brianjohnhaas commented 4 years ago

Hi,

STAR-Fusion doesn't currently include that info. Leave this ticket open and I'll include a utility for it in the next release.

In the meantime, if you include the parameter --examine_coding_effect, it'll provide fusion sequences for candidate fusion transcripts involving coding exons.

--

Brian J. Haas The Broad Institute http://broadinstitute.org/~bhaas

Caizhengwen123 commented 4 years ago

Thanks for your help and I will try it !

Caizhengwen123 commented 4 years ago

Hello, I've just tried your suggestion and added the parameter --examine_coding_effect, but there are no fusion sequences for the candidates in the result files. Only a star-fusion.fusion_predictions.abridged.coding_effect.tsv file was added, and no error was reported. Could you help me find out what is wrong with my run? Thanks a lot!

brianjohnhaas commented 4 years ago

If the fusions don't involve coding exons in both cases (i.e. they instead involve UTR exons or are noncoding transcripts), then it won't try to guess a reconstructed sequence.

I'll get a script together that'll report the fusion sequences shortly. Stay tuned.

~b
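To see which predictions actually received a reconstructed sequence, the coding-effect table can be filtered for rows where the sequence column is filled in. This is a minimal sketch under stated assumptions: the column names `#FusionName` and `FUSION_CDS`, and the use of `.` as the placeholder for a missing value, are assumptions about the layout of star-fusion.fusion_predictions.abridged.coding_effect.tsv and should be checked against the header of your own file. The function name and the demo rows are hypothetical.

```python
import csv
import io


def rows_with_sequence(handle, seq_column="FUSION_CDS", name_column="#FusionName"):
    """Return (fusion name, sequence) pairs for rows that have a sequence.

    `seq_column` and `name_column` are assumed header names; adjust them
    to match the header line of the actual file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    kept = []
    for row in reader:
        seq = (row.get(seq_column) or "").strip()
        # "." is assumed to mark a missing value.
        if seq and seq != ".":
            kept.append((row.get(name_column, ""), seq))
    return kept


if __name__ == "__main__":
    # Hypothetical two-row table; real files have many more columns.
    demo = "#FusionName\tFUSION_CDS\nGENEA--GENEB\tATGGCCaagtga\nGENEC--GENED\t.\n"
    for name, seq in rows_with_sequence(io.StringIO(demo)):
        print(name, len(seq))
```

For a real run, replace the `io.StringIO(demo)` handle with `open(path, newline="")` on the coding-effect file.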

brianjohnhaas commented 4 years ago

Here's a script you can use for adding the fusion breakpoint sequence to your star-fusion report:

https://github.com/STAR-Fusion/STAR-Fusion/blob/devel/util/append_fusion_brkpt_adjacent_sequence.pl

Just drop it into your STAR-Fusion/util/ folder.

usage like so:

util/append_fusion_brkpt_adjacent_sequence.pl star-fusion.fusion_predictions.abridged.tsv $CTAT_GENOME_LIB/ 200

Caizhengwen123 commented 4 years ago

Many thanks for your help. I've now run your script and got two fusion sequences out of nine fusion genes; maybe the others involve UTR exons or noncoding transcripts. I will check these cases. Thanks!

eseffar commented 3 years ago

Hello, I used the Perl script above to get the fusion sequences and it seems to work, but I obtain two sequences for each fusion, one in uppercase and another in lowercase, like this:

FusionSeq
TGTGTTTCCTGAGACCTCCCAGCCACGCTTCCTGTACAGCCTGCAGAACTgtaatcattaccaaatgaggaggaaaggacgatgtcatcgaggctctgca
CCTTGGACCCCGCGGCGCCCCTCGGCCTCGGAGCAACGAGCGCAGCGCCGagcggttctggtgaggttggaaggatgcagccacccggttgtcatgaaag
GTGAAAGCGAAGAAGAAACACACTAACGCAGAAAAAAAGTTGGCAGACAGttgcaagaacagcaaggaaaaggaagccctctccagaaccagaaggtgaa

Is this normal?

I would like to use these sequences to validate my fusions by PCR.

Thank you in advance for your answer.

brianjohnhaas commented 3 years ago

Yes, I believe the case indicates which fusion partner the sequence was derived from on either side of the breakpoint.
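If the case change marks the breakpoint, each FusionSeq value can be split into its two halves at the uppercase-to-lowercase transition, which is convenient when designing primers on either side. The sketch below assumes every value is one uppercase run followed by one lowercase run, as in the three examples posted above; the function name is hypothetical.

```python
import re


def split_at_case_change(fusion_seq):
    """Split a sequence into (uppercase run, lowercase run).

    Raises ValueError if the sequence is not one uppercase run
    followed by one lowercase run.
    """
    match = re.fullmatch(r"([A-Z]*)([a-z]*)", fusion_seq)
    if match is None:
        raise ValueError("expected an uppercase run followed by a lowercase run")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # First FusionSeq value quoted in the thread.
    seq = ("TGTGTTTCCTGAGACCTCCCAGCCACGCTTCCTGTACAGCCTGCAGAACT"
           "gtaatcattaccaaatgaggaggaaaggacgatgtcatcgaggctctgca")
    left, right = split_at_case_change(seq)
    print("left :", left)
    print("right:", right)
    print("breakpoint after position", len(left))
```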

eseffar commented 3 years ago

Oh okay! Just to be sure I understand correctly: the first part of the sequence, in uppercase, is supposed to come from gene1, and the second part, in lowercase, from gene2?

Thank you !

brianjohnhaas commented 3 years ago

That's what I expect. Please verify this is the case, though.
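One way to carry out that check is to look for each half of the breakpoint sequence in a reference sequence of the expected partner gene, on either strand. This is only a sketch: it assumes you have already obtained the gene1 and gene2 reference sequences as plain strings by some other means, it uses exact matching (so mismatches or a half that spans an intron in a genomic reference will not be found), and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def find_strand(fragment, reference):
    """Return '+' or '-' if `fragment` occurs exactly in `reference`
    on that strand (case-insensitive), else None."""
    frag = fragment.upper()
    ref = reference.upper()
    if frag in ref:
        return "+"
    if reverse_complement(frag) in ref:
        return "-"
    return None


if __name__ == "__main__":
    # Toy sequences only; substitute the real partner gene sequences.
    gene1_ref = "TTTTACGGATCCAAAA"
    gene2_ref = "CCCCTTGCATGGGG"
    upper_half, lower_half = "ACGGATCC", "catgcaa"
    print("uppercase half vs gene1:", find_strand(upper_half, gene1_ref))
    print("lowercase half vs gene2:", find_strand(lower_half, gene2_ref))
```

If the uppercase half is found only in gene1 and the lowercase half only in gene2, that supports the interpretation above for that fusion.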
