nf-core / rnafusion

RNA-seq analysis pipeline for detection of gene-fusions
https://nf-co.re/rnafusion
MIT License

Use references from CTAT #438

Open rannick opened 11 months ago

rannick commented 11 months ago

Description of feature

Use references from CTAT. This would shorten reference building time and might let STAR-Fusion call a few more fusions, since it is optimised for these references.

fevac commented 10 months ago

I was wondering about this change. The clear benefit is the shorter reference building time, but are the CTAT genomes up to date? I don't see many updates coming from them. Is that a problem?

fevac commented 9 months ago

The Ensembl annotation is missing the IGH and IGL superloci, so including the following entries in the reference GTF might improve their detection with STAR-Fusion and arriba:

chr14   SuperLocus-ext  exon    105583731   106875071   .   -   .   gene_id "IGH.g@-ext"; transcript_id "IGH.t@-ext"; gene_name "IGH@-ext";
chr14   SuperLocus-ext  exon    105583731   106875071   .   +   .   gene_id "IGH-.g@-ext"; transcript_id "IGH-.t@-ext"; gene_name "IGH-@-ext";
chr22   SuperLocus-ext  exon    22030934    22923034    .   +   .   gene_id "IGL.g@-ext"; transcript_id "IGL.t@-ext"; gene_name "IGL@-ext";
chr22   SuperLocus-ext  exon    22030934    22923034    .   -   .   gene_id "IGL-.g@-ext"; transcript_id "IGL-.t@-ext"; gene_name "IGL-@-ext";
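
For anyone who wants to try this, here is a minimal sketch of how the four entries above could be appended to an existing GTF. It is not part of the pipeline: the function names and any file paths are hypothetical, and whether the reference-building steps of STAR-Fusion or arriba accept the extended file has not been checked here. GTF is tab-delimited, so the fields are joined with tabs rather than the spaces shown above.

```python
# Coordinates, strands and names copied from the entries quoted above.
# A trailing "-" in the name reproduces the second, opposite-strand entry.
SUPERLOCI = [
    ("chr14", 105583731, 106875071, "-", "IGH"),
    ("chr14", 105583731, 106875071, "+", "IGH-"),
    ("chr22", 22030934, 22923034, "+", "IGL"),
    ("chr22", 22030934, 22923034, "-", "IGL-"),
]


def superlocus_gtf_lines(loci=SUPERLOCI):
    """Return one tab-delimited, 9-column GTF line per superlocus entry."""
    lines = []
    for chrom, start, end, strand, name in loci:
        attributes = (
            f'gene_id "{name}.g@-ext"; '
            f'transcript_id "{name}.t@-ext"; '
            f'gene_name "{name}@-ext";'
        )
        fields = [chrom, "SuperLocus-ext", "exon",
                  str(start), str(end), ".", strand, ".", attributes]
        lines.append("\t".join(fields))
    return lines


def append_superloci(gtf_in, gtf_out):
    """Copy gtf_in to gtf_out line by line, then append the superlocus entries."""
    with open(gtf_in) as src, open(gtf_out, "w") as dst:
        last = "\n"
        for line in src:
            dst.write(line)
            last = line
        # Make sure the appended entries start on their own line.
        if not last.endswith("\n"):
            dst.write("\n")
        for entry in superlocus_gtf_lines():
            dst.write(entry + "\n")
```

Calling `append_superloci("annotation.gtf", "annotation.superloci.gtf")` (both paths are placeholders) leaves the original file untouched and writes the extended copy, so the two annotations can be compared in separate runs.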