pipasK commented 6 years ago

I am running STARChip fusion detection. With default parameters the run completes successfully but no fusions are detected, and I get the message: "No fusions discovered. Consider lowering read requirements to increase sensitivity." Could you let me know the best way to tweak the "read parameters" to improve sensitivity? Because some of my samples have a significant level of CNVs, I adjusted the parameters to splitReads = highsensitivity, wiggleroom = 250, samechrom_wiggle = 0, cnvwiggle = 200, but I still get no fusions (not all CNVs will result in fusions). If, however, I also set lopsidedupper = 100 and lopsidedlower = 0.1 (Read Distribution upper limit: 100 X), fusions (n=4) are identified. Does this make sense?

Any comments are appreciated!

RUN

/data2/syspharm/projects/FUSION/STARChip/starchip/starchip-fusions.pl Fusion /data2/syspharm/projects/RNAseq/150512_D00190_0260_BC76WUACXX/primary/STAR_cluster/star_output/2042A/2042A_Chimeric.out.junction /data2/syspharm/projects/FUSION/STARChip/STARout/star_output/hg38.params.txt

PARAMETERS

splitReads = highsensitivity
Paired-End: TRUE
Split Reads Cutoff: 5
Unique Support Values Min: 2
Spanning Reads Cutoff: 1
Location Wiggle Room (spanning reads): 250 bp
Location Wiggle Room (split reads): 5 bp
Min-distance: 0 bp
Read Distribution upper limit: 100 X
Read Distribution lower limit: 0.1 X

kippakers commented 6 years ago

Hi PipasK,

Thanks for using STARChip! Based on the read depth and your choice of highsensitivity, STARChip has selected 5 reads as the cutoff for filtering fusions. You can lower that to 2-4, or lower the unique support values minimum to 1.

It looks like some fusions are being filtered for having strand imbalance, which is why you got 4 fusions when you raised lopsidedupper to 100. This is fine, but these should be inspected by hand. Additionally, lopsidedupper and lopsidedlower should be X and 1/X, so change lopsidedlower to 0.01. I should also note that if your data is stranded, you should turn these filters off completely by setting lopsidedupper = 10000000 and lopsidedlower = 0.
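To make the X and 1/X relationship concrete, here is a minimal Python sketch. It is not STARChip's code: the function names are hypothetical, and the assumption that the filter keeps a fusion when some read-balance ratio lies between the two limits is only an illustration of the idea.

```python
def lopsided_bounds(x):
    """Return (lopsidedupper, lopsidedlower) as X and 1/X."""
    if x <= 0:
        raise ValueError("X must be positive")
    return float(x), 1.0 / x


def within_read_distribution(ratio, upper, lower):
    """Assumed filter: keep a fusion only if its read-balance ratio
    lies between the lower and upper limits (inclusive)."""
    return lower <= ratio <= upper


upper, lower = lopsided_bounds(100)
print(upper, lower)
print(within_read_distribution(50, upper, lower))     # inside the limits
print(within_read_distribution(0.005, upper, lower))  # below 1/X
```

With X = 100 the lower limit comes out as 0.01 rather than 0.1, which is the correction suggested above.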

Overall, increasing sensitivity is always going to cost you a higher risk of false positives. Just be sure to look at these fusions with extra scrutiny. Good luck!

pipasK commented 6 years ago

Hi Nicholas, Thank you for your input! I will rerun STARChip fusion with the suggested parameters. Also, for samples with a high level of CNVs, is it fine to decrease wiggleroom, samechrom_wiggle and cnvwiggle? Thank you.

kippakers commented 6 years ago

You probably know a lot more about CNVs than I do, so I'll leave judgement up to you. But the value for "wiggle" is just how close a fusion has to be to another fusion for the two to be merged when counting split and paired-end reads. For paired-end reads, this comes into play because the ends of a read pair might not actually cover the fusion site. So STAR will guess the location, but obviously it's not a precise guess. Making the value smaller will decrease your sensitivity, but it probably won't affect much if you're only requiring 1 spanning read.
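As a rough illustration of what a wiggle value does, the Python sketch below merges nearby sites and sums their read support. This is an assumption-laden toy, not STARChip's algorithm: it works on one coordinate only, chains sites that are within the wiggle of the previous site, and the positions and counts are made up.

```python
def merge_nearby(sites, wiggle):
    """Merge (position, read_count) sites that lie within `wiggle` bp
    of the previous site; return (first_position, total_reads) per group."""
    groups = []  # each entry: (first_position, last_position, total_reads)
    for pos, count in sorted(sites):
        if groups and pos - groups[-1][1] <= wiggle:
            first, _, total = groups[-1]
            groups[-1] = (first, pos, total + count)
        else:
            groups.append((pos, pos, count))
    return [(first, total) for first, _, total in groups]


# Made-up sites: (position, supporting reads)
sites = [(1000, 1), (1100, 2), (1400, 1), (5000, 3)]
print(merge_nearby(sites, 250))
print(merge_nearby(sites, 50))
```

A smaller wiggle leaves the support spread over more separate sites, so fewer of them reach a given read cutoff.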

samechrom_wiggle has to do with read-through transcripts. Often, you'll see a read that splices forward 5 or 10 kb, and that is more likely to be a splicing error than a chromosomal rearrangement. This value is the minimum distance between the two fusion sites of a pair on the same chromosome. I would think that with lots of CNVs you may want to increase this value, to reduce the likelihood of calling a long stretch of CNVs a fusion.
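The minimum-distance idea can be sketched in a few lines of Python. The function name and the example coordinates are hypothetical, and whether the real comparison is inclusive is an assumption.

```python
def far_enough_apart(chrom1, pos1, chrom2, pos2, samechrom_wiggle):
    """Keep pairs on different chromosomes; on the same chromosome,
    require the two sites to be at least samechrom_wiggle bp apart."""
    if chrom1 != chrom2:
        return True
    return abs(pos1 - pos2) >= samechrom_wiggle


# A made-up read-through-like pair 8 kb apart on the same chromosome
print(far_enough_apart("chr1", 100_000, "chr1", 108_000, 20_000))
# The same pair with samechrom_wiggle = 0, as in the run above
print(far_enough_apart("chr1", 100_000, "chr1", 108_000, 0))
```

With samechrom_wiggle = 0 no same-chromosome pair is rejected on distance, so short read-through events are not filtered by this check.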

cnvwiggle works this way: STARChip excludes fusions that sit in known CNV regions. It does this by taking an index of known CNVs, then comparing those locations to the locations of the called fusions. If a fusion is within cnvwiggle base pairs of a known CNV, it's excluded. Lowering this value may result in more CNVs being called as fusions.
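A minimal Python sketch of that comparison, assuming the known CNVs are available as (chromosome, start, end) intervals. The function name, the interval format and the coordinates are made up for illustration; STARChip's own index and lookup may differ.

```python
def near_known_cnv(chrom, pos, cnvs, cnvwiggle):
    """Return True if pos falls inside a known CNV on the same
    chromosome, or within cnvwiggle bp of either edge."""
    for cnv_chrom, start, end in cnvs:
        if cnv_chrom == chrom and start - cnvwiggle <= pos <= end + cnvwiggle:
            return True
    return False


# Made-up CNV index
known_cnvs = [("chr7", 55_000_000, 55_200_000)]

print(near_known_cnv("chr7", 55_200_150, known_cnvs, 200))  # 150 bp past the edge
print(near_known_cnv("chr7", 55_200_150, known_cnvs, 100))  # outside a smaller wiggle
```

The second call shows the effect described above: with a smaller cnvwiggle the same site is no longer excluded.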

LosicLab / starchip

STAR chip-fusion: no fusions are detected #18
