How to configure the correct/optimal parameters and validate true positives from the PB fusion candidate list

Hi Magdoll,

I downloaded the fusion FASTA file and used the same commands and parameters as in the wiki (https://github.com/Magdoll/cDNA_Cupcake/wiki/Best-practice-for-fusion-transcript-finding):

1. minimap2 against hg38, then sort
2. fusion_finder.py
3. SQANTI (gencode.v33)
4. fusion_collate_info.py

However, in the final annotation result files (output.fusion.annotated.txt and output.fusion.annotated_ignored.txt, or output.fusion.gff), I could not find the PBfusion.142 mentioned in the wiki, with a breakpoint near chr1:6825211. Please correct me if I followed the wrong analysis procedure.

I also ran a similar procedure on my own data for several rounds and obtained several PBfusion candidates. However, we do not know whether they are just false positives (many of them map to intronic regions). I am also unsure how to choose the optimal parameters for fusion_finder.py (-c as 0.05 for per-locus coverage, -t as 0.98 for total coverage, -d as 10K for distance).

In short, I am not clear on how to configure the correct/optimal parameters, and I do not know the procedure for validating or identifying the true positives in the PB fusion candidate list.

Sorry for so many questions. Many thanks in advance.

Wenchao
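For reference, the breakpoint lookup described above can be sketched as a small script. This is only a minimal sketch, not part of cDNA_Cupcake: it assumes output.fusion.gff follows the standard tab-separated 9-column GFF/GTF layout (seqid in column 1, start and end in columns 4 and 5), and the function name `features_near` and the default `window` of 1000 bp are hypothetical choices made for illustration.

```python
import sys


def features_near(gff_lines, chrom, pos, window=1000):
    """Return (type, start, end, attributes) for every feature on `chrom`
    whose start or end lies within `window` bp of `pos`.

    Assumes standard 9-column, tab-separated GFF/GTF lines.
    """
    hits = []
    for line in gff_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature line
        seqid, ftype = cols[0], cols[2]
        start, end = int(cols[3]), int(cols[4])
        if seqid != chrom:
            continue
        # A breakpoint normally coincides with a feature boundary,
        # so compare both ends against the position of interest.
        if abs(start - pos) <= window or abs(end - pos) <= window:
            hits.append((ftype, start, end, cols[8]))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python find_breakpoint.py output.fusion.gff
    with open(sys.argv[1]) as handle:
        for hit in features_near(handle, "chr1", 6825211):
            print(*hit, sep="\t")
```

If this prints nothing even with a larger `window`, the candidate was most likely dropped before the GFF was written rather than filtered at the annotation step.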

