XueyiDong / LongReadBenchmark

Benchmarking long-read RNA-seq analysis tools
MIT License

questions about precision and recall #1

Closed: defendant602 closed this issue 11 months ago

defendant602 commented 11 months ago

Hi Xueyi,

Nice research work. I have a small question about the precision and recall of isoform detection. How did you calculate the precision and recall of the different tools? If a detected isoform differs from the ground-truth transcript by only a few base pairs at the exon-intron boundaries, is it still counted as a true positive?

XueyiDong commented 11 months ago

Hi @defendant602,

Thanks for reading our paper and thanks for your question!

Most isoform detection tools keep the original annotation names for the known genes and isoforms they report. For every isoform detection tool we tested except Cupcake, we counted detected isoforms labelled with a known isoform name as true positives and all remaining detected isoforms as false positives. For Cupcake, we used the SQANTI classification output instead. See this script for details:

https://github.com/XueyiDong/LongReadBenchmark/blob/834623bf32a7441659a233084b76e3b5d5468695/ONT/isoform_detection/analysis/sequins_analysis.R#L62-L77
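For readers who want the idea without opening the R script, here is a minimal Python sketch of the name-based rule described above. It is not the code from the repository: the function name `precision_recall`, the transcript IDs, and the recall definition (true positives divided by the number of annotated isoforms) are all assumptions made for illustration. Whether an isoform with slightly shifted boundaries still receives a known name depends on each tool's own matching, which this sketch does not model.

```python
def precision_recall(detected_ids, known_ids):
    """Classify detected isoforms by name and compute precision and recall.

    Assumption: a detected isoform is a true positive if its ID matches an
    annotated (known) isoform ID; every other detected isoform is a false
    positive. Known isoforms that were never detected are false negatives.
    """
    detected = set(detected_ids)
    known = set(known_ids)

    tp = len(detected & known)   # detected and carrying a known name
    fp = len(detected - known)   # detected but not named after a known isoform
    fn = len(known - detected)   # annotated but not detected

    precision = tp / (tp + fp) if detected else 0.0
    recall = tp / (tp + fn) if known else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Hypothetical IDs, made up for this example only.
known = {"known_tx_A", "known_tx_B", "known_tx_C", "known_tx_D"}
detected = ["known_tx_A", "known_tx_C", "novel_tx_1"]

result = precision_recall(detected, known)
print(result)
```

In this toy input, two of the three detected isoforms carry known names, so the precision is 2/3, and two of the four annotated isoforms are recovered, so the recall is 0.5.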

Best, Xueyi

defendant602 commented 11 months ago

Thanks for your quick reply, Xueyi. Understood. Closing this issue.