GoekeLab / bambu

Reference-guided transcript discovery and quantification for long read RNA-Seq data
GNU General Public License v3.0

Low increase in discovered isoforms with RNA004 data #443

Open dietvin opened 2 months ago

dietvin commented 2 months ago

Hi, we used Bambu for transcript discovery on RNA002 and RNA004 data from the same sample. As expected, the new chemistry produced many more reads (roughly 4x the depth), but Bambu identified only about 10 novel isoforms that were not already identified with the RNA002 data. We were expecting a larger difference and wonder whether this is due to incorrect usage of Bambu or simply reflects the underlying data.

I used the default parameters to run Bambu:

se = bambu(reads = bam.paths, annotations = annotations, genome = fa.path, quant = TRUE, ncore = 16)

Any help would be appreciated. Thank you. Vincent

andredsim commented 2 months ago

Hi Vincent,

Thanks for your report. This is actually what we would expect, as we designed Bambu to be robust to changes in read depth. Although higher read depth can lead to the identification of new transcripts, it also disproportionately increases the number of false positives. If you are confident in the quality of your sample and would like to discover more transcripts, I would recommend increasing the NDR parameter, which will increase the number of discovered isoforms. Check the output of your last run for the NDR threshold it applied (it would be around 0.1). A reasonable first attempt would be to increase it by 0.1, to 0.2.

I hope this helps, Andre Sim
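
For reference, the suggested re-run could look roughly like the sketch below. It reuses `bam.paths`, `annotations` and `fa.path` from the original call and passes `NDR = 0.2` explicitly. The object name `se.ndr02` is made up for illustration, the value 0.2 is only the suggested first attempt rather than a tuned threshold, and the final comparison assumes the first result is still available as `se`.

```r
library(bambu)
library(SummarizedExperiment)

# bam.paths, annotations and fa.path are the same objects used in the
# original call. NDR = 0.2 is the suggested starting point, not a
# recommended setting for every data set.
se.ndr02 <- bambu(
  reads = bam.paths,
  annotations = annotations,
  genome = fa.path,
  NDR = 0.2,
  quant = TRUE,
  ncore = 16
)

# Each row of the returned object is one transcript (annotated or novel),
# so the row-count difference gives a rough idea of how many additional
# transcripts the higher threshold admitted.
nrow(se.ndr02) - nrow(se)
```

Because a higher NDR also admits more false positives, it is worth inspecting the additional transcripts before relying on them.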