HudsonAlpha / fmlrc2

Apache License 2.0

Shortened read length after running FMLRC2 #32

Closed juheon closed 9 months ago

juheon commented 10 months ago

Hello,

I am using FMLRC2 with a de Bruijn graph built from RNA-seq data to correct PacBio Iso-Seq reads and assess the impact of correction. I found that a small fraction of the Iso-Seq reads are shortened by more than 500 bp. Is there a way to turn off this significant shortening behavior in FMLRC2?

holtjma commented 10 months ago

Hello,

The short answer is no, there's no way to restrict the corrected sequence by length.

While FMLRC will probably work for certain RNA-seq sets, we did not design it with RNA in mind. I would guess that there are isoforms present in the PacBio data that are not in the RNA-seq data, and that FMLRC is "correcting" those isoforms out of your dataset. In this regard, FMLRC is doing what it's supposed to do, but it's not well suited for this application.
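Since FMLRC2 itself has no length restriction, one possible workaround is a post-processing step outside the tool: compare each corrected read to its original and fall back to the original when too many bases were lost. The following is only a rough sketch, not part of FMLRC2. It assumes both read sets are available as FASTA and that corrected reads keep the same read IDs as the originals; `parse_fasta`, `restore_shortened`, and the `max_loss` parameter are hypothetical names introduced for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def restore_shortened(
    original: Dict[str, str],
    corrected: Dict[str, str],
    max_loss: int = 500,
) -> Tuple[Dict[str, str], List[str]]:
    """Keep corrected reads unless they lost more than max_loss bases.

    Assumption: read IDs match between the two inputs. Reads missing from
    `original` are kept in their corrected form.
    Returns (merged reads, IDs of reads that were restored).
    """
    merged: Dict[str, str] = {}
    flagged: List[str] = []
    for read_id, corr_seq in corrected.items():
        orig_seq = original.get(read_id)
        if orig_seq is not None and len(orig_seq) - len(corr_seq) > max_loss:
            merged[read_id] = orig_seq  # fall back to the uncorrected read
            flagged.append(read_id)
        else:
            merged[read_id] = corr_seq
    return merged, flagged
```

The inputs could be built with something like `dict(parse_fasta(open(path)))` for each file. Note that the result is a mixture of corrected and uncorrected reads, and the list of restored IDs may be worth inspecting on its own, since those reads are the likely candidates for isoforms missing from the RNA-seq data.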

As an aside, I checked with my PacBio colleagues who work more with Iso-Seq. If you're using HiFi reads, read correction is not currently recommended, as the impact of removing true isoforms is far more deleterious than any base-level correction you might gain.

Matt

holtjma commented 9 months ago

Closing due to inactivity.