Dear DaPars2 developers,

Because of the shape of long-read 3'UTR coverage profiles (see image below), DaPars2's breakpoint detection is not accurate on this kind of data: it assumes uniform coverage before and after the breakpoint, whereas long-read coverage decreases along a slope.
For the COL6A1 example below, DaPars2 finds:

| Gene | fit_value | Predicted_Proximal_APA | Loci | Red | Green |
| --- | --- | --- | --- | --- | --- |
| ENST00000361866.8\|COL6A1\|chr21\|+ | 1298.1 | 46003542 | chr21:46003391-46005048 | 1.00 | 1.00 |

The correct answer should instead be something like:

| Gene | fit_value | Predicted_Proximal_APA | Loci | Red | Green |
| --- | --- | --- | --- | --- | --- |
| ENST00000361866.8\|COL6A1\|chr21\|+ | XXXX | 46004100 | chr21:46003391-46005048 | 0.90 | 0.70 |

So it finds an incorrect breakpoint, which leads to incorrect PDUI values.
DaPars2 does not claim to work with long reads, but I thought it would be nice to have a version that does.
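
To illustrate the idea, here is a minimal sketch on synthetic data. It is not DaPars2's code: `find_breakpoint`, `rss_constant`, and `rss_linear` are hypothetical names, the coverage values are made up, and the flat two-level fit is only a simplified stand-in for the uniform-coverage assumption described above. The sketch scans every candidate breakpoint and keeps the one with the smallest residual sum of squares, first with a flat level on each side and then with a straight line on each side.

```python
import numpy as np


def rss_constant(y):
    """Residual sum of squares when the segment is modelled as one flat level."""
    return float(np.sum((y - y.mean()) ** 2))


def rss_linear(y):
    """Residual sum of squares when the segment is modelled as a straight line."""
    x = np.arange(y.size)
    slope, intercept = np.polyfit(x, y, 1)
    return float(np.sum((y - (slope * x + intercept)) ** 2))


def find_breakpoint(coverage, segment_rss, min_len=5):
    """Return the index b that minimises segment_rss(cov[:b]) + segment_rss(cov[b:])."""
    coverage = np.asarray(coverage, dtype=float)
    candidates = range(min_len, coverage.size - min_len + 1)
    return min(
        candidates,
        key=lambda b: segment_rss(coverage[:b]) + segment_rss(coverage[b:]),
    )


# Synthetic 200 bp profile (invented numbers): coverage slopes down from 100 to 40,
# drops to 30 at position 150 (the true breakpoint), then slopes down to 25.
coverage = np.concatenate([np.linspace(100, 40, 150), np.linspace(30, 25, 50)])

step_bp = find_breakpoint(coverage, rss_constant)  # flat level on each side
slope_bp = find_breakpoint(coverage, rss_linear)   # sloped line on each side
print("flat model:", step_bp, "sloped model:", slope_bp)
```

On this toy profile the sloped model recovers position 150, while the flat model is pulled well upstream into the middle of the first slope. A real implementation would presumably also need to handle noise, convert array indices back to genomic coordinates, and recompute PDUI from the sloped fit.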