maickrau / GraphAligner

MIT License

Interpretation of log output on aligned base pairs #35

Closed by tobiasmarschall 3 years ago

tobiasmarschall commented 3 years ago

Hi Mikko,

we were unsure how to interpret the log output with respect to the number of bp aligned:

Input reads: 1689007 (43919966320bp)
...
Alignments: 6934171 (42899288374bp) (39634694 additional alignments discarded)

Can the alignments overlap in the space of individual reads? That is, is it valid to divide the bp aligned by the bp in input reads to get the fraction of base pairs in the input that are involved in alignments? (Sorry if I just missed that, but a hint about this in the output might be useful for users.)

Cheers, Tobias

maickrau commented 3 years ago

Alignments of the same read can overlap. In that case, dividing the aligned bp by the input bp overestimates the fraction of read bases covered. The alignments usually shouldn't overlap by much unless "--all-alignments" is set, so it will be only a slight overestimate.
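
To make the overestimate concrete, here is a minimal Python sketch (not GraphAligner code) that compares the naive ratio with a version that counts each read base only once. The function names and the `(read_name, start, end)` tuples are assumptions for illustration; the tuples are assumed to have been parsed from the alignment output beforehand, with half-open coordinates on the read.

```python
from collections import defaultdict


def merged_length(intervals):
    """Length covered by half-open [start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def aligned_fractions(alignments, read_lengths):
    """Return (naive, non_overlapping) fractions of input bp that are aligned.

    alignments:   iterable of (read_name, start, end) in read coordinates
    read_lengths: dict mapping read_name -> read length in bp (all input reads)
    """
    per_read = defaultdict(list)
    for read, start, end in alignments:
        per_read[read].append((start, end))

    input_bp = sum(read_lengths.values())
    # Naive: sum of alignment lengths, overlapping bases counted repeatedly.
    naive_bp = sum(e - s for ivs in per_read.values() for s, e in ivs)
    # Corrected: merge the intervals of each read before summing.
    unique_bp = sum(merged_length(ivs) for ivs in per_read.values())
    return naive_bp / input_bp, unique_bp / input_bp


# Toy data: two alignments of r1 overlap by 100 bp.
read_lengths = {"r1": 1000, "r2": 500}
alignments = [("r1", 0, 600), ("r1", 500, 1000), ("r2", 0, 250)]
naive, corrected = aligned_fractions(alignments, read_lengths)
print(f"naive: {naive:.3f}, non-overlapping: {corrected:.3f}")
```

On the toy data the naive ratio is 1350/1500 = 0.9, while the merged ratio is 1250/1500, about 0.833. For the log lines quoted above, the naive ratio is 42899288374 / 43919966320, roughly 0.977, which is therefore an upper bound on the fraction of input bases covered by at least one alignment.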

tobiasmarschall commented 3 years ago

Ok, thanks for clarifying this.

ekg commented 3 years ago

The rs-peanut tool used in pgge can correct for this effect.
