jts / sga

de novo sequence assembler using string graphs
http://genome.cshlp.org/content/22/3/549

Fixes of minor issues that are relevant when correcting with large k-mer sizes and low coverage regions #93

Closed ktrns closed 9 years ago

ktrns commented 9 years ago

1) k-mer correction (commit: 21.8.15)

We introduced one new parameter. '-O, --count-offset' allows a base to be corrected into a new base only if the resulting k-mer has a count at least N higher than that of the old k-mer (default 1). Previously, a k-mer with count 1 could be corrected into another k-mer that also had count 1, which gains nothing.
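The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration of the condition, not the code from the commit: the function name `accept_kmer_correction` and its arguments are hypothetical, and reading "N higher" as "at least N higher" is an assumption based on the count 1 example.

```python
def accept_kmer_correction(old_count, new_count, count_offset=1):
    """Return True if a base change should be accepted.

    Hypothetical helper, not part of SGA. It assumes the rule is
    new_count >= old_count + count_offset, where count_offset
    plays the role of '-O, --count-offset'.
    """
    return new_count >= old_count + count_offset


# With the default offset, a count 1 k-mer is no longer replaced
# by another count 1 k-mer, but a count 2 k-mer is acceptable.
same_count = accept_kmer_correction(old_count=1, new_count=1)
better_count = accept_kmer_correction(old_count=1, new_count=2)
```

A larger offset makes correction more conservative, which matters in low coverage regions where both the old and the new k-mer may have very small counts.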

2) overlap correction (commit: 3.12.14 and 5.12.14)

We introduced two new parameters. '-X, --base-threshold' restricts correction to bases in a read that are seen fewer than N times (default 2). '-M, --min-count-max-base' allows a base to be corrected into a new base only if the new base is seen at least N times (default 4). Additionally, there must be only one valid base flip.
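As a rough sketch of how these conditions might combine for a single read position, consider the Python function below. It is not taken from the SGA source: the function name `choose_base`, the `pileup_counts` dictionary (base to number of times it is seen at that position), and the exact order of the checks are assumptions made for illustration.

```python
def choose_base(read_base, pileup_counts, base_threshold=2, min_count_max_base=4):
    """Return the base to use at one read position.

    Hypothetical helper, not part of SGA. pileup_counts maps each
    base to how often it is seen at this position; base_threshold and
    min_count_max_base stand in for '-X' and '-M'.
    """
    # Only bases seen fewer than base_threshold times are candidates
    # for correction; well supported bases are left alone.
    if pileup_counts.get(read_base, 0) >= base_threshold:
        return read_base

    # A replacement base is valid only if it is seen often enough.
    candidates = [
        base
        for base, count in pileup_counts.items()
        if base != read_base and count >= min_count_max_base
    ]

    # Correct only when exactly one base flip is valid.
    if len(candidates) == 1:
        return candidates[0]
    return read_base


# One weakly supported 'A' against five 'C' observations.
corrected = choose_base("A", {"A": 1, "C": 5, "G": 0, "T": 0})
```

Under these assumptions, a position with two equally well supported alternatives is left uncorrected, since neither flip can be preferred.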