sageserpent-open closed 5 months ago
Also, it's worth describing some of the blind alleys taken - the genetic algorithm, token-based merging.
Also why Rabin fingerprinting has been replaced by simpler rolling hashing.
Three-way partitioning deserves its obituary too.
Not doing two scans for matches should be explained.
The attempts at dealing with matches for single elements, digraphs and trigraphs should also be mentioned.
Breaking down sections into one-token sections when filling gaps.
Tweaking the metric used by `LongestCommonSubsequence`.
Allowing ambiguity in `LongestCommonSubsequence` to look for alternative, better three-way merges.
Done!
What it says on the tin.
Describe: