Tracking Issue: Remove index/count confusion.

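When diffs report deleted or equivalent characters, they report a span length as a count of characters; unfortunately, there is no common definition of "character". For example, the emoji 😀 is one Unicode code point but two UTF-16 code units, so the same span can have different lengths in different libraries.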

diff-match-patch could resolve this in a backwards-compatible way by adding a preamble to its patches that indicates which definition is in use, through the use of semantically empty diff groups.

For example, a leading group of zero length or an empty insert operation should have no impact on the diffed files, so it may be used to communicate very small amounts of information.

Consider:

EQUAL(0), EQUAL(0), ...rest of diff: indices/counts represent UTF-16 code units.
EQUAL(0), ...rest of diff: indices/counts represent Unicode code points.
...rest of diff (no preamble): indices/counts represent whatever they did before in their respective libraries.
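
As a rough sketch of how a reader and writer of such a preamble might look: the code below assumes diffs are plain `[op, text]` tuples with `0` as the equality op (as in the JavaScript diff-match-patch), and treats `EQUAL(0)` as an equality group with empty text. The helper names `addUnitsPreamble` and `readUnitsPreamble` and the mode names `'utf16'` and `'legacy'` are hypothetical, not part of any existing API.

```javascript
const DIFF_EQUAL = 0; // equality op value used by diff-match-patch

// Assumed mapping from unit mode to number of leading empty EQUAL groups.
const MARKER_COUNTS = new Map([
  ['utf16', 2],   // EQUAL(0), EQUAL(0), ...
  ['unicode', 1], // EQUAL(0), ...
  ['legacy', 0],  // no preamble: library-specific meaning
]);

// Hypothetical helper: prepend the marker groups for the given unit mode.
function addUnitsPreamble(diffs, units) {
  if (!MARKER_COUNTS.has(units)) {
    throw new RangeError(`unknown units: ${units}`);
  }
  const preamble = Array.from(
    { length: MARKER_COUNTS.get(units) },
    () => [DIFF_EQUAL, '']
  );
  return [...preamble, ...diffs];
}

// Hypothetical helper: count up to two leading empty EQUAL groups,
// report the mode they signal, and return the diff without them.
function readUnitsPreamble(diffs) {
  let count = 0;
  while (
    count < 2 &&
    count < diffs.length &&
    diffs[count][0] === DIFF_EQUAL &&
    diffs[count][1] === ''
  ) {
    count++;
  }
  const units = ['legacy', 'unicode', 'utf16'][count];
  return { units, diffs: diffs.slice(count) };
}
```

A consumer that does not know about the preamble sees only empty equalities, which leave the patched text unchanged; a consumer that does know about it can call `readUnitsPreamble` first and interpret the remaining counts accordingly.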

A new parameter to the diffing functions can set a mode so that clients can request specific counts. For example: diff_main(a, b, {units: 'unicode'}).
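
To illustrate what such a mode would change, here is a minimal sketch that converts `[op, text]` diff tuples into `[op, count]` pairs in the requested units. It does not call diff-match-patch; `diffCounts`, the `'utf16'` mode name, and the choice of `'utf16'` as the default are assumptions made for this example only.

```javascript
// Hypothetical helper: measure each diff group in the requested units.
function diffCounts(diffs, options = {}) {
  const units = options.units ?? 'utf16'; // assumed default
  let measure;
  switch (units) {
    case 'utf16':
      measure = (text) => text.length; // UTF-16 code units
      break;
    case 'unicode':
      measure = (text) => Array.from(text).length; // Unicode code points
      break;
    default:
      throw new RangeError(`unknown units: ${units}`);
  }
  return diffs.map(([op, text]) => [op, measure(text)]);
}

// An equality of "a" followed by an insertion of one emoji.
const diffs = [[0, 'a'], [1, '😀']];
console.log(diffCounts(diffs, { units: 'utf16' }));   // [[0, 1], [1, 2]]
console.log(diffCounts(diffs, { units: 'unicode' })); // [[0, 1], [1, 1]]
```

The two calls disagree only on the emoji group, which is exactly the kind of span where an unlabeled count becomes ambiguous.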
