Indexing past end of array

Carbon225 commented 1 year ago

In check_group():

let decomposed = nfd(cps);
for (let i = 1, e = decomposed.length; i < e; i++) {
  if (CM.has(cps[i])) {
    let j = i + 1;
    while (j < e && CM.has(cps[j])) j++;
    if (j - i > M) {
      throw new Error(`too many combining marks: ${g.N} "${str_from_cps(cps.slice(i-1, j))}" (${j-i}/${M})`);
    }
    i = j;
  }
}

I think this is incorrect because you are creating decomposed and then not using it for indexing. I made a python implementation of this algorithm and this line would throw an index error. If I change cps[i] to decomposed[i] it works but returns different results for this label: ك\u0622\u064D\u064Dك ENS Resolver says it's normalized but my code returns multiple combining marks error. All normalization/beautification/tokenization tests pass in my implementation. I think your js implementation quietly ignores the out of bounds error because js returns undefined when indexing past the array.

adraffy commented 1 year ago

That's a great catch, thank you! I'll make sure there's a longer CM counting test.

Yes, there's probably a bunch of places I skip bound checks.

adraffy commented 1 year ago

This is fixed in 1.8.10. Thanks again.

adraffy / ens-normalize.js

Indexing past end of array #13