let decomposed = nfd(cps);
for (let i = 1, e = decomposed.length; i < e; i++) {
    if (CM.has(cps[i])) {
        let j = i + 1;
        while (j < e && CM.has(cps[j])) j++;
        if (j - i > M) {
            throw new Error(`too many combining marks: ${g.N} "${str_from_cps(cps.slice(i-1, j))}" (${j-i}/${M})`);
        }
        i = j;
    }
}
The loop above is from `check_group()`. I think it is incorrect: it creates `decomposed` and takes the loop bound `e` from `decomposed.length`, but then indexes `cps` instead of `decomposed`.
I made a Python implementation of this algorithm, and the `CM.has(cps[i])` line raises an IndexError there.
If I change `cps[i]` to `decomposed[i]`, it works but returns a different result for this label:
ك\u0622\u064D\u064Dك
ENS Resolver says it's normalized, but my code returns the "too many combining marks" error.
All normalization/beautification/tokenization tests pass in my implementation.
I think your JS implementation quietly ignores the out-of-bounds access because JS returns `undefined` when indexing past the end of an array.
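To make the mismatch concrete, here is a minimal sketch, not the library's code. `CM` below is a tiny stand-in set (I am assuming U+0653 and U+064D count as combining marks), `longestRun` is a hypothetical helper, and the built-in `String.prototype.normalize("NFD")` stands in for `nfd()`. It only counts the longest run of marks; whether either count exceeds the real limit `M` depends on the library's data, which this sketch does not reproduce.

```javascript
// Stand-in for the library's combining-mark set (an assumption for this demo).
const CM = new Set([0x653, 0x64D]);

// Code points of the label from the report: ك U+0622 U+064D U+064D ك
const cps = [0x643, 0x622, 0x64D, 0x64D, 0x643];

// NFD splits U+0622 into U+0627 plus U+0653, so the decomposed array is
// one element longer than cps (the marks may also be reordered).
const decomposed = Array.from(
  String.fromCodePoint(...cps).normalize("NFD"),
  (ch) => ch.codePointAt(0)
);

// Longest run of consecutive combining marks, scanning `source` up to `limit`.
// Same loop shape as the snippet above, minus the throw.
function longestRun(source, limit) {
  let best = 0;
  for (let i = 1; i < limit; i++) {
    if (CM.has(source[i])) {
      let j = i + 1;
      while (j < limit && CM.has(source[j])) j++;
      best = Math.max(best, j - i);
      i = j;
    }
  }
  return best;
}

// Bound from `decomposed`, lookups in `cps` (what the quoted loop does).
const runFromCps = longestRun(cps, decomposed.length);
// Bound and lookups both from `decomposed` (the proposed change).
const runFromDecomposed = longestRun(decomposed, decomposed.length);

console.log(cps.length, decomposed.length);   // 5 6
console.log(cps[5], CM.has(cps[5]));          // undefined false
console.log(runFromCps, runFromDecomposed);   // 2 3
```

With the mixed indexing, the read at `cps[5]` is past the end of the array, but `CM.has(undefined)` is simply `false`, so nothing fails and the extra mark produced by decomposition is never counted. A Python list raises IndexError on the same access, which would explain why my port crashes where the JS version does not.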