mnater / Hyphenopoly

Hyphenation for node and Polyfill for client-side hyphenation.
http://mnater.github.io/Hyphenopoly/
MIT License

SIGBUS crash while hyphenating English after some Russian words #169

Closed lehni closed 3 years ago

lehni commented 3 years ago

I've created a simple test-case that reproduces the problem:

https://github.com/lehni/hyphenopoly-crash

When I run this simple program, I get an instant crash:

import hyphenopoly from 'hyphenopoly'

const hyphenate = await hyphenopoly.config({ require: ['ru'] })

console.log(hyphenate('хакерські'))
console.log(hyphenate('hacking'))

Output:

yarn start
yarn run v1.22.10
$ node index.js
хакерські
error Command failed with signal "SIGBUS".

I've tested both with Node v14 and v16, same result.

mnater commented 3 years ago

Hi

Thanks for reporting. I’m currently on vacation. I’ll check this out as soon as I’m back.

mnater commented 3 years ago

This was a tricky bug. But I think I solved the riddle.

My code failed at two stages:

1: When checking for characters that are not in the alphabet, the module used a RegExp with the global flag, which advances the lastIndex of the RegExp object after each match. When two words with "unknown" characters are hyphenated and the second word is shorter than the first, the RegExp starts searching beyond the end of the second word and doesn't recognize it as not hyphenatable by the patterns.

2: The WebAssembly hyphenate function should return 0 when a word containing an "unknown" character is passed to it. That bug has already been fixed (see issue #165), but the fix has not been released yet.
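The first stage can be reproduced in isolation. The following is a minimal sketch, not Hyphenopoly's actual code: the character class is an assumed, simplified lowercase Russian alphabet, and `checkWords` is a hypothetical helper. It shows how `test()` on a RegExp with the global flag carries `lastIndex` over from one word to the next.

```javascript
// Assumption: a simplified lowercase Russian alphabet (а-я plus ё),
// not the pattern the library really uses.
const notInAlphabetGlobal = /[^\u0430-\u044f\u0451]/g; // stateful: test() resumes at lastIndex
const notInAlphabet = /[^\u0430-\u044f\u0451]/; // stateless: test() always starts at index 0

// Hypothetical helper: runs the same RegExp object over each word in turn
// and records the result together with the RegExp's lastIndex afterwards.
function checkWords(regex, words) {
    return words.map((word) => ({
        word,
        hasUnknownChar: regex.test(word),
        lastIndexAfter: regex.lastIndex
    }));
}

// The two words from the report: 9 characters, then 7 characters.
const words = ["хакерські", "hacking"];

const withGlobalFlag = checkWords(notInAlphabetGlobal, words);
const withoutGlobalFlag = checkWords(notInAlphabet, words);

// With the global flag, "і" is found at index 8, so lastIndex becomes 9.
// The next search starts at index 9, past the end of "hacking",
// so the Latin letters are never seen and test() returns false.
console.log(withGlobalFlag);

// Without the global flag both words are reported as containing unknown characters.
console.log(withoutGlobalFlag);
```

Dropping the global flag, or resetting `lastIndex` to 0 before each call, both avoid the carry-over.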

I'll fix this and make a new release. Thanks again for reporting.

lehni commented 3 years ago

That's great to hear, thank you @mnater !