mckoss / dawg

Directed Acyclic Word Graph
MIT License

Words with underscores are not found #4

Open flekschas-ozette opened 2 years ago

flekschas-ozette commented 2 years ago

Words containing underscores are reported by isWord as not present in a word collection:

const trie = new Trie(['yes', 'yes_no']);
trie.isWord('yes'); // => true (which is correct)
trie.isWord('no'); // => false (which is correct)
trie.isWord('yes_no'); // => false (which is incorrect)

I guess this is because Node has special properties such as _c that start with _, and hence any prop starting with _ is excluded in node.props(): https://github.com/mckoss/dawg/blob/master/src/node.ts#L88
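To make the suspected failure mode concrete, here is a minimal sketch. It is not the library's code: `SketchNode` and `edgeLabels` are hypothetical names, and the sketch assumes that edge labels and internal fields share one property namespace, with a filter that drops every key starting with an underscore.

```typescript
// Hypothetical stand-in for a trie node: internal fields and edge labels
// are stored side by side as ordinary properties of one object.
type SketchNode = Record<string, unknown>;

// Assumed filter: treat every key starting with "_" as internal bookkeeping.
function edgeLabels(node: SketchNode): string[] {
  return Object.keys(node).filter((key) => !key.startsWith("_"));
}

// A node reached after "yes", with one internal field (_c), one ordinary
// edge ("yes") and one edge whose label begins with an underscore ("_no").
const node: SketchNode = { _c: 1, yes: {}, _no: {} };

// The "_no" edge is dropped together with "_c", so a lookup that relies on
// edgeLabels() can never follow it.
const labels = edgeLabels(node);
```

Under that assumption, a word such as yes_no becomes unreachable as soon as one of its edges starts with an underscore.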

I guess a safer option would be to store props in their own dictionary, such as node.props. Otherwise, the only option I can think of is to replace _ before creating the Trie and before calling isWord, but that seems less ideal.
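The separate-dictionary idea could look roughly like the sketch below. It is a plain, uncompressed trie written for illustration only, not the library's DAWG implementation; `SketchNode`, `SketchTrie`, `children`, and `terminal` are hypothetical names.

```typescript
// Hypothetical node: bookkeeping lives in named fields, while edges live in
// their own map, so an edge label can never collide with an internal field.
class SketchNode {
  terminal = false;
  readonly children = new Map<string, SketchNode>();
}

// Minimal uncompressed trie, used only to illustrate the storage layout.
class SketchTrie {
  private readonly root = new SketchNode();

  constructor(words: string[]) {
    for (const word of words) {
      this.insert(word);
    }
  }

  insert(word: string): void {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
      const ch = word.charAt(i);
      let next = node.children.get(ch);
      if (next === undefined) {
        next = new SketchNode();
        node.children.set(ch, next);
      }
      node = next;
    }
    node.terminal = true;
  }

  isWord(word: string): boolean {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
      const next = node.children.get(word.charAt(i));
      if (next === undefined) {
        return false;
      }
      node = next;
    }
    return node.terminal;
  }
}

const sketch = new SketchTrie(["yes", "yes_no"]);
const foundYes = sketch.isWord("yes");
const foundNo = sketch.isWord("no");
const foundYesNo = sketch.isWord("yes_no");
```

Because no key is filtered by its first character, the underscore is treated like any other character in this layout.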

flekschas-ozette commented 2 years ago

It's a fairly simple fix and I got it running already in a local branch. I am happy to set up a PR.

mckoss commented 2 years ago

I'd be happy to review your PR. It's been a while, so I don't recall why I reserved some characters. I do remember that this was designed to work only with plain alphabetics (7-bit ASCII characters) in words, so introducing characters outside that range could cause multiple issues.
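One way to see which words in a list fall outside that range is a small pre-check like the sketch below. It assumes that "plain alphabetics" means the ASCII letters A-Z and a-z only; `findUnsupportedWords` is a hypothetical helper and not part of the library.

```typescript
// Assumption: only ASCII letters count as "plain alphabetics".
const PLAIN_ALPHABETIC = /^[A-Za-z]+$/;

// Returns the words that contain anything other than ASCII letters,
// preserving their original order.
function findUnsupportedWords(words: string[]): string[] {
  return words.filter((word) => !PLAIN_ALPHABETIC.test(word));
}

const unsupported = findUnsupportedWords(["yes", "yes_no", "caf\u00e9"]);
```

Running such a check before building the Trie would show which words might be affected by reserved or out-of-range characters.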