Open giorgio79 opened 6 years ago
Example: Would be nice to have an option that preserves punctuation:
console.log(nautral_NGrams.bigrams('Some, words here!!'));
// [ [ 'Some', 'words' ], [ 'words', 'here' ] ]
I would have liked to see [ [ 'Some,', 'words' ], [ 'words', 'here!!' ] ]
If chaining commands is eventually implemented at https://github.com/NaturalNode/natural/issues/439, then one could just strip the punctuation beforehand, or pass the text to a tokenizer first.
Also, tokenizers already split the text in various ways, so I would just keep the splitting logic in the tokenizers.
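To make the request concrete, here is a minimal sketch of the behaviour I have in mind, written in plain JavaScript without natural. Both `whitespaceTokens` and `ngramsFromTokens` are hypothetical helper names made up for this illustration; they are not part of the library's API. The idea is only that tokenization (here a plain whitespace split) happens first, and the n-gram step works on whatever tokens it is given.

```javascript
// Hypothetical helper, not part of natural: splits on whitespace only,
// so punctuation stays attached to the neighbouring word.
function whitespaceTokens(text) {
  return text.split(/\s+/).filter((token) => token.length > 0);
}

// Hypothetical helper: builds n-grams from an already tokenized array
// by sliding a window of size n over the tokens.
function ngramsFromTokens(tokens, n) {
  const result = [];
  for (let i = 0; i + n <= tokens.length; i++) {
    result.push(tokens.slice(i, i + n));
  }
  return result;
}

const bigrams = ngramsFromTokens(whitespaceTokens('Some, words here!!'), 2);
console.log(bigrams);
// [ [ 'Some,', 'words' ], [ 'words', 'here!!' ] ]
```

Keeping the two steps separate is the design choice suggested above: anyone who wants punctuation stripped can swap in a different tokenizer without touching the n-gram code.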