jasondavies / d3-cloud

Create word clouds in JavaScript.
https://www.jasondavies.com/wordcloud/

Ignore Number Functionality // regex? #126

Closed diggetybo closed 6 years ago

diggetybo commented 7 years ago

I'm having an issue with certain documents. Depending on their content, there can be a great many numbers in the resulting word cloud. The numbers are repeated a lot, but they don't really mean anything. They could be page number references, policy names, etc. We can't just add every number to the common word list (the ignore list) the way we can with words such as "the, of, and, a, as, ...". There are far too many possible numbers and digit combinations to enumerate. How can we make sure the cloud has no numbers?

Ideally, there would be built-in functionality for this. I can only speak for myself, but I have virtually zero knowledge of regex in JavaScript.
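
One possible approach, sketched here under assumptions, is to filter the tokens with a regular expression before they ever reach the word cloud, rather than listing numbers in the ignore list. The helper names `removeNumericWords` and `countWords` and the sample sentence are hypothetical and not part of d3-cloud. The pattern treats a token as "numeric" if it consists only of digits, optionally joined by `.`, `,`, `/`, `:` or `-`, which may need adjusting for a given document set.

```javascript
// Hypothetical pattern (an assumption, not d3-cloud behaviour): matches tokens
// made only of digit groups joined by ".", ",", "/", ":" or "-",
// e.g. "12", "3.1", "2017-04". Mixed tokens such as "B12" do not match.
const NUMERIC_TOKEN = /^\d+(?:[.,\/:-]\d+)*$/;

// Hypothetical helper: return a new array without the numeric tokens.
function removeNumericWords(words) {
  return words.filter((word) => !NUMERIC_TOKEN.test(word.trim()));
}

// Hypothetical helper: turn the remaining tokens into { text, size } objects,
// where size is simply the number of occurrences.
function countWords(words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return Array.from(counts, ([text, size]) => ({ text, size }));
}

// Made-up sample input for illustration only.
const tokens = "see page 12 of policy 2017-04 and section 3.1 of policy".split(/\s+/);
const kept = removeNumericWords(tokens);
const cloudWords = countWords(kept);

console.log(kept.join(" "));
```

The resulting `cloudWords` array could then be handed to the layout's `.words()` method in place of the unfiltered list. Filtering up front keeps the ignore list for ordinary stop words only; whether mixed tokens like "B12" should also be dropped is a judgment call this sketch leaves alone.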