I'm having an issue with certain documents. Depending on their content, there can be many, many numbers in the resulting word cloud. The numbers are repeated a lot, but they don't really mean anything. They could be page number references, policy names, etc. We can't just add every number in the universe to the common word list (aka the ignore list) like we can with some words, e.g. "the, of, and, a, as, ...". There are far too many possible numbers to list them all. How can we make sure the cloud has no numbers?
Ideally, there would be built-in functionality for this. I can only speak for myself, but I have virtually no knowledge of regex in JavaScript.
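To make the request concrete, here is a rough sketch of the kind of filter that might do it, assuming the words are available as a plain array of strings before they are counted for the cloud. The names `NUMERIC_TOKEN` and `removeNumericWords` are made up for illustration and are not part of any library's API, and the pattern is only a guess at what counts as "a number" here (digits, optionally joined by `.`, `,`, `/` or `-`).

```javascript
// Hypothetical helper, not part of any word cloud library.
// Matches tokens made only of digits, optionally with ".", ",", "/" or "-"
// between digit groups, e.g. "42", "1,000", "3.14", "2020-01".
const NUMERIC_TOKEN = /^\d+(?:[.,\/-]\d+)*$/;

// Returns a new array without the purely numeric tokens.
// Mixed tokens such as "b2b" are kept, since they contain letters.
function removeNumericWords(words) {
  return words.filter((word) => !NUMERIC_TOKEN.test(word));
}

const sample = ["policy", "42", "1,000", "section", "3.14", "b2b", "2020-01"];
console.log(removeNumericWords(sample)); // [ 'policy', 'section', 'b2b' ]
```

A single pattern like this would stand in for the impossible list of every number, and it could be loosened or tightened depending on whether things like "3rd" or "v2" should also be dropped.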