Analyse the frequency of tokens in the data to give the user insight into its contents.
Examples include the frequency of words, user mentions, emojis, base URLs, hashtags, etc.
This can be done using winkjs as a tokenizer and categorizer.
The logic should take as its input a collection of documents (`string[]`) and return an object of frequency tables, something like this:
```ts
interface FrequencyTables {
  [tokenName: string]: {           // tokenName can be something like "words", "hashtags" or "emojis"
    [tokenValue: string]: number;  // where number is the count of occurrences of tokenValue
  };
}
```
The logic should also take as input a limit on how many results to include: instead of returning the frequency of every single token, return only the top X most used tokens (e.g. the top 20), with the cut-off passed as a parameter rather than hard-coded.
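A minimal sketch of how this could look using wink-nlp (the winkjs NLP package) with its lite English model. The category names (`words`, `hashtags`, etc.), the `analyseFrequencies` function name, and the default limit of 20 are illustrative assumptions, not an existing API; the token type strings come from wink-nlp's `its.type` helper.

```ts
import winkNLP from 'wink-nlp';
import model from 'wink-eng-lite-web-model';

const nlp = winkNLP(model);
const its = nlp.its;

interface FrequencyTables {
  [tokenName: string]: { [tokenValue: string]: number };
}

// Illustrative mapping from our category names to wink-nlp token
// types as reported by token.out(its.type).
const categories: Record<string, string> = {
  words: 'word',
  hashtags: 'hashtag',
  mentions: 'mention',
  emojis: 'emoji',
  urls: 'url',
};

function analyseFrequencies(documents: string[], limit = 20): FrequencyTables {
  const counts: Record<string, Map<string, number>> = {};
  for (const name of Object.keys(categories)) counts[name] = new Map();

  for (const text of documents) {
    nlp.readDoc(text).tokens().each((token) => {
      const type = token.out(its.type);
      for (const [name, winkType] of Object.entries(categories)) {
        if (type !== winkType) continue;
        let value = token.out(its.normal); // lower-cased token text
        if (name === 'urls') {
          // Reduce full URLs to their base (origin); keep the raw
          // token if it does not parse as a URL.
          try { value = new URL(value).origin; } catch { /* keep raw value */ }
        }
        counts[name].set(value, (counts[name].get(value) ?? 0) + 1);
      }
    });
  }

  // Sort each table by count and keep only the top `limit` entries.
  const tables: FrequencyTables = {};
  for (const [name, map] of Object.entries(counts)) {
    tables[name] = Object.fromEntries(
      [...map.entries()].sort((a, b) => b[1] - a[1]).slice(0, limit)
    );
  }
  return tables;
}
```

Called as e.g. `analyseFrequencies(docs, 10)`, this would return the ten most frequent entries per category. Counting on `its.normal` means counts are case-insensitive; if exact surface forms matter, `its.value` could be used instead.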