(the code below should be nested here, I can't get it to go on this bulletpoint. HALP!)
<%J:
```javascript
/* hacked together script as proof of concept, I don't know JS, please improve! */
// get all blocks on current page by div class. can probably be refined
var allBlocks = document.querySelectorAll('.rm-blockinput.rm-blockinput--view.roam-block.dont-unfocus-block.hoverparent.rm-block-text');
// turn NodeList into array in order to use .map() next (NodeList has no .map())
var blockArray = [].slice.call(allBlocks);
// collapse innerText from all block divs into single string
var allText = blockArray.map(function(e){ return e.innerText; }).join(' ');
// split string on any whitespace, creating array of words, and drop empty strings.
// could also apply various normalization functions here
// e.g. force lowercase, handle special characters, etc
var wordArray = allText.split(/\s+/).filter(Boolean);
// loop to count word frequencies & store in object, found online.
// Object.create(null) so words like "constructor" don't hit inherited properties
var wordCounter = Object.create(null);
for (var i = 0; i < wordArray.length; i++) {
  if (wordCounter[wordArray[i]]) {
    wordCounter[wordArray[i]] += 1;
  } else {
    wordCounter[wordArray[i]] = 1;
  }
}
// function to sort by word frequency, also found
var wordArraySortFunction = function(word1, word2){
  if (wordCounter[word1] < wordCounter[word2]) {
    return -1;
  } else if (wordCounter[word1] == wordCounter[word2]) {
    return 0;
  } else {
    return 1;
  }
};
// apply sort & reverse to get descending order
wordArray.sort(wordArraySortFunction).reverse();
// create array of strings for each word & its frequency
var freqTable = [];
for (var i = 0; i < wordArray.length; i++) {
  freqTable[i] = wordArray[i] + ': ' + wordCounter[wordArray[i]];
}
// return unique word frequencies, each on its own line.
// wordArray still has repeats, so Array.from(new Set(...)) deduplicates; can prob improve
return Array.from(new Set(freqTable)).join('\n')```%>
## 📋 Describe the SmartBlock
<!-- Short and concise description of how the SmartBlock works and its purpose -->
Proof of concept to get the full text from all blocks on a page, then produce simple term frequency counts. Partly inspired by Tiago's question during his and Connor's [Peace Summit](https://youtu.be/-Aqg9Z5gWNg?t=1231). Meant as an initial foray into further NLP applications, likely by importing a [JS NLP library](https://www.kommunicate.io/blog/nlp-libraries-node-javascript/), as demonstrated here: https://github.com/roamhacker/SmartBlocks/issues/127. I probably won't have much time myself, so I'm happy for others to run with it!
Obvious refinements:
1. Preprocessing / cleaning / stemming / remove stop words
2. Only return top N words or counts > N
3. Enable n-grams as opposed to only single words (unigrams)
4. Provide page or block reference to process, rather than current page(s) / blocks in view
5. Maybe option to exclude linked / unlinked references
6. Return as actual table?
7. Import [JS NLP library](https://www.kommunicate.io/blog/nlp-libraries-node-javascript/) for deeper functionality
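
A rough sketch of how refinements 1 to 3 might look in plain JavaScript, without any NLP library. All helper names here (`tokenize`, `ngrams`, `countTerms`, `topN`, `formatTable`) and the tiny stop-word list are made up for illustration; none of them come from Roam42 or from the script in this issue, and stemming is left out entirely.

```javascript
// Hypothetical helpers; nothing here is part of Roam42 or the SmartBlock itself.

// Tiny illustrative stop-word list (an assumption; real lists are much longer).
const STOP_WORDS = new Set(['a', 'an', 'and', 'the', 'of', 'to', 'in', 'is']);

// Refinement 1: lowercase, replace punctuation with spaces,
// split on whitespace, then drop empty strings and stop words.
function tokenize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, ' ')
    .split(/\s+/)
    .filter(function (w) { return w.length > 0 && !STOP_WORDS.has(w); });
}

// Refinement 3: build n-grams from a token array (n = 1 gives single words).
function ngrams(tokens, n) {
  const out = [];
  for (let i = 0; i + n <= tokens.length; i++) {
    out.push(tokens.slice(i, i + n).join(' '));
  }
  return out;
}

// Count occurrences in a Map, so each term is stored once
// and words like "constructor" cause no trouble.
function countTerms(terms) {
  const counts = new Map();
  for (const t of terms) {
    counts.set(t, (counts.get(t) || 0) + 1);
  }
  return counts;
}

// Refinement 2: keep only the n most frequent terms, highest count first.
function topN(counts, n) {
  return Array.from(counts.entries())
    .sort(function (a, b) { return b[1] - a[1]; })
    .slice(0, n);
}

// Format [term, count] pairs the same way as the original script: "word: count".
function formatTable(entries) {
  return entries
    .map(function (pair) { return pair[0] + ': ' + pair[1]; })
    .join('\n');
}
```

Inside the SmartBlock, something like `return formatTable(topN(countTerms(ngrams(tokenize(allText), 1)), 20))` could stand in for the split / count / sort steps, with `allText` built from the page blocks as before. Counting into a `Map` also makes the final `Set` deduplication unnecessary.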
## ✅ Describe any prerequisites or dependencies that are required for this SmartBlock
<!-- List any required roam/js extensions, roam/css, other SmartBlocks etc. -->
Just Roam42
## 📷 Screenshot of your #42SmartBlock workflow/template from Roam
<!-- To ensure other users setup correctly, please provide a screenshot of your #42SmartBlock in Roam -->
<img width="383" alt="Screen Shot 2020-12-27 at 5 56 34 PM" src="https://user-images.githubusercontent.com/18430230/103181218-f17b7580-486c-11eb-8ef5-d0fe2166d64b.png">
## 💡 Additional Info
<!-- Add any other context, info, or screenshots/GIFs to help other users with this SmartBlock -->
![ezgif com-video-to-gif](https://user-images.githubusercontent.com/18430230/103179752-2da6da00-485d-11eb-80c9-96c7934c3288.gif)
<img width="593" alt="Screen Shot 2020-12-27 at 4 00 45 PM" src="https://user-images.githubusercontent.com/18430230/103179733-f0dae300-485c-11eb-99cf-a934b9770ce7.png">
## ✂️ Copy of your #42SmartBlock from Roam
#42SmartBlock word frequencies
<%J:
```javascript
// get all blocks on current page by div class. can probably be refined
var allBlocks = document.querySelectorAll('.rm-blockinput.rm-blockinput--view.roam-block.dont-unfocus-block.hoverparent.rm-block-text');
// turn list into array in order to use .map() next (i think?)
var blockArray = [].slice.call(allBlocks);
// collapse innerText from all block divs into single string
var allText = blockArray.map(function(e){ return e.innerText; }).join(' ');
// split each word from string, creating array of words,
// could also apply various normalization functions here
// e.g. force lowercase, handle special characters, etc
var wordArray = allText.split(' ');
// loop to count word frequencies & store in object, found online
var wordCounter = {};
for (var i = 0; i < wordArray.length; i++) {
  if (wordCounter[wordArray[i]]) {
    wordCounter[wordArray[i]] += 1;
  } else {
    wordCounter[wordArray[i]] = 1;
  }
}
// function to sort by word frequency, also found
var wordArraySortFunction = function(word1, word2){
  if (wordCounter[word1] < wordCounter[word2]) {
    return -1;
  } else if (wordCounter[word1] == wordCounter[word2]) {
    return 0;
  } else if (wordCounter[word1] > wordCounter[word2]) {
    return 1;
  }
};
// apply sort & reverse to get descending order
wordArray.sort(wordArraySortFunction).reverse();
// create array of strings for each word & its frequency
var freqTable = [];
for (var i = 0; i < wordArray.length; i++) {
  freqTable[i] = wordArray[i] + ': ' + wordCounter[wordArray[i]];
}
// return unique word frequencies each on own line.
// uses verbose Array ...Set syntax to deduplicate, can prob improve
return Array.from(new Set(freqTable)).join('\n')```%>