libscie / now-boarding

Onboarding module for new researchers.

Automatically extract links between topics #10

Closed by chartgerink 6 years ago

chartgerink commented 6 years ago

Topics are starting to accumulate cross-references that show how they relate to each other. These associations can be used to visualize the topics and their interconnections.

It would be useful to create a script that automatically goes through these references, identifies them, and puts them into a machine-readable list. Currently the references occur in all content/*.md files and are the links that point to other .md files (so a regex like \[.*\]\(.*\.md\) could extract them).
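A minimal Python sketch of the extraction idea, not the script that ended up in the repository. The pattern, the function name extract_md_links, and the sample text are assumptions made for illustration. The character classes keep each match inside a single link when a line contains several.

```python
import re

# Inline Markdown links whose target ends in .md, e.g. [text](topic.md).
# Group 1 captures the target. The pattern is an assumption for this sketch.
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def extract_md_links(text: str) -> list[str]:
    """Return the .md targets of all inline Markdown links in text, in order."""
    return MD_LINK.findall(text)


# Hypothetical topic text with two internal links and one external link.
sample = (
    "Read about [open access](open-access.md) first, "
    "then [peer review](peer-review.md). "
    "External links such as [a site](https://example.org) are skipped."
)
print(extract_md_links(sample))  # ['open-access.md', 'peer-review.md']
```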

Need to think about this, and discussion/PRs/tips welcome!

chartgerink commented 6 years ago

Okay, I started working on this and refined the regex to grep -E "(\w+-?)+.md" $FILE. The first commit is bd03ab0c83526fcf6b1f97da0dc05752ee33b248. The script writes tidy data to links.csv, but the exact output format can only be settled once we know how the visualisation in #2 is built.

./scripts/associated-topics.sh .md content/*.md
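The layout of links.csv is not given in this thread. One plausible tidy layout is one row per cross-reference, sketched below in Python. The column names source and target, the function name links_to_csv, and the sample topics are assumptions, and this is not a port of associated-topics.sh.

```python
import csv
import io
import re

# Target of an inline Markdown link ending in .md (assumed pattern).
MD_TARGET = re.compile(r"\]\(([^)\s]+\.md)\)")


def links_to_csv(topics: dict[str, str]) -> str:
    """Render one source,target row per cross-reference as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target"])  # assumed column names
    for source in sorted(topics):
        for target in MD_TARGET.findall(topics[source]):
            writer.writerow([source, target])
    return buffer.getvalue()


# Hypothetical file contents keyed by file name.
topics = {
    "open-access.md": "See also [licensing](licensing.md).",
    "licensing.md": "Related: [open access](open-access.md), [data](data.md).",
}
print(links_to_csv(topics))
```

Each row is one observation (a link from one topic to another), so the data can later be reshaped into whatever the visualisation needs.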

To-do

[x] Update (if necessary) to restructure data output for visualisation
[x] Add script to run in last echo call

chartgerink commented 6 years ago

Given the progress mentioned in #2, it seems I need to take these steps:

[x] collect all topics in content/*.$EXT
[ ] Prettify topic strings to push (no extension, capitalized)
[x] push all topics to the nodes object in assets/graphFile.json
[x] make template link object
[x] For each content/*.$EXT file
- [x] set source index for topic
- [x] find all references to other topics in topic
- [x] identify target index of referenced topic
- [x] push all links into assets/graphFile.json
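The steps above can be sketched in Python as follows. The exact shape of assets/graphFile.json is not shown in this thread, so the nodes/links layout with integer source and target indices is an assumption, as are the name key, the helper names prettify and build_graph, and the sample topics.

```python
import json
import re

# Target of an inline Markdown link ending in .md (assumed pattern).
MD_TARGET = re.compile(r"\]\(([^)\s]+\.md)\)")


def prettify(filename: str) -> str:
    """Drop the extension and capitalize, e.g. 'open-access.md' -> 'Open-access'."""
    return filename.rsplit(".", 1)[0].capitalize()


def build_graph(topics: dict[str, str]) -> dict:
    """Build a nodes/links structure; links refer to nodes by list index."""
    names = sorted(topics)
    index = {name: i for i, name in enumerate(names)}
    graph = {"nodes": [{"name": prettify(n)} for n in names], "links": []}
    for source in names:
        for target in MD_TARGET.findall(topics[source]):
            if target in index:  # skip references to files that are not topics
                graph["links"].append(
                    {"source": index[source], "target": index[target]}
                )
    return graph


# Hypothetical file contents keyed by file name.
topics = {
    "open-access.md": "See [licensing](licensing.md).",
    "licensing.md": "See [open access](open-access.md) and [missing](missing.md).",
}
graph = build_graph(topics)
print(json.dumps(graph, indent=2))
```

Looking up indices in a dictionary built once avoids rescanning the node list for every link.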

chartgerink commented 6 years ago

NodeJS kept stalling on me and spinning my computer's fan up to massive RPM. Updated the regex to (\w-?)+.md and that works 👼
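One likely explanation for the stalling, offered as an assumption since the thread does not diagnose it, is catastrophic backtracking. In (\w+-?)+ an inner + sits inside an outer +, so a backtracking engine can split a run of word characters in exponentially many ways before giving up. In (\w-?)+ each repetition consumes exactly one word character, so there is only one way to split the run. The Python sketch below uses both patterns verbatim (including the unescaped dot) on a hypothetical sample and times a failing search on a short input. Python's re module is also a backtracking engine, but timings will differ from NodeJS.

```python
import re
import time

NESTED = re.compile(r"(\w+-?)+.md")  # earlier pattern from the thread
FLAT = re.compile(r"(\w-?)+.md")     # updated pattern from the thread


def matches(pattern: re.Pattern, text: str) -> list[str]:
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


def seconds_to_search(pattern: re.Pattern, text: str) -> float:
    """Time a single search call."""
    start = time.perf_counter()
    pattern.search(text)
    return time.perf_counter() - start


sample = "See [open access](open-access.md) and [peer review](peer-review.md)."
print(matches(NESTED, sample))
print(matches(FLAT, sample))

# A run of word characters with no ".md" after it makes the search fail,
# which is where the nested pattern has many more splits to try.
# The run is kept short on purpose so the example finishes quickly.
no_match = "a" * 18
print("nested:", seconds_to_search(NESTED, no_match))
print("flat:  ", seconds_to_search(FLAT, no_match))
```

Both patterns accept the same strings, so the change affects only how much work a failed match costs.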

chartgerink commented 6 years ago

Added hyperlinking to the relevant files in 8af6badf6fa7f72ed2f5a3f39f6e86f4ca1d2196 (the links still refer to .md files; they should be updated to .md.html together with the work on the generate-pages branch)

chartgerink commented 6 years ago

Completed and on master.