Open njr2128 opened 3 years ago
To export data and clean it up:
Search for a term with a wildcard, e.g., wom* (to capture woman, women, etc.)
Hover over and click on the "export" button that only appears when hovering (cannot capture with screenshot)
Do a find+replace (ctrl+f) and choose the regex option (see arrow)
Use the expression ^\s[0-9]+ to find only the numbers at the beginning of each line (added by Voyant), so that any numbers in the actual corpus itself are not also removed.
^ is the beginning of the line
\s is a single whitespace character
[0-9]+ is one or more digits
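For anyone who would rather script this step than do it in an editor, here is a minimal sketch of the same find+replace in Python. It applies the expression above line by line. The function name and the sample rows are made up for illustration; the real layout of a Voyant export may differ.

```python
import re

# Same expression as in the find+replace step:
# start of line, one whitespace character, then one or more digits.
LEADING_NUMBER = re.compile(r"^\s[0-9]+")


def strip_leading_numbers(text: str) -> str:
    """Remove the leading line number from each line (hypothetical helper)."""
    # Working line by line keeps \s from matching a newline on an empty line.
    cleaned = [LEADING_NUMBER.sub("", line) for line in text.splitlines()]
    return "\n".join(cleaned)


if __name__ == "__main__":
    # Made-up sample rows, not actual Voyant output.
    sample = " 1\twomen\t120\n 2\twoman\t85\n1850 census"
    print(strip_leading_numbers(sample))
```

A line such as "1850 census" is left alone because it does not start with whitespace, which is the point of anchoring the expression with ^\s.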
create .md from last comment and mount as mini tutorial in sandbox
And also include embeddable Voyant tools
Created a dataset without "amp" (cleaned up the holdovers from the markup entity &amp;)
women-vocabulary_no-amp.txt
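How the "amp" holdovers were actually removed is not recorded here, so the following is only one possible approach, sketched in Python. It decodes HTML entities first and then drops any stray standalone "amp" tokens. The function name is hypothetical.

```python
import html
import re


def remove_amp_holdovers(text: str) -> str:
    """Drop leftover 'amp' tokens from HTML markup (hypothetical helper)."""
    # Decode entities that are still intact, e.g. "&amp;" becomes "&".
    text = html.unescape(text)
    # Remove standalone "amp" or "amp;" left after the "&" was stripped.
    # The word boundaries keep words such as "camp" untouched.
    text = re.sub(r"\bamp\b;?", " ", text)
    # Collapse the runs of spaces or tabs that the removal leaves behind.
    return re.sub(r"[ \t]+", " ", text).strip()


if __name__ == "__main__":
    print(remove_amp_holdovers("women amp children"))
```

This would also delete a genuine occurrence of the word "amp" in the corpus, so it is worth checking the vocabulary before applying it.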
New wordcloud:
From Terry:
Two wordclouds generated from Voyant correlation:
Exported that data, cleaned it up, and then used an external wordcloud generator to create these:
Women
Horse