Closed by petrifiedvoices 2 years ago
I have an R script that cleans the text of an inscription into a text-mining-friendly version.
It could be added as a feature of the scraper, ideally leaving the original raw text in one column and producing the clean version in a new column.
The R cleaning script is here: https://github.com/sdam-au/EDCS_ETL/blob/master/scripts/1_2_r_EDCS_cleaning_text.Rmd
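The "raw text in one column, clean text in a new column" idea could look roughly like the sketch below. It is only an illustration: the column names `inscription` and `clean_text`, the helper names, and the sample row are all made up here, and the placeholder cleaner does not reproduce the rules in the R script linked above.

```python
import re


def clean_text(raw: str) -> str:
    """Placeholder cleaner (an assumption, not the R script's rules):
    drop bracket characters and collapse runs of whitespace."""
    no_brackets = re.sub(r"[\[\]()]", "", raw)
    return re.sub(r"\s+", " ", no_brackets).strip()


def add_clean_column(rows, raw_key="inscription", clean_key="clean_text"):
    """Return new rows that keep the raw text untouched and add the cleaned text.

    `raw_key` and `clean_key` are hypothetical column names.
    """
    return [{**row, clean_key: clean_text(row.get(raw_key, ""))} for row in rows]


# Made-up sample row, used only to show the shape of the output.
rows = [{"id": "example-1", "inscription": "D(is) M(anibus) [s(acrum)]"}]
result = add_clean_column(rows)
```

Because the function builds new rows rather than overwriting the raw field, the original transcription stays available for checking the cleaned output against it.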
Built the functions in the correct sequence, producing the conservative and interpretive versions of the text
Wrote and passed unit tests for the most important cleaning features, for both the conservative and interpretive cleaning functions
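As a rough sketch of why the sequence of cleaning steps matters, the two versions can be built from small functions applied in a fixed order. Everything here is an assumption for illustration: the function names are hypothetical, and the rules (square brackets mark restored text, parentheses mark expanded abbreviations) are a simplified reading of common epigraphic markup, not a port of the R script.

```python
import re


def remove_restorations(text: str) -> str:
    """Drop text inside square brackets (assumed to mark editorial restorations)."""
    return re.sub(r"\[[^\]]*\]", "", text)


def remove_expansions(text: str) -> str:
    """Drop text inside parentheses (assumed to mark expanded abbreviations)."""
    return re.sub(r"\([^)]*\)", "", text)


def strip_brackets(text: str) -> str:
    """Remove the bracket characters themselves but keep their content."""
    return re.sub(r"[\[\]()]", "", text)


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def apply_in_order(text: str, steps) -> str:
    """Apply each cleaning step in the order given."""
    for step in steps:
        text = step(text)
    return text


# Order matters: bracketed content must be removed before the bracket
# characters are stripped, otherwise the restored text would be kept.
CONSERVATIVE_STEPS = [remove_restorations, remove_expansions,
                      strip_brackets, normalize_whitespace]
INTERPRETIVE_STEPS = [strip_brackets, normalize_whitespace]


def clean_conservative(text: str) -> str:
    """Keep only what is assumed to be preserved on the stone."""
    return apply_in_order(text, CONSERVATIVE_STEPS)


def clean_interpretive(text: str) -> str:
    """Keep the editor's restorations and expansions, without the markup."""
    return apply_in_order(text, INTERPRETIVE_STEPS)
```

With the made-up input `D(is) M(anibus) [s(acrum)]`, this sketch gives `D M` for the conservative version and `Dis Manibus sacrum` for the interpretive one, which is the kind of pair a unit test for each function could assert on.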