A web app (wordsworth.us) to identify anachronistic words & phrases in historical fiction by comparing the text to fiction written during the era it depicts. Hackbright Fellowship final project.
MIT License
Add a regex to deal with ellipses of three periods #8
The regex currently does not account for ellipses written as three periods (...) rather than the ellipsis character (…). It treats them as three individual periods and removes them; if there was no space before or after the ellipsis, the two surrounding words get jammed together. Hence the existence of words like "sympathyquickness" in the 1900s word set (because the phrase "...full of innate sympathy...quickness to perceive good" is in A Room with a View). Need to rewrite the regex, test & re-pickle.
Completed the evening of 6/22. Because remove_irrelevant_characters() is also run on user input, the fix also covers the case where the user's text includes a three-period ellipsis.
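For reference, a minimal sketch of the kind of change described here. The actual regexes in remove_irrelevant_characters() are not shown in this issue, so the three patterns and the lowercasing step below are assumptions for illustration only. The idea is to replace an ellipsis (either form) with a space before other punctuation is stripped, so the neighboring words stay separate.

```python
import re

# Assumed patterns; the project's real regexes are not shown in the issue.
ELLIPSIS = re.compile(r"\.{3,}|…")   # three or more periods, or the ellipsis character
NON_WORD = re.compile(r"[^a-z\s]")   # anything that is not a lowercase letter or whitespace
WHITESPACE = re.compile(r"\s+")


def remove_irrelevant_characters(text):
    """Hypothetical stand-in for the project's function of the same name."""
    # Turn ellipses into spaces first, so "sympathy...quickness" keeps its word break.
    text = ELLIPSIS.sub(" ", text.lower())
    # Only then drop the remaining punctuation and digits.
    text = NON_WORD.sub("", text)
    # Collapse the runs of whitespace left behind.
    return WHITESPACE.sub(" ", text).strip()


if __name__ == "__main__":
    print(remove_irrelevant_characters(
        "...full of innate sympathy...quickness to perceive good"
    ))
```

Order matters in this sketch: if the punctuation-stripping step ran first, the periods would vanish before the ellipsis pattern could see them, which is the original bug.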