Techiemouse / FinalYearProject

Analysis of the digital historical newspaper collection at the National Library of Wales.

Splitting article text into words #1

Open Techiemouse opened 10 years ago

Techiemouse commented 10 years ago

Problem with getWordList: splitting on whitespace doesn't work, and splitting on \W leaves entries in the word list that are not actual words. Need to look into cleaning the text.
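One possible way around this, as a rough sketch only: match the words directly instead of splitting on separators, so empty strings and stray punctuation never reach the list. The code below is Python and `get_word_list` is a hypothetical stand-in, since the real getWordList is not shown in this issue. The regex, the lowercasing, and the choice to drop digits are all assumptions that may not suit the newspaper text.

```python
import re

# Hypothetical stand-in for the project's getWordList (the real code is not shown here).
# Assumption: a word is a run of Unicode letters, optionally joined by an
# apostrophe or hyphen, so "cat's" and "well-known" stay in one piece.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def get_word_list(text):
    """Return the lowercased words found in text, skipping punctuation and digits."""
    return [word.lower() for word in WORD_RE.findall(text)]


sample = "The cat's hat, well-known; 1899 -- end."

# Splitting on \W leaves empty strings wherever two separators sit side by side.
print(re.split(r"\W", sample))

# Matching words directly returns only the words.
print(get_word_list(sample))
```

Because `[^\W\d_]` matches any Unicode letter in Python 3, accented Welsh letters such as ŷ and ŵ should stay inside their words rather than splitting them.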

Techiemouse commented 10 years ago

Might not be needed for now.