SvenElyes / Textanalytics

MIT License

General/Keywords - Preprocessing - Bible Stopword List #14

Open aileen-reichelt opened 3 years ago

aileen-reichelt commented 3 years ago

While doing the keyword extraction, I noticed that some words were not caught by the standard stopword list provided by the YAKE implementation, because of the archaic language of the Bible. Examples: thee, thy, saith, etc. I named this issue "General/Keywords" because this might also apply to other tasks.

Proposed solution: We search for a stopword list created specifically for archaic (Early Modern) English and add those words to the YAKE stopword list.
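
A minimal sketch of what the merge could look like, in plain Python so that it does not depend on any particular YAKE version. The helper names `extend_stopwords` and `remove_stopwords` and the tiny `base` set are hypothetical stand-ins for illustration; only thee, thy, and saith come from the examples above.

```python
# Archaic words taken from the examples in this issue; a real list
# would be loaded from whatever stopword resource we end up choosing.
ARCHAIC_STOPWORDS = {"thee", "thy", "saith"}


def extend_stopwords(base, extra):
    """Return a new lowercase set containing the words of both collections.

    Blank entries are dropped and surrounding whitespace is stripped, so the
    inputs can be raw lines read from a stopword file.
    """
    merged = set()
    for word in list(base) + list(extra):
        cleaned = word.strip().lower()
        if cleaned:
            merged.add(cleaned)
    return merged


def remove_stopwords(tokens, stopwords):
    """Drop every token whose lowercase form is in the stopword set."""
    return [t for t in tokens if t.lower() not in stopwords]


# Stand-in for the default English list (assumption: the real one is larger).
base = {"the", "and", "of"}
merged = extend_stopwords(base, ARCHAIC_STOPWORDS)

print(remove_stopwords("Thus saith the Lord unto thee".split(), merged))
# prints ['Thus', 'Lord', 'unto']
```

If the installed `yake` package accepts a custom list through a `stopwords` argument on `KeywordExtractor`, the merged set could probably be passed there directly; that should be checked against the version used in this project before relying on it.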