sur-pavel / IrbisSearchEngine


Create MORH database filler #1

Open sur-pavel opened 3 years ago

sur-pavel commented 3 years ago

Create a project for filling the MORH database:

  1. Read all keywords from all databases.
  2. Using a Russian stemmer, split them into regular and non-regular keywords.
  3. Split the non-regular keywords into specific groups, such as Church Slavonic words and misprints.
  4. Correct the misprints and add them to the regular keyword list.
  5. Add the regular keywords using mystem.
  6. Add the non-regular keywords using mystem.
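
A rough sketch of steps 2 and 4, just to pin down the data flow. Everything here is an assumption: the names `split_keywords` and `merge_corrections` are hypothetical, the `is_regular` predicate stands in for whatever stemmer-based check ends up being used, and the misprint corrections are assumed to arrive as a plain dict prepared by hand. It does not call mystem or read the IRBIS databases.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def split_keywords(
    keywords: Iterable[str],
    is_regular: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Partition keywords into (regular, non_regular), deduplicated and sorted.

    `is_regular` is a placeholder for the real stemmer-based check.
    """
    regular, non_regular = set(), set()
    for word in keywords:
        word = word.strip().lower()
        if not word:
            continue  # skip empty entries
        (regular if is_regular(word) else non_regular).add(word)
    return sorted(regular), sorted(non_regular)


def merge_corrections(
    regular: List[str],
    non_regular: List[str],
    corrections: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Move corrected misprints into the regular list.

    Returns (updated regular list, non-regular keywords left uncorrected).
    """
    fixed = {corrections[w] for w in non_regular if w in corrections}
    remaining = [w for w in non_regular if w not in corrections]
    return sorted(set(regular) | fixed), remaining
```

For a quick check, a small set of known words can play the role of the stemmer: `split_keywords(words, known.__contains__)`. The Church Slavonic group from step 3 would be whatever stays in the second list once the misprints are corrected.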

sur-pavel commented 3 years ago

ReadAllTerms can be taken from: https://pastebin.com/p7KwrwsQ

sur-pavel commented 3 years ago

Slavonic filter: if Russian stem>2 not stem, or mystem