Improve gender detection

psfinaki / CheckYourCzech

The service to practice Czech grammar.

https://check-your-czech.azurewebsites.net

GNU General Public License v3.0


Improve gender detection #190

Closed psfinaki closed 6 years ago

psfinaki commented 6 years ago

The current way to detect the gender of a noun is as follows: we look for the first paragraph in the HTML page that contains the string "rod ", assume that this paragraph specifies the gender, and then try to interpret its content as one of the known genders.
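
For illustration, here is a minimal Python sketch of that heuristic. It is not the project's actual code: the names `ParagraphCollector`, `detect_gender` and `KNOWN_GENDERS`, the gender labels, and the sample HTML are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Hypothetical mapping from the text following "rod " to a gender.
KNOWN_GENDERS = {
    "mužský životný": "masculine animate",
    "mužský neživotný": "masculine inanimate",
    "ženský": "feminine",
    "střední": "neuter",
}


class ParagraphCollector(HTMLParser):
    """Collects the plain text of every <p> element, in document order."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self.paragraphs.append("".join(self._buffer).strip())
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def detect_gender(html):
    """Interpret the first paragraph containing "rod " as the gender."""
    collector = ParagraphCollector()
    collector.feed(html)
    candidate = next((p for p in collector.paragraphs if "rod " in p), None)
    if candidate is None:
        return None
    # Everything after the first "rod " is treated as the gender label.
    label = candidate.split("rod ", 1)[1].strip()
    return KNOWN_GENDERS.get(label)
```

With a page like `<p>rod ženský</p>` this returns `"feminine"`. If an unrelated paragraph containing "rod " comes first, that paragraph is picked instead, its text does not match any known gender, and the function returns `None`.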

While this nearly always works, there may be cases where another paragraph containing "rod " gets in the way. An example is epizoda. So a more bulletproof algorithm should eventually be created.

psfinaki commented 6 years ago

This was solved by moving to the built-in framework Article. Gender is now looked up only in the noun part of the wiki article.
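
A rough Python sketch of the idea of restricting the lookup to the noun part follows. It does not use or reproduce the Article framework mentioned above. The heading text "podstatné jméno", the regex-based section splitting, and the names `noun_section` and `detect_gender_in_noun_section` are assumptions made for this example.

```python
import re

# Hypothetical mapping from the text following "rod " to a gender.
KNOWN_GENDERS = {
    "mužský životný": "masculine animate",
    "mužský neživotný": "masculine inanimate",
    "ženský": "feminine",
    "střední": "neuter",
}


def strip_tags(fragment):
    """Remove HTML tags from a small fragment and trim whitespace."""
    return re.sub(r"<[^>]+>", "", fragment).strip()


def noun_section(html, heading="podstatné jméno"):
    """Return the HTML between the noun heading and the next heading."""
    # With one capture group, re.split yields:
    # [preamble, heading1, body1, heading2, body2, ...]
    parts = re.split(r"<h\d[^>]*>(.*?)</h\d>", html, flags=re.S)
    for title, body in zip(parts[1::2], parts[2::2]):
        if strip_tags(title) == heading:
            return body
    return None


def detect_gender_in_noun_section(html):
    """Look for a "rod ..." paragraph only inside the noun section."""
    section = noun_section(html)
    if section is None:
        return None
    for paragraph in re.findall(r"<p\b[^>]*>(.*?)</p>", section, flags=re.S):
        text = strip_tags(paragraph)
        if text.startswith("rod "):
            return KNOWN_GENDERS.get(text[len("rod "):].strip())
    return None
```

Because paragraphs outside the noun section are never inspected, an earlier paragraph that merely happens to contain "rod " no longer affects the result.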