ilo-token / ilo-token.github.io

A rule-based Toki Pona to English translator; currently a work in progress!
https://ilo-token.github.io/
MIT License

Scrape wiktionary #40

Closed (neverRare closed 2 months ago)

neverRare commented 3 months ago

Currently, we manually encode conjugations and other properties in the dictionary. Perhaps we could instead make this semi-automatic by scraping Wiktionary to fetch conjugations and other properties.

This solution is becoming attractive because of how complicated verb conjugations are (#39).

We would still need to manually encode other properties, such as what kind of adjective a word is, for adjective ordering (#17).
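
To make the split between fetched and hand-written data concrete, here is a minimal sketch. Every name in it (`ManualProperties`, `FetchedConjugation`, `DictionaryEntry`, `buildEntry`, and the list of adjective kinds) is hypothetical and does not reflect ilo-token's actual dictionary format; it only illustrates the idea of merging the two sources.

```typescript
// All names below are hypothetical; they are not ilo-token's real types.
// The adjective kinds are an illustrative subset, not the project's list.
type AdjectiveKind = "opinion" | "size" | "age" | "color" | "material";

// Properties that would still be encoded by hand.
interface ManualProperties {
  adjectiveKind?: AdjectiveKind;
}

// Properties that could be fetched automatically.
interface FetchedConjugation {
  presentSingular: string;
  past: string;
  pastParticiple: string;
  gerund: string;
}

interface DictionaryEntry extends ManualProperties {
  word: string;
  conjugation?: FetchedConjugation;
}

// Combine hand-written properties with fetched ones into one entry.
function buildEntry(
  word: string,
  manual: ManualProperties,
  fetched?: FetchedConjugation,
): DictionaryEntry {
  const entry: DictionaryEntry = { word };
  if (manual.adjectiveKind !== undefined) {
    entry.adjectiveKind = manual.adjectiveKind;
  }
  if (fetched !== undefined) {
    entry.conjugation = fetched;
  }
  return entry;
}
```

With this shape, a verb entry would carry only fetched data and an adjective entry only manual data, for example `buildEntry("good", { adjectiveKind: "opinion" })`.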

Potential additional use: scraping pronunciations to find out whether a word starts with a vowel or consonant sound, which determines whether to use the article "a" or "an".
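
As a rough illustration of the article idea, the sketch below picks "a" or "an" from an IPA transcription. The function name `indefiniteArticle`, the input format (an IPA string such as `/ˈaʊ.ɚ/`), and the vowel list are all assumptions made for this example; the vowel list is not exhaustive, and nothing here is existing ilo-token code.

```typescript
// Hypothetical helper, not part of ilo-token.
// Assumed input: an IPA transcription, optionally wrapped in /.../ or [...].
// The vowel list is a non-exhaustive set of common English vowel symbols.
const IPA_VOWELS = "aeiouæɑɒɔəɛɜɪʊʌɐɚɝ";

function indefiniteArticle(ipa: string): "a" | "an" {
  // Drop delimiters, stress marks, syllable dots, and whitespace
  // so that the first remaining character is the first sound.
  const sounds = ipa.replace(/[\/\[\]ˈˌ.\s]/g, "");
  const first = sounds.charAt(0);
  if (first === "") {
    return "a"; // No sound found; fall back to "a".
  }
  return IPA_VOWELS.indexOf(first) >= 0 ? "an" : "a";
}

indefiniteArticle("/ˈaʊ.ɚ/"); // "an" ("hour" is spelled with a consonant but starts with a vowel sound)
indefiniteArticle("/ˈjuːnɪkɔːn/"); // "a" ("unicorn" is spelled with a vowel but starts with a consonant sound)
```

Working from the pronunciation rather than the spelling is what handles words like "hour" and "unicorn" correctly.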

neverRare commented 2 months ago

Thanks to jan Kita, we don't need to scrape Wiktionary. Instead, we'll simply use https://github.com/spencermountain/compromise