AceCentre / pasco

Phrase Auditory Scanning COmmunicator - AAC App for iOS and the Web
https://app.pasco.chat
GNU General Public License v3.0

Improve word prediction for a range of languages #251

Open willwade opened 3 years ago

willwade commented 3 years ago

We should improve our prediction engine across multiple languages.

General thoughts

Although we should look at using #252, we currently have our own lookup engine, and it doesn't support many languages.

This tool, https://github.com/LuminosoInsight/wordfreq, may help us generate a top_n_list, which would help a lot.

```python
from wordfreq import top_n_list

top_n_list('en', 10)
```
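To show why a frequency-ordered list helps prediction, here is a minimal sketch of prefix lookup over such a list. It is not pasco's actual lookup engine: `predict` and `sample` are hypothetical names, and the sample words are hard-coded rather than real wordfreq output. It assumes the list is already ordered from most to least frequent.

```python
def predict(prefix, ranked_words, limit=5):
    """Return up to `limit` words starting with `prefix`, most frequent first.

    `ranked_words` must already be ordered from most to least frequent.
    """
    prefix = prefix.lower()
    matches = []
    for word in ranked_words:
        if word.startswith(prefix):
            matches.append(word)
            # Stop early: the list is ranked, so later matches are less frequent.
            if len(matches) == limit:
                break
    return matches


# Hard-coded sample standing in for a frequency-ranked list; not real wordfreq output.
sample = ["the", "to", "and", "of", "that", "this", "there", "they"]
print(predict("th", sample, limit=3))
```

With wordfreq installed, `sample` could be replaced by the result of `top_n_list('en', 1000)`.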

willwade commented 2 years ago

For the Hebrew frequency list I threw together this horrible sight:

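```python
from wordfreq import top_n_list

# Print one {"v": word, "w": weight} entry per word, most frequent first.
# Weights count down from 1002, so more frequent words get higher weights.
weight = 1002
for word in top_n_list('he', 1000, wordlist='best'):
    print("  {")
    print(f'    "v": "{word}",')
    print(f'    "w": "{weight}",')
    print("  },")
    weight -= 1
```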
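A possibly less horrible variant is to build the entries as Python dicts and let the `json` module do the quoting and commas. This is only a sketch: `build_entries` is a hypothetical helper, the three Hebrew words are placeholders rather than actual wordfreq output, and keeping `"w"` as a string just mirrors the snippet above (whether pasco requires strings is an assumption).

```python
import json


def build_entries(words, top_weight=None):
    """Map frequency-ranked words to {"v": word, "w": weight} entries.

    Weights count down by one per word, so the first (most frequent)
    word gets the highest weight. Weights are emitted as strings to
    match the hand-written snippet; that format is an assumption.
    """
    if top_weight is None:
        top_weight = len(words)
    return [
        {"v": word, "w": str(top_weight - rank)}
        for rank, word in enumerate(words)
    ]


# Placeholder words standing in for top_n_list('he', 1000, wordlist='best').
entries = build_entries(["של", "את", "על"], top_weight=1002)

# ensure_ascii=False keeps the Hebrew text readable instead of \u escapes.
print(json.dumps(entries, ensure_ascii=False, indent=2))
```

Because `json.dumps` handles escaping and separators, the output is valid JSON with no trailing commas to clean up by hand.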