General discussion - Githubissues

sailfish-keyboard / sailfishos-presage-predictor

Presage based input predictor for the Sailfish OS

https://openrepos.net/content/sailfishkeyboard/maliit-plugin-presage

GNU General Public License v3.0

General discussion #16

Closed rinigus closed 6 years ago

rinigus commented 6 years ago

I am opening this issue for general discussion and updates/notes. Please feel free to close it at any time or move it to some other channel, whichever works best. Until now we have been using the 'English keyboard' issue for such discussion; maybe it's better to have something more dedicated.

As a heads up, after build environment adjustments, I have updated the presage library at https://github.com/rinigus/presage to:

- convert strings to double using the C locale. This is done by specifying the locale only for the conversion step.
- add an empty database and configure presage to use it on SFOS install. This allows the predictor to start with the default settings and later switch the database according to the language in use. The solution used earlier was causing crashes on desktop.
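
The first change can be illustrated with a minimal sketch. This is not the code from the presage repository; the helper name `to_double_c_locale` is hypothetical, and it assumes the string is parsed through a stream. Only that stream is imbued with the classic "C" locale, so the global locale stays untouched:

```cpp
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of presage): parse a double using the
// "C" locale, regardless of what the global locale is set to.
double to_double_c_locale(const std::string& text) {
    std::istringstream in(text);
    // The locale is set on this stream only, i.e. for the conversion step.
    in.imbue(std::locale::classic());
    double value = 0.0;
    if (!(in >> value)) {
        throw std::invalid_argument("not a number: " + text);
    }
    return value;
}
```

With this approach, `to_double_c_locale("0.25")` yields 0.25 even when the global locale uses a comma as the decimal separator.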

I have worked a bit on improving the performance, with about a 20% gain so far. I have plans to improve it further by using another database format; let's see if it works.

I will now read the plugin code and will probably make some small adjustments. I will submit a PR when ready.

martonmiklos commented 6 years ago

Well, it turned out to be super easy: https://help.github.com/articles/transferring-a-repository-owned-by-your-personal-account/#transferring-to-an-organization

rinigus commented 6 years ago

Awesome! I will transfer tonight as well. Thank you for getting it organized!!!

rinigus commented 6 years ago

That was simple indeed! Thanks for setting it up, presage is transferred as well.

ljo commented 6 years ago

@rinigus I managed to install the hunspell-enabled version at Saturday lunchtime, just after it became available in OBS, but then got some kind of flu. I have been typing and experimenting since then, though, and it works almost perfectly in creating suggestions, especially once you get used to the suggestions changing even in first and second position. @martonmiklos great that you got the organisation in place. I will try to read up on the comments some more.

rinigus commented 6 years ago

@ljo - great to hear that hunspell is working :) . @martonmiklos, any success with starting the tests?

I will look into cleaning up the English n-gram database. Swear words are rather prominent in it, and it would be good to filter them out. I expect to have it done tonight.

PR acceptance and the release plans? I would vote for releasing as it is with hunspell and get the ball rolling with getting more languages/users in.

martonmiklos commented 6 years ago

> any success with starting the tests?

Not yet, I have not had time to install it.

> PR acceptance and the release plans? I would vote for releasing as it is with hunspell and get the ball rolling with getting more languages/users in.

Let's aim for PR review this weekend, and release next weekend with mariadb+async+hunspell. I will also try to find a better corpus for Hungarian in the meantime, but that is just a bonus.

rinigus commented 6 years ago

> Let's aim for PR review this weekend, and release next weekend with mariadb+async+hunspell. I will also try to find a better corpus for Hungarian in the meantime, but that is just a bonus.

Thanks, then we have a timeline :)

rinigus commented 6 years ago

@ljo: what was the English corpus that you used? I used OpenSubtitles and OANC and, as a result of the subtitles, get many profanity hits in n-grams. So, I wonder whether you cleaned your corpus somehow or maybe used something different?

ljo commented 6 years ago

@rinigus Currently in Helsinki until Friday evening. I used a selection from our ASPAC English corpus combined with some small texts with more contractions. Should I redo it now that we can use a larger n-gram database?

rinigus commented 6 years ago

@ljo, no problem, I think I found a reasonable filter and will test that. I'll try to generate the database using such a corpus. I'll ask for help if I get into trouble.
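
For readers wondering what such a filter might look like, here is a minimal sketch. It is an assumption, not the filter rinigus used: it simply drops every n-gram that contains a token from a blocklist. The names `NGram` and `filter_ngrams` and the blocklist contents are hypothetical.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical representation: an n-gram is a sequence of tokens.
using NGram = std::vector<std::string>;

// Return only those n-grams in which no token appears in the blocklist.
std::vector<NGram> filter_ngrams(
        const std::vector<NGram>& ngrams,
        const std::unordered_set<std::string>& blocklist) {
    std::vector<NGram> kept;
    for (const NGram& ngram : ngrams) {
        bool blocked = false;
        for (const std::string& token : ngram) {
            if (blocklist.count(token) > 0) {
                blocked = true;
                break;
            }
        }
        if (!blocked) {
            kept.push_back(ngram);
        }
    }
    return kept;
}
```

Dropping the whole n-gram, rather than only the offending token, avoids leaving behind shortened sequences that never occurred in the corpus.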

rinigus commented 6 years ago

@ljo, I think I managed to clean up the English dictionary, as far as I have tested. I'll package the scripts and push them to presage under utils, probably tonight.

I have seen that you packaged Swedish keyboards in OpenRepos - great! I would suggest also adding the corresponding QML files and making the file names match, as done for English and Estonian in my OBS. While these are plain copies of the ones provided by arrowboard (in your case), this will allow selecting only Presage keyboards in Settings and will behave well when switching languages.

martonmiklos commented 6 years ago

Hah! Github has a feature called teams so I have created one: https://github.com/orgs/sailfish-keyboard/teams/development-team/discussions

It might make sense to continue the general, non-issue-related (e.g. release-related) discussions there :)

rinigus commented 6 years ago

Excellent, should we close this issue then? :)

martonmiklos commented 6 years ago

Discussion will be continued here:

https://github.com/orgs/sailfish-keyboard/teams/development-team/discussions/1