voikko / corevoikko

Libvoikko and essential linguistic resources

NDK Build for Android #50

Closed Mukhammadsaid19 closed 2 years ago

Mukhammadsaid19 commented 2 years ago

Hello, dear developers! My name is Mukhammadsaid, I am a third-year student from Inha University in Tashkent, Uzbekistan.

I've been developing a morphological analyzer and spellchecker for Uzbek, the second most spoken agglutinative Turkic language. My spellchecker is nearly ready to deploy. It is in .zhfst format, so I would like to use it on Android through your library. Your project is very helpful to me (in fact, I have been using Finnish resources all along; I learned all about FSTs and finite-state algebra from them).

I looked at the droidvoikko implementation, but it is old and doesn't take into account updates in the core repo. Also, I'm a bit of a beginner with the Android NDK and JNI overall, so could you give me any hints on how to make it work? Android currently supports CMake as its native build tool, but I'm still stuck with it. I would appreciate any help! Thank you!

hatapitk commented 2 years ago

Hi!

Unfortunately I cannot help you much with Android development. I developed droidvoikko as an experiment with Android programming but stopped working on it when I found out that Samsung (the most popular Android manufacturer among Finnish users at that time) had disabled the possibility of using custom spell checkers on their phones. I don't know whether things have changed during the past 10 years. Apart from droidvoikko I have never done low-level Android development, so my knowledge of that subject is totally outdated, sorry.

If you decide to try to update droidvoikko to work with the latest versions of Android, I believe you can use the latest corevoikko repository with it. The native library interface has only had backwards-compatible changes, so the old JNI code in droidvoikko will probably still work. But other than that I don't know what has changed on the Android side.

Mukhammadsaid19 commented 2 years ago

Thank you for your quick response! I was able to build voikko and port it to Android. However, I encountered one issue. When I enter the same input, or an input sharing a prefix with an earlier one, it is not analyzed and corrected the second time (the correction array is empty). Is this done on purpose? If so, how can I make use of it or disable it? I assume it is a feature of ospell, because it behaves the same way. I couldn't find anything about it in the documentation. Thanks in advance.

hatapitk commented 2 years ago

That sounds like a bug. Or perhaps an incompatible change in hfst-ospell. Can you share a zhfst dictionary and give some examples of expected vs. actual results you are seeing? I can try to reproduce this and find out if it can be fixed in libvoikko.
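
One way to collect such expected vs. actual results is a small script that asks for suggestions several times per word and reports the words whose answers change between calls. The following is only a sketch: `check_repeatable` and `FakeSpeller` are hypothetical names, and `FakeSpeller` is a stand-in that imitates the reported symptom so the script runs without any dictionary installed. With a real speller, any callable that maps a word to a list of suggestions could be passed instead (for example the `suggest` method of the libvoikko Python bindings, assuming they are installed and a matching dictionary is available).

```python
from typing import Callable, Dict, List


def check_repeatable(suggest: Callable[[str], List[str]],
                     words: List[str],
                     repeats: int = 2) -> Dict[str, List[List[str]]]:
    """Call suggest() `repeats` times per word and return the words
    whose suggestion lists differ between calls."""
    unstable: Dict[str, List[List[str]]] = {}
    for word in words:
        results = [list(suggest(word)) for _ in range(repeats)]
        if any(r != results[0] for r in results[1:]):
            unstable[word] = results
    return unstable


class FakeSpeller:
    """Hypothetical stand-in, NOT a real speller: once a word has been
    queried, that word and its prefixes return no suggestions."""

    def __init__(self, table: Dict[str, List[str]]) -> None:
        self._table = table
        self._seen: List[str] = []

    def suggest(self, word: str) -> List[str]:
        # Imitate the symptom: earlier queries suppress later results.
        if any(prev.startswith(word) for prev in self._seen):
            return []
        self._seen.append(word)
        return list(self._table.get(word, []))


if __name__ == "__main__":
    speller = FakeSpeller({"boyy": ["boy", "body"]})
    for word, runs in check_repeatable(speller.suggest, ["boyy"]).items():
        print(word, runs)
```

A word that shows up in the returned mapping, together with its per-call suggestion lists, is the kind of concrete example that makes the problem easy to reproduce.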

Mukhammadsaid19 commented 2 years ago

I used the English dictionary speller-en.zhfst from the hfst sources.

I entered the misspelled word 'boyy' and got corrections. However, on the second try, there were no corrections. In subsequent attempts, prefixes of my word are also flagged as wrong, but no corrections are returned for them.

I have checked this with the latest release of hfst-ospell itself. It behaves the same way, so the problem is probably in hfst-ospell.

P.S. Which version of hfst-ospell is stable and compatible with voikko?

hatapitk commented 2 years ago

hfst-ospell 0.5.0 is the latest version I have tested myself (Finnish dictionaries do not use it, so I rarely test it). But in theory all newer releases should work. Anyway, I took a look, and there is a change made in hfst-ospell just a few days ago that "disables caching":

https://github.com/hfst/hfst-ospell/commit/a8d77894df14fadb9d68c3b1c090661f824d50f8

Could that be related? You could try building the code from hfst-ospell master to see if that fixes anything.

Mukhammadsaid19 commented 2 years ago

Yes, that fixed it! That's interesting. Why would they disable caching? That's strange.