Hyphenator optimizations and patterns for the German language

mfietz commented 9 years ago

I checked that the current implementation and my optimized one yield the same results by running both against the 235,886 words from /usr/share/dict/words.

In non-representative benchmarks on a local JVM, my code was around 4 times faster than the current implementation. Memory consumption should also be lower.
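For illustration, an equivalence check of this kind can be sketched as below. This is a minimal sketch, not the code used for the PR: the two `Function` arguments and the `chunk` helper are hypothetical stand-ins for the old and new hyphenators, whose real API is not shown in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class HyphenatorEquivalenceCheck {

    /** Returns every word for which the two implementations produce different output. */
    public static List<String> findMismatches(List<String> words,
                                              Function<String, List<String>> reference,
                                              Function<String, List<String>> candidate) {
        List<String> mismatches = new ArrayList<>();
        for (String word : words) {
            if (!reference.apply(word).equals(candidate.apply(word))) {
                mismatches.add(word);
            }
        }
        return mismatches;
    }

    /** Hypothetical stand-in for a hyphenator: splits a word into fixed-size chunks. */
    public static List<String> chunk(String word, int size) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < word.length(); i += size) {
            parts.add(word.substring(i, Math.min(word.length(), i + size)));
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        Path dict = Paths.get("/usr/share/dict/words");
        // Fall back to a tiny word list when the dictionary file is not present.
        List<String> words = Files.exists(dict)
                ? Files.readAllLines(dict)
                : Arrays.asList("hyphenation", "example");

        // Assumption: replace these with calls into the old and the new hyphenator.
        Function<String, List<String>> reference = w -> chunk(w, 3);
        Function<String, List<String>> candidate = w -> chunk(w, 3);

        List<String> mismatches = findMismatches(words, reference, candidate);
        System.out.println(mismatches.size() + " mismatches in " + words.size() + " words");
    }
}
```

An empty mismatch list only shows agreement on the tested word list; it says nothing about inputs that are not in the dictionary.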

mathew-kurian commented 9 years ago

Thanks! But before I merge, can you do a comparison of the load times comparing the existing solution with yours (presuming the "4x" is actual hyphenation)? What about memory consumption? In general, it would be great if you explain your changes a bit more in detail?

mfietz commented 9 years ago

> Can you do a comparison of the load times comparing the existing solution with yours

Of course. But how exactly would I do that - via HyphenatedTest? Besides, I would be pretty astonished if the new code used more memory. I only removed data structures or replaced them with ones that have a smaller memory footprint.

> presuming the "4x" is actual hyphenation

I measured the combined time for loading the hyphenator and hyphenating the words, so I did not distinguish between the two steps. (I can post the code in a gist if you like. It is not very sophisticated, but it should prove my point.)
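To separate the two steps, the load phase and the hyphenation phase can be timed independently. The following is a rough sketch only: `loadPatterns` and `hyphenate` are hypothetical stand-ins (a map build and a substring lookup), not the library's methods, and a single `System.nanoTime()` measurement on a JVM is noisy, so a harness such as JMH would give more trustworthy numbers.

```java
import java.util.HashMap;
import java.util.Map;

final class LoadVsHyphenateTimer {

    /** Hypothetical stand-in for loading: builds a pattern table in memory. */
    public static Map<String, Integer> loadPatterns() {
        Map<String, Integer> patterns = new HashMap<>();
        for (int i = 0; i < 50_000; i++) {
            patterns.put("p" + i, i % 9);
        }
        return patterns;
    }

    /** Hypothetical stand-in for hyphenation: counts two-character substrings found in the table. */
    public static int hyphenate(Map<String, Integer> patterns, String word) {
        int hits = 0;
        for (int i = 0; i + 2 <= word.length(); i++) {
            if (patterns.containsKey(word.substring(i, i + 2))) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        long beforeLoad = System.nanoTime();
        Map<String, Integer> patterns = loadPatterns();
        long afterLoad = System.nanoTime();

        // Accumulate the results so the JIT cannot discard the loop as dead code.
        long sink = 0;
        for (int i = 0; i < 100_000; i++) {
            sink += hyphenate(patterns, "p" + (i % 100));
        }
        long afterHyphenate = System.nanoTime();

        System.out.printf("load: %d ms, hyphenate: %d ms (sink=%d)%n",
                (afterLoad - beforeLoad) / 1_000_000,
                (afterHyphenate - afterLoad) / 1_000_000,
                sink);
    }
}
```

Running the same two measurements against the old and the new code would show whether the speedup comes from loading, from hyphenation, or from both.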

> In general, it would be great if you explain your changes a bit more in detail?

The algorithm is exactly the same. The difference is that, for example, I use `int[]` instead of `List` where an array suffices. There were also several other data structures that were completely unnecessary. If you have questions about any code changes that are unclear, I'm happy to explain them.
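As a generic illustration of that kind of change (not the actual diff), the sketch below contrasts a boxed list with a primitive array. It assumes the list in question held `Integer` values; the class and method names are hypothetical. A `List<Integer>` stores a reference per element to a boxed `Integer` object, while an `int[]` stores the values directly in one contiguous block, which is why the array variant usually needs less memory and avoids unboxing on every read.

```java
import java.util.Arrays;
import java.util.List;

final class ArrayVersusList {

    /** Boxed variant: each element is an Integer object that is unboxed on access. */
    public static int sumBoxed(List<Integer> points) {
        int sum = 0;
        for (Integer point : points) {
            sum += point; // implicit unboxing
        }
        return sum;
    }

    /** Primitive variant: values are read straight from the array. */
    public static int sumPrimitive(int[] points) {
        int sum = 0;
        for (int point : points) {
            sum += point;
        }
        return sum;
    }

    /** Converts a boxed list to a primitive array once, e.g. at load time. */
    public static int[] toIntArray(List<Integer> points) {
        int[] result = new int[points.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = points.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> boxed = Arrays.asList(0, 1, 0, 3);
        int[] primitive = toIntArray(boxed);
        // Both variants compute the same value; only the storage differs.
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitive));
    }
}
```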

mathew-kurian commented 9 years ago

Thank you for the explanation. Merged.

mathew-kurian / TextJustify-Android

Hyphenator optimizations and patterns for the German language #97