maTayefi / spellbook-dictionary

Automatically exported from code.google.com/p/spellbook-dictionary

Improve spellchecker startup time and memory consumption #50

Closed by GoogleCodeExporter 8 years ago

GoogleCodeExporter commented 8 years ago
The spellchecker starts slowly and consumes a lot of memory. We should be
able to run Spellbook with a heap size of about 40-50MB at most. To
improve the startup time I'd recommend loading the rank entries in the
background. Maybe you could also make use of JIDE overlays, showing a
busy icon in the text pane while the rank entries are loading; have a
look here for examples. Btw, center the spellchecker over the main
Spellbook frame.

Original issue reported on code.google.com by lord...@gmail.com on 17 May 2010 at 8:45
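
The background loading suggested above could be sketched with a SwingWorker.
This is only a rough illustration, not the project's actual code: the class
name RankEntryLoader, the "word rank" line format, and the onLoaded callback
are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.function.Consumer;
import javax.swing.SwingWorker;

// Hypothetical loader: parses "word rank" lines off the event dispatch thread.
class RankEntryLoader extends SwingWorker<Map<String, Integer>, Void> {
    private final Reader source;
    private final Consumer<Map<String, Integer>> onLoaded;

    RankEntryLoader(Reader source, Consumer<Map<String, Integer>> onLoaded) {
        this.source = source;
        this.onLoaded = onLoaded;
    }

    // Runs on a worker thread, so the UI stays responsive while loading.
    @Override
    protected Map<String, Integer> doInBackground() throws IOException {
        Map<String, Integer> ranks = new HashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length == 2) {
                    ranks.put(parts[0], Integer.parseInt(parts[1]));
                }
            }
        }
        return ranks;
    }

    // Runs on the event dispatch thread once loading has finished;
    // this is where a busy overlay could be hidden.
    @Override
    protected void done() {
        if (isCancelled()) {
            return;
        }
        try {
            onLoaded.accept(get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            // Loading failed; a real application would report this to the user.
        }
    }
}
```

The caller would show the busy overlay, call execute() on the loader, and hide
the overlay in the callback. For the centering request, the standard
Window.setLocationRelativeTo(parent) method positions a dialog over the given
parent frame.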

GoogleCodeExporter commented 8 years ago
It loads the words in the background and has a nice overlay now. Regarding
the memory consumption, on my machine a fresh start uses 100MB (at least
that's what the memory button says).

Original comment by iivalchev@gmail.com on 17 May 2010 at 5:47

GoogleCodeExporter commented 8 years ago
Without a heap limit set, this is normal. If you set the limit (Xmx) to,
say, 40M, Spellbook starts up using about 25M, but when I start the
spellchecker I immediately get an out-of-memory error. I personally don't
care that much about the memory footprint since I have a lot of RAM, but
most users do :-) So it would be nice if the memory usage really got
optimized...

Original comment by lord...@gmail.com on 17 May 2010 at 7:42

GoogleCodeExporter commented 8 years ago
Keep in mind that the version of Spellbook that we ship imposes a 40M heap
size limit, so the spellchecker basically does not work for anyone. I've
already got several complaints from users about the problem, but I'm not
willing to increase the heap size to 100M only to accommodate the
spellchecker's huge memory requirements... Maybe a more efficient algorithm
should be sought? After all, Norvig's is not production grade, but more
like toy grade.

Original comment by lord...@gmail.com on 22 May 2010 at 4:22

GoogleCodeExporter commented 8 years ago
I am just looking at Jazzy, and it seems to have an implementation that
doesn't cache any words in memory. It uses aspell, but the project seems to
be dead; the last release dates from 2005.

Original comment by iivalchev@gmail.com on 23 May 2010 at 2:37

GoogleCodeExporter commented 8 years ago
I'd suggest you investigate http://hunspell.sourceforge.net/ instead. It
mentions some Java ports/interfaces and it's the de facto standard spell
checker... In the meantime, please add a check: if there is not enough heap
memory to start the spell checker, issue a warning to the user and don't
start it at all.

Original comment by lord...@gmail.com on 24 May 2010 at 1:07
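
A check like the one requested above might look roughly as follows. This is a
sketch under assumptions, not the fix that was committed: the class name
HeapCheck is invented, and the 100MB requirement is simply the figure reported
earlier in this thread.

```java
// Hypothetical helper for deciding whether the spell checker may start.
class HeapCheck {
    // Assumed requirement, taken from the ~100MB observed earlier in the thread.
    static final long REQUIRED_BYTES = 100L * 1024 * 1024;

    // Heap that can still be claimed: the maximum minus what is currently in use.
    static long availableHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        // maxMemory() returns Long.MAX_VALUE when no limit applies.
        return rt.maxMemory() - used;
    }

    // Pure comparison, kept separate so it can be tested without a real JVM limit.
    static boolean hasEnoughHeap(long availableBytes, long requiredBytes) {
        return availableBytes >= requiredBytes;
    }

    static boolean canStartSpellChecker() {
        return hasEnoughHeap(availableHeapBytes(), REQUIRED_BYTES);
    }
}
```

If canStartSpellChecker() returns false, the application could show a warning
(for example with JOptionPane.showMessageDialog) and skip opening the spell
checker window.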

GoogleCodeExporter commented 8 years ago
I've just made it work with hunspell, but it's a little ugly because I've
added the jars as system-scoped dependencies in the pom. Is there another
way to add jars to the classpath with Maven? And what path should be used
when accessing files in the resources dir?

Original comment by iivalchev@gmail.com on 26 May 2010 at 2:47

GoogleCodeExporter commented 8 years ago
I'll have a look at your fix; we might have to create our own Maven repo
somewhere to host artifacts such as this. The other way is to copy them by
hand to your local repository and use them like regular dependencies.
You'll have to create some pseudo pom files for them, though.
As for the resources: if something is in resources/some/thing, the path to
it for, say, getResourceAsStream would be "/some/thing". Btw, how much does
using hunspell reduce the memory footprint?

Original comment by lord...@gmail.com on 26 May 2010 at 3:55
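
The resource path rule described above can be illustrated with a small sketch.
The class name ResourceLookup and both method names are made up for the
example; only Class.getResourceAsStream is the real API.

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

// Hypothetical helper showing how classpath resource paths are written.
class ResourceLookup {
    // With Class.getResourceAsStream, a leading "/" means "from the classpath
    // root", so resources/some/thing is opened as "/some/thing".
    static InputStream open(String absolutePath) throws FileNotFoundException {
        InputStream in = ResourceLookup.class.getResourceAsStream(absolutePath);
        if (in == null) {
            // getResourceAsStream signals a missing resource with null.
            throw new FileNotFoundException("Not on classpath: " + absolutePath);
        }
        return in;
    }

    // ClassLoader.getResourceAsStream expects the same path without the
    // leading slash, e.g. "some/thing".
    static String toClassLoaderPath(String absolutePath) {
        return absolutePath.startsWith("/") ? absolutePath.substring(1) : absolutePath;
    }
}
```

Without the leading slash, Class.getResourceAsStream resolves the name
relative to the package of the class it is called on, which is a common source
of unexpected null results.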

GoogleCodeExporter commented 8 years ago
Btw, have a look here for the local file-based repo:
http://stackoverflow.com/questions/2229757/maven-add-a-dependency-to-a-jar-by-relative-path/2230464#2230464

Original comment by lord...@gmail.com on 27 May 2010 at 5:33

GoogleCodeExporter commented 8 years ago
About the memory footprint: it works perfectly within the 40M heap limit
and does not have any observable memory consumption. Please check whether
everything is fine with the added repository.

Original comment by iivalchev@gmail.com on 28 May 2010 at 3:58

GoogleCodeExporter commented 8 years ago
I've tested it and it's OK. However, I moved the repo up the project
hierarchy and renamed it. I also made it accessible via HTTP directly from
our svn repo, since that's a bit more standard. We should put all external
artifacts that are not already available in public repos there. I consider
this issue finally resolved :-)

Original comment by lord...@gmail.com on 28 May 2010 at 9:03