r-bishop / bpbible

Automatically exported from code.google.com/p/bpbible

module indexing fails with pystemmer installed (for Russian bibles) #145

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Whenever I try to index a Russian Bible (for example RST, http://www.crosswire.org/sword/modules/ModInfo.jsp?modName=RST), I get this error:

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/wx-2.8-gtk2-unicode/wx/_core.py", line 14614, in <lambda>
    lambda event: event.callable(*event.args, **event.kw) )
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/searchpanel.py", line 248, in on_show
    self.check_for_index()
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/searchpanel.py", line 372, in check_for_index
    self.build_index(self.version)
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/searchpanel.py", line 971, in build_index
    self.index = self.index_type(version, callback)
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/index.py", line 113, in __init__
    self.init(progress)
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/index.py", line 125, in init
    self.GenerateIndex(self.version, progress)
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/index.py", line 197, in GenerateIndex
    self.GatherStatistics()
  File "/home/andreas/Apps/bpbible-0.4.5-r978/search/index.py", line 242, in GatherStatistics
    print word
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

The same happens with another Russian Bible: http://www.crosswire.org/sword/modules/ModInfo.jsp?modName=RusVZh

As a workaround, I can uninstall PyStemmer, build the index, and then install PyStemmer again, but of course stemming then doesn't work. PyStemmer does seem to support Russian, doesn't it? http://pypi.python.org/pypi/PyStemmer/1.0.1

What version of the product are you using? On what operating system?
Linux, r978

Original issue reported on code.google.com by war...@gmail.com on 14 Apr 2010 at 6:47
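
The traceback above ends at a `print word` statement, so the failure looks like a Cyrillic word being written to an output stream that only accepts ASCII, rather than a problem in the stemmer itself. The sketch below reproduces that kind of error in isolation. It is only an illustration under assumptions: it uses Python 3 syntax (the report is from Python 2.6), the word `слово` and the helper `print_to_ascii_stream` are made up for the example, and nothing here is taken from the actual change in r979.

```python
import io


def print_to_ascii_stream(word, errors="strict"):
    """Print `word` to an in-memory stream that encodes text as ASCII.

    Hypothetical helper for illustration; it stands in for a terminal or
    pipe whose encoding is ASCII. Returns the text that reached the stream,
    or a short failure message if encoding was refused.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", errors=errors, newline="\n")
    try:
        print(word, file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return "failed: " + exc.reason
    return raw.getvalue().decode("ascii")


word = "слово"  # any five-letter Cyrillic word; not taken from the module

# Strict encoding refuses the non-ASCII characters, as in the report.
strict_result = print_to_ascii_stream(word)

# Escaping unencodable characters lets the debug output through.
escaped_result = print_to_ascii_stream(word, errors="backslashreplace")

print(strict_result)
print(escaped_result)
```

Under that reading, either removing the debug `print` or making the output tolerant of non-ASCII text would avoid the crash without uninstalling PyStemmer.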

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r979.

Original comment by benpmor...@gmail.com on 14 Apr 2010 at 1:21

GoogleCodeExporter commented 9 years ago

Original comment by jonmmor...@gmail.com on 22 Apr 2010 at 12:52

GoogleCodeExporter commented 9 years ago

Original comment by jonmmor...@gmail.com on 8 May 2010 at 2:13