mchaput / whoosh

Pure-Python full-text search library

Unable to use character folding #41

Open richin13 opened 1 year ago

richin13 commented 1 year ago

Description

First of all thanks for maintaining this useful library 👍🏼

I was trying to follow the docs on character folding but so far I've been unable to get it to work. The error I'm getting is this:

Traceback (most recent call last):
  File "/home/ricardo/src/sandbox/padron-parser/repro.py", line 12, in <module>
    writer.add_document(name=u"René Descartes")
  File "/home/ricardo/src/sandbox/padron-parser/.venv/lib/python3.11/site-packages/whoosh/writing.py", line 750, in add_document
    for tbytes, freq, weight, vbytes in items:
  File "/home/ricardo/src/sandbox/padron-parser/.venv/lib/python3.11/site-packages/whoosh/fields.py", line 164, in index
    for tstring, freq, wt, vbytes in word_values(value, ana, **kwargs):
  File "/home/ricardo/src/sandbox/padron-parser/.venv/lib/python3.11/site-packages/whoosh/formats.py", line 223, in word_values
    for t in tokens(value, analyzer, kwargs):
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ricardo/src/sandbox/padron-parser/.venv/lib/python3.11/site-packages/whoosh/formats.py", line 125, in tokens
    gen = analyzer(value, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: CharsetFilter.__call__() got an unexpected keyword argument 'mode'

Here's a minimal reproducible example:

from whoosh import analysis, fields, index
from whoosh.support.charset import accent_map

analyzer = analysis.CharsetFilter(accent_map)
index_path = "my_index"
schema = fields.Schema(
    name=fields.TEXT(analyzer=analyzer, stored=True),
)

ix = index.create_in(index_path, schema)
writer = ix.writer()
writer.add_document(name=u"René Descartes")
writer.add_document(name=u"Ñame Frito")
writer.commit()

I tried manually removing the mode item from the kwargs dict passed at formats.py:125, but then got a similar error, this time KeyError: positions.

Env details