When a user makes a typo in a numeric field of their query:
  File "C:\Users\Chris\AppData\Local\Python\venv\document-search\Lib\site-packages\whoosh\searching.py", line 931, in correct_query
    return sqc.correct_query(q, qstring)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\AppData\Local\Python\venv\document-search\Lib\site-packages\whoosh\spelling.py", line 327, in correct_query
    sugs = c.suggest(token.text, prefix=prefix, maxdist=maxdist)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\AppData\Local\Python\venv\document-search\Lib\site-packages\whoosh\spelling.py", line 66, in suggest
    for item in _suggestions(text, maxdist, prefix):
  File "C:\Users\Chris\AppData\Local\Python\venv\document-search\Lib\site-packages\whoosh\spelling.py", line 111, in _suggestions
    for sug in reader.terms_within(sugfield, text, maxdist, prefix=prefix):
  File "C:\Users\Chris\AppData\Local\Python\venv\document-search\Lib\site-packages\whoosh\codec\base.py", line 364, in find_matches
    match = dfa.next_valid_string(term)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\AppData\Local\Python\venv\document-search\Lib\site-packages\whoosh\automata\fsa.py", line 267, in next_valid_string
    for i, label in enumerate(string):
                    ^^^^^^^^^^^^^^^^^
TypeError: 'int' object is not iterable
# Fill in default corrector objects for fields that don't have a custom
# one in the "correctors" dictionary
from whoosh.fields import TEXT  # <-----
for fieldname, field in self.schema.items():  # <-----
    fieldname = aliases.get(fieldname, fieldname)
    if isinstance(field, TEXT) and fieldname not in correctors:  # <-----
        correctors[fieldname] = self.reader().corrector(fieldname)
Anyway that's the fix I'm using for now. I only need corrections for text fields.
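The filtering idea can also be tried outside Whoosh. This is only a rough sketch: `Text` and `Numeric` are hypothetical stand-ins for whoosh.fields.TEXT and whoosh.fields.NUMERIC, and `fields_needing_default_corrector` is a made-up helper name, not part of the library.

```python
class Text:
    """Hypothetical stand-in for whoosh.fields.TEXT."""


class Numeric:
    """Hypothetical stand-in for whoosh.fields.NUMERIC."""


def fields_needing_default_corrector(schema_items, text_type,
                                     correctors=None, aliases=None):
    """Return the names of text fields that have no custom corrector yet."""
    correctors = correctors or {}
    aliases = aliases or {}
    names = []
    for fieldname, field in schema_items:
        fieldname = aliases.get(fieldname, fieldname)
        # Skip anything that is not a text field (e.g. numeric fields).
        if isinstance(field, text_type) and fieldname not in correctors:
            names.append(fieldname)
    return names


schema_items = [("body", Text()), ("title", Text()), ("year", Numeric())]
selected = fields_needing_default_corrector(
    schema_items, Text, correctors={"title": object()}
)
print(selected)
```

Here "year" is skipped because it is not a text field, and "title" is skipped because it already has a custom corrector.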
If we look here:
https://github.com/mchaput/whoosh/blob/d9a3fa2a4905e7326c9623c89e6395713c189161/src/whoosh/searching.py#L914
This doesn't seem right. We're using ReaderCorrector as the default for all fields?
SegmentReader.terms_within() uses Automata.terms_within(), which is Levenshtein distance: https://github.com/mchaput/whoosh/blob/d9a3fa2a4905e7326c9623c89e6395713c189161/src/whoosh/codec/base.py#L376 and it chokes on the int correctly returned from NUMERIC.from_bytes(), received within W3FieldCursor: https://github.com/mchaput/whoosh/blob/d9a3fa2a4905e7326c9623c89e6395713c189161/src/whoosh/codec/whoosh3.py#L541
Pretty sure Levenshtein distance isn't meant to be supported on NUMERIC fields, since they're stored like https://github.com/mchaput/whoosh/blob/d9a3fa2a4905e7326c9623c89e6395713c189161/src/whoosh/fields.py#L712, though maybe I'm the dumb one.
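The failure itself is easy to reproduce in isolation. A minimal sketch, assuming the only relevant part of next_valid_string is the `for i, label in enumerate(string)` loop shown in the traceback; `walk_labels` is a hypothetical stand-in, not Whoosh code.

```python
def walk_labels(term):
    """Enumerate the labels of a term the way the loop in the traceback does."""
    return [(i, label) for i, label in enumerate(term)]


# A text term is a sequence of characters, so the loop works.
text_labels = walk_labels("2024")

# A decoded numeric term is a plain int, which cannot be enumerated.
try:
    walk_labels(2024)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(text_labels)
print(error_message)
```

So any code path that hands a decoded numeric term to this kind of loop will fail the same way, regardless of what the user typed.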
Shouldn't searching.py be something more like the snippet above, with the changed lines marked # <-----?