yeraydiazdiaz / lunr.py

A Python implementation of Lunr.js 🌖
http://lunr.readthedocs.io
MIT License

Index can not be created with only one field #97

Closed aaad closed 3 years ago

aaad commented 3 years ago

Thank you for this nice project, this really helps me.

I have the following problem: creating the index with idx = lunr(...) raises the following exception when I specify only one field (fields=('title')).

Traceback (most recent call last):
  File "test.py", line 19, in <module>
    idx = lunr(
  File "/home/user/repos/recsys-fairness/.env/lib/python3.8/site-packages/lunr/__main__.py", line 40, in lunr
    builder.add(document)
  File "/home/user/repos/recsys-fairness/.env/lib/python3.8/site-packages/lunr/builder.py", line 145, in add
    field_value = doc[field_name] if extractor is None else extractor(doc)
KeyError: 't'

If I specify two fields in the fields parameter (fields=('title', 'title')), the code works.

The problem can be reproduced by using this code:

#!/usr/bin/python3
import logging
logging.basicConfig(level=logging.DEBUG)
from lunr import lunr

documents = [
    {
        'id': 'a',
        'title': 'test test'
    },
    {
        'id': 'b',
        'title': 'abc abc'
    }
]

logging.info('Index creation...')

idx = lunr(
     ref='id', fields=('title'), documents=documents
)

idx.search('test')

yeraydiazdiaz commented 3 years ago

Hi @aaad, the issue is that you're missing a comma in your tuple, i.e. ('title') vs ('title',). Python does not treat parentheses as a tuple unless they contain a comma, so ('title') evaluates to the string 'title'. Iterating over that string yields single characters, which is why the traceback shows KeyError: 't'. The solution is to use either fields=('title',) or fields=['title'].
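A minimal sketch of the difference, in plain Python with no lunr import. The variable names are only illustrative, and the link to the KeyError assumes the field names are iterated one by one, as the traceback suggests:

```python
# Parentheses alone only group an expression; the comma is what makes a tuple.
not_a_tuple = ('title')   # just the string 'title'
one_tuple = ('title',)    # a tuple with one element
as_list = ['title']       # a list with one element

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)

# Iterating over the string yields single characters, starting with 't',
# which matches the key reported in the KeyError.
string_items = list(not_a_tuple)
tuple_items = list(one_tuple)
list_items = list(as_list)

print(string_items)
print(tuple_items)
print(list_items)
```

Both the one-element tuple and the list iterate to the single field name 'title', so either form should work for the fields argument.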

Hope that helps.