yeraydiazdiaz / lunr.py

A Python implementation of Lunr.js 🌖
http://lunr.readthedocs.io
MIT License

Feature request: Make pipeline configurable #75

Closed wilhelmer closed 3 years ago

wilhelmer commented 4 years ago

Right now, there seems to be no way to configure the lunr pipeline, e.g., to enable or disable the individual pipeline functions (trimmer, stop word filter, and stemmer).

This would be useful, e.g., for MkDocs themes that have a different search implementation. For example, MkDocs Material doesn't use stemmer.

By default, lunr.py seems to use all 3 functions for the indexing pipeline and stemmer for the searching pipeline (not sure what the difference is).

https://github.com/yeraydiazdiaz/lunr.py/blob/0634322ec5855388047a2bef1bd9e8479a2d0e60/lunr/__main__.py#L33-L50

https://github.com/yeraydiazdiaz/lunr.py/blob/0634322ec5855388047a2bef1bd9e8479a2d0e60/lunr/languages/__init__.py#L84-L87
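On the difference between the two pipelines: as far as I understand it, the indexing pipeline processes document text before terms are stored, while the searching pipeline processes the query terms so that they match what was stored. The sketch below is a toy illustration in plain Python, not lunr.py's actual code; `trimmer`, `stop_word_filter`, `toy_stemmer` and `run_pipeline` are simplified stand-ins written for this example.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is"}


def trimmer(token):
    """Strip leading and trailing non-word characters."""
    return re.sub(r"^\W+|\W+$", "", token)


def stop_word_filter(token):
    """Drop very common words by returning None."""
    return None if token in STOP_WORDS else token


def toy_stemmer(token):
    """Crude suffix stripper, only a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def run_pipeline(functions, text):
    """Pass each lowercased token through every function in order."""
    result = []
    for token in text.lower().split():
        for fn in functions:
            token = fn(token)
            if not token:  # dropped by a filter or trimmed to nothing
                break
        else:
            result.append(token)
    return result


# Assumed defaults, mirroring the description above:
# all three functions when indexing, only the stemmer when searching.
index_pipeline = [trimmer, stop_word_filter, toy_stemmer]
search_pipeline = [toy_stemmer]

indexed_terms = run_pipeline(index_pipeline, "Searching the indexed documents.")
query_terms = run_pipeline(search_pipeline, "searching")

print(indexed_terms)  # ['search', 'index', 'document']
print(query_terms)    # ['search']
```

If the index is stemmed but the client-side search does not stem the query, "searching" would be looked up as-is and would not match the stored term "search", which is why both pipelines need to agree.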

Maybe you could add a pipeline parameter to lunr?

idx = lunr(
    ref='id', fields=('title', 'body'), documents=documents, pipeline=('trimmer','stop_word_filter')
)
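To make the idea concrete, here is a rough sketch of how such a tuple of names could be resolved into pipeline functions. It is plain Python and does not use lunr.py at all; `REGISTRY`, `build_pipeline` and `process` are hypothetical names, and the three functions are deliberately simplistic placeholders.

```python
from typing import Callable, Dict, Iterable, List, Optional

Stage = Callable[[str], Optional[str]]

# Hypothetical registry mapping names to placeholder pipeline functions.
REGISTRY: Dict[str, Stage] = {
    "trimmer": lambda t: t.strip(".,;:!?\"'()"),
    "stop_word_filter": lambda t: None if t in {"a", "the", "of"} else t,
    "stemmer": lambda t: t[:-1] if t.endswith("s") and len(t) > 3 else t,
}


def build_pipeline(names: Iterable[str]) -> List[Stage]:
    """Resolve function names to callables, rejecting unknown names."""
    stages = []
    for name in names:
        if name not in REGISTRY:
            raise ValueError(f"unknown pipeline function: {name}")
        stages.append(REGISTRY[name])
    return stages


def process(stages: List[Stage], text: str) -> List[str]:
    """Run every lowercased token through the stages; drop empty results."""
    terms = []
    for token in text.lower().split():
        current: Optional[str] = token
        for stage in stages:
            current = stage(current)
            if not current:
                break
        if current:
            terms.append(current)
    return terms


without_stemmer = build_pipeline(("trimmer", "stop_word_filter"))
with_stemmer = build_pipeline(("trimmer", "stop_word_filter", "stemmer"))

print(process(without_stemmer, "The cats of Rome."))  # ['cats', 'rome']
print(process(with_stemmer, "The cats of Rome."))     # ['cat', 'rome']
```

Passing names rather than callables keeps the call site short, at the cost of needing a registry and an error for unknown names.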

yeraydiazdiaz commented 4 years ago

Indeed, there is no easy way to do this atm.

This issue and https://github.com/yeraydiazdiaz/lunr.py/issues/77 are related to Lunr customisation which I haven't had time to implement just yet.

A workaround is to use the Builder class manually, following what the lunr function does, which is not particularly complicated.

wilhelmer commented 4 years ago

Yep, that's what I did in my fork of this project. I simply removed stemmer from pipeline.add(). But I'd rather have a universal solution in the main repo.

yeraydiazdiaz commented 3 years ago

Lunr 0.6.0 has been released including support for configurable pipelines.

Give it a try and see if it works for you 🙂