kiwix / libkiwix

Common code base for all Kiwix ports
https://download.kiwix.org/release/libkiwix/
GNU General Public License v3.0

kiwix-serve raises 500 error on `Or` search term #1104

Open benoit74 opened 1 month ago

benoit74 commented 1 month ago

See e.g. https://library.kiwix.org/catalog/v2/entries?start=0&count=4&lang=fra&category=other&q=Or

veloman-yunkan commented 2 weeks ago

The problem here is that the search text is interpreted as a Xapian query in which `or` is an operator, so a query consisting only of `Or` is syntactically invalid. Note that a search for "composition or rose" (without quotes) is interpreted as a query for either composition or rose, rather than as a query for all three words composition, or, and rose.

There are different ways to address this issue:

  1. Preprocess the search text in the front end so that words coinciding with Xapian query operators are escaped in some way (e.g. quoted or preceded by a +). The backend and the semantics of the q parameter of the /catalog/v2/entries API endpoint remain unchanged (so it will still be possible to pass advanced Xapian queries via the HTTP API).
  2. Redefine the syntax and semantics of the q parameter of the /catalog/v2/entries API endpoint and parse it accordingly in the backend. The content of the search box is passed to the endpoint as is (as in the current implementation).
  3. Document the current implementation and fix it so that a proper error is displayed to the user. The users will have to escape their queries on their own.
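
As a rough illustration of option 1, the preprocessing step could look something like the sketch below. It is written in Python only for brevity; the function name `escape_xapian_operators` and the operator list are assumptions made for this example, not existing Kiwix or Xapian code, and whether quoting is enough to neutralise an operator depends on how the backend configures its query parser.

```python
# Assumption: words that the Xapian query parser may treat as operators.
# The exact set depends on the parser flags used by the backend.
XAPIAN_OPERATORS = {"and", "or", "not", "xor", "near", "adj"}


def escape_xapian_operators(text: str) -> str:
    """Quote every whitespace-separated word that coincides with an operator.

    Hypothetical helper: the comparison is case-insensitive, and all other
    words are passed through unchanged.
    """
    escaped = []
    for word in text.split():
        if word.lower() in XAPIAN_OPERATORS:
            # Wrap the word in double quotes so that it reads as a literal term.
            escaped.append(f'"{word}"')
        else:
            escaped.append(word)
    return " ".join(escaped)


if __name__ == "__main__":
    print(escape_xapian_operators("Or"))
    print(escape_xapian_operators("composition or rose"))
```

With this sketch, the search text `Or` would be sent as `"Or"`, and `composition or rose` as `composition "or" rose`. The trade-off is the one noted in option 1: users typing into the search box lose access to the operators, while the HTTP API itself still accepts advanced queries.
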

veloman-yunkan commented 1 week ago

@kelson42 @rgaudin ping

kelson42 commented 1 week ago

This issue reminds me of https://github.com/kiwix/kiwix-tools/issues/440; I need to reassess all of this.

kelson42 commented 1 week ago

@veloman-yunkan Solution (1) - so a kind of xapian_escaping() should be implemented, but this should be done IMHO:

For Kiwix serve, I am not sure exactly how it should be done... but I guess this problem is potentially everywhere