apache / incubator-ponymail-foal

Apache Pony Mail Foal (Next Generation Suite)
https://ponymail.apache.org
Apache License 2.0

Various issues with special characters in searches #109

Open sebbASF opened 3 years ago

sebbASF commented 3 years ago

The following characters all cause errors:

(, ), ", '

Note that classic Pony Mail can handle these. It sanitises the strings using the following code:

   escape_html( word:gsub("[()\"]+", "") )

i.e. the characters (, ) and " are removed, and the remainder is HTML-escaped. This is done after the strings have been parsed, because the " characters are needed to delimit phrases.
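For comparison, here is a rough Python sketch of the same sanitisation step. The function name `sanitise_word` is hypothetical, and `html.escape` is only assumed to be a reasonable stand-in for classic's `escape_html`; the two may not escape exactly the same set of characters.

```python
import html
import re

# Characters that the classic Lua pattern "[()\"]+" removes.
_STRIP = re.compile(r'[()"]+')

def sanitise_word(word: str) -> str:
    """Drop (, ) and ", then HTML-escape whatever is left.

    Assumption: html.escape approximates classic Pony Mail's escape_html.
    """
    return html.escape(_STRIP.sub("", word))
```

As in classic, this would have to run on each word after phrase parsing, since the `"` characters are still needed at that stage.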

sebbASF commented 3 years ago

I think this occurs because the ES7 query_string query is being used. Its query field supports various meta-characters, and these are not being escaped correctly. The reserved characters are listed here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters

Rather than replacing them with spaces, which will change the search behaviour, they need to be escaped.
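A minimal sketch of what such escaping could look like in Python follows. The function name `escape_query_string` is hypothetical and this is not Foal's actual code. It backslash-escapes the characters from the Elasticsearch reserved list, and removes `<` and `>`, which according to that documentation cannot be escaped at all. The single quote is not on the reserved list, so this sketch leaves it unchanged; the `'` failure may have a different cause.

```python
import re

# Reserved characters that can be backslash-escaped in a query_string query.
# '&' and '|' are escaped individually, which also covers '&&' and '||'.
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')

def escape_query_string(text: str) -> str:
    """Escape query_string meta-characters so they are matched literally."""
    # '<' and '>' cannot be escaped, so they are removed instead.
    text = re.sub(r"[<>]", "", text)
    # Prefix every remaining reserved character with a backslash.
    return _RESERVED.sub(r"\\\1", text)
```

This would need to be applied to each term after phrases have been parsed out; otherwise the `"` delimiters would be escaped as well.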

sebbASF commented 3 years ago

Classic ponymail uses a search of the form:

   "query": "(from:\"_item_\") OR (subject:\"_item_\") OR (body:\"_item_\")"

where _item_ is HTML-encoded.

That works fine for all the special characters I tried. In the case of Foal, the fields are specified separately, so it would just need:

   "query": "\"_item_\""
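As an illustration only, the quoted-phrase form could be built along these lines. The helper `build_phrase_query` and the field names passed to it are hypothetical, not taken from Foal's code. The sketch also assumes that escaping embedded backslashes and double quotes is enough inside a quoted phrase, which is a different choice from the HTML encoding classic uses.

```python
import json

def build_phrase_query(item: str, fields: list[str]) -> dict:
    """Wrap item in double quotes for use in a query_string query.

    Assumption: inside a quoted phrase only backslash and double quote
    need escaping. Backslashes are escaped first so that the ones added
    for the quotes are not doubled.
    """
    escaped = item.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "query_string": {
            "fields": list(fields),     # fields are given separately
            "query": f'"{escaped}"',    # the whole item as one phrase
        }
    }

if __name__ == "__main__":
    # Field names borrowed from the classic query above, for illustration.
    query = build_phrase_query("foo (bar)", ["from", "subject", "body"])
    print(json.dumps(query, indent=2))
```

Note that quoting the whole item turns it into a phrase match, so whether this preserves the intended search behaviour for multi-word input would need checking.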