barisusakli / nodebb-plugin-dbsearch

A plugin that uses the database for search

Unicode support in Redis #65

Closed (ShlomoCode closed this issue 1 year ago)

ShlomoCode commented 1 year ago

Currently the plugin supports Unicode search (for example, Hebrew) only when MongoDB is the database. With Redis, searches containing non-English characters return no results. I went through the code (https://github.com/barisusakli/nodebb-plugin-dbsearch/blob/master/lib/redis.js) but couldn't figure out where the limitation comes from. @barisusakli @julianlam Could you point me to where the problem originates? I'd like to write a fix for it and submit a PR. Thanks!

barisusakli commented 1 year ago

This plugin uses https://github.com/barisusakli/redis-search when Redis is your datastore, which is inspired by https://github.com/tj/reds. You can find more info in those repos.

ShlomoCode commented 1 year ago

@barisusakli thanks! I found the problem: https://github.com/barisusakli/redis-search/blob/262f9cf9834b8c819e144f5a573c0486adf8c3b0/lib/redis-search.js#LL185C32-L185C38

return String(content).match(/\w+/g);

In JavaScript, the shorthand \w is equivalent to [A-Za-z0-9_], so it does not match Hebrew characters, for example: https://regex101.com/r/CkI7AL/1. Any content made up only of non-ASCII letters therefore produces no words to index. I created a PR that fixes this: https://github.com/barisusakli/redis-search/pull/2
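
For anyone hitting the same thing, here is a minimal sketch of the difference. The function names `asciiWords` and `unicodeWords` are hypothetical and not taken from redis-search, and the Unicode-aware pattern is only one possible approach, not necessarily what the PR does. It assumes a runtime with Unicode property escapes (Node.js 10 or later).

```javascript
// Hypothetical helpers for illustration; not part of redis-search.

// Same pattern as the line quoted above: \w only matches [A-Za-z0-9_].
function asciiWords(content) {
  return String(content).match(/\w+/g) || [];
}

// One possible Unicode-aware alternative (an assumption, not the PR's code):
// \p{L} = letters, \p{M} = combining marks (e.g. Hebrew vowel points),
// \p{N} = numbers. Property escapes require the `u` flag.
function unicodeWords(content) {
  return String(content).match(/[\p{L}\p{M}\p{N}_]+/gu) || [];
}

console.log(asciiWords('שלום world'));   // [ 'world' ]
console.log(unicodeWords('שלום world')); // [ 'שלום', 'world' ]
```

With the ASCII-only pattern, a purely Hebrew string yields no matches at all (`match` returns `null`), which is consistent with searches returning nothing on Redis.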