nalgeon / sqlean

The ultimate set of SQLite extensions
MIT License
3.76k stars 120 forks source link

arguments should be ASCII strings #110

Closed ProfChaos closed 9 months ago

ProfChaos commented 9 months ago

I keep getting this error no matter what I try to do with matching a field from a json object.

SELECT *
  FROM movies
  WHERE levenshtein(data->>'title', "star wars") <= 10
    AND deleted_at IS NULL;

I've tried cast(data->>'title' as text), data->>'title' || "", json_extract(data, '$.title'), and a couple more things to see if I could get it to work, but alas.

I could extract the field into it's own column, but it would be great not having to do that

nalgeon commented 9 months ago

The error text tells you what the problem is: arguments should be ASCII strings (consisting only of characters with codes 0-128). Movie titles probably contain non-ASCII (Unicode) characters, so fuzzy functions won't work on them.

nalgeon commented 9 months ago

You can use the translit function to convert the input string from UTF-8 to pure ASCII (though some characters may be lost):

select *
from movies
where
  levenshtein(translit(data->>'title'), 'star wars') <= 10
  and deleted_at is null;