Closed colagrosso closed 7 months ago
We need to use regex because we should be matching on word boundaries, `\b`, not start/end of the whole string. I also don't think we need to guess what the user intended like with `helen of troy`, that's going down a rabbit hole that can't be solved well in our simple use case. `helen troy` is a bad query.
Maybe the first step here is thinking about how we expect particular queries to match. If we query `renaissance male`, how does that search break down in the back end, especially when searching tags? Presumably an artwork would be tagged with `renaissance` or `male` but not `renaissance male`.
On the other hand we may have things tagged with a single tag `african american`, so the space becomes a point of difficulty in whether we're searching tags or fulltext.
I think the approach is to break down a query by spaces, and search across the contents of each tag by each word with word boundary. So for the query `renaissance male` we would do something like `tag.Name regexp "\b(renaissance|male)\b"`. This would match `male` but not `female`, and it would also match the theoretical `renaissance male` tag if there was one.
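To make the matching behavior concrete, here is a minimal sketch of the split-then-alternate idea in Python. It is only an illustration: the helper name `match_tags` and the in-memory tag list are hypothetical, and the real query would run as a `regexp` condition in the database, whose regex dialect may differ from Python's.

```python
import re

def match_tags(query, tags):
    """Return the tags that contain any query word on a word boundary.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    # Split on whitespace and escape each word so regex metacharacters
    # in the query are treated literally.
    words = [re.escape(w) for w in query.split()]
    if not words:
        # An empty alternation would match at every word boundary,
        # so return no results for an empty query.
        return []
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return [t for t in tags if pattern.search(t)]

tags = ["male", "female", "renaissance male", "african american"]
matched = match_tags("renaissance male", tags)
```

With these sample tags, `male` and `renaissance male` are matched, while `female` is not, because there is no word boundary between `fe` and `male`.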
Thanks a bunch, that makes sense. I tried the query for the tags that way, and it works. I think the same logic works well for the other fields, too, so I applied it there. Sorry for the bad query example above, here's a different, better one that now works: `Monet Lilies`. That query doesn't work today, but it would work after this PR because the now individual words can match different fields. It returns other matches, too, but I think that's ok.
Great, looks very good, thanks!
Although now that I look again there's a small problem, when we replace non-alphanumeric characters in the query with a space. What happens when I search for the painting `dressmaker's daughter`? Any painting with `s` in it is returned!
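The problem can be reproduced outside the database. The sketch below is an assumption-laden illustration in Python: `tokenize_naive` mimics the replace-with-space step described above, and `tokenize_keep_apostrophes` is one possible fix, not necessarily the one this PR ends up using. Both function names are hypothetical.

```python
import re

def tokenize_naive(query):
    # Replace every run of non-alphanumeric characters with a space,
    # then split. The apostrophe becomes a space, leaving a stray "s".
    return re.sub(r"[\W_]+", " ", query).split()

def tokenize_keep_apostrophes(query):
    # One possible fix (an assumption): treat an apostrophe between
    # word characters as part of the word instead of a separator.
    return re.findall(r"\w+(?:['’]\w+)*", query)

naive = tokenize_naive("dressmaker's daughter")
fixed = tokenize_keep_apostrophes("dressmaker's daughter")
```

Here `naive` contains the one-letter token `s`, which a substring match finds in almost every title, while `fixed` keeps `dressmaker's` as a single token.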
Crap, sorry about that. I'll send you a fix today.
This is coming along well and is ready for a look.
I also tried matching using a regex, but I changed it back to `like` because the regex wasn't buying anything. Worse: I couldn't get the regex matching/collating to match artwork with Unicode characters. `like` handles that just fine. I can change it back to using a regex so you can see what I tried. It looked roughly like this: