The url_search field is of type path, which uses the WordDelimiterFilterFactory. The filter splits things like foo/bar_ZooBoom.png into [foo, bar, Zoo, Boom, png], which are then lowercased to [foo, bar, zoo, boom, png].
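To make the expected token stream concrete, here is a minimal Python sketch that approximates the splitting and lowercasing described above. It is not the Lucene implementation: the function names split_word_parts and index_tokens are hypothetical, and the regular expressions only cover delimiter and case-change splitting for ASCII input, ignoring the filter's other options and any token positions.

```python
import re


def split_word_parts(text):
    """Roughly approximate word-delimiter splitting (an illustration, not Lucene code)."""
    parts = []
    # Split on runs of non-alphanumeric characters such as '/', '_' and '.'.
    for chunk in re.split(r"[^A-Za-z0-9]+", text):
        if not chunk:
            continue
        # Within a chunk, break on case changes and letter/digit boundaries.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", chunk))
    return parts


def index_tokens(text):
    """Lowercase the word parts, mimicking a lowercase filter applied after the split."""
    return [part.lower() for part in split_word_parts(text)]


if __name__ == "__main__":
    print(split_word_parts("foo/bar_ZooBoom.png"))
    print(index_tokens("foo/bar_ZooBoom.png"))
```

The sketch only reproduces which terms end up in the index. It says nothing about how the real filter positions Zoo and Boom relative to each other.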
This all works fine with a search for e.g. url_search:(foo bar png), but the CamelCase-part (ZooBoom) is not handled properly. If the search is for url_search:(zooboom) or url_search:(Zooboom), there will be a hit, but if the search is for url_search:(zoo boom) or url_search:(ZooBoom), there won't be. Very counter-intuitive.
The filter passes the tokens from a CamelCased string ZooBoom as [Zoo, Boom] without any distance between the tokens (normally the distance is 1): they are two separate tokens, but are treated for most purposes as one. Querying for zooboom or Zooboom just means that standard lowercasing matches the collapsed indexed tokens. Querying for zoo boom does not match, as it gets parsed as two tokens with a distance of 1. I haven't figured out why ZooBoom does not work.
The WordDelimiterFilterFactory is deprecated in favour of WordDelimiterGraphFilterFactory, but simply switching to the new factory and re-indexing a test corpus did not solve the problem. More investigation is needed.