vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Case sensitive search not supported on index fields #30476

Open shubh9194 opened 7 months ago

shubh9194 commented 7 months ago

Is your feature request related to a problem? Please describe.
When we search an index field for a value, Vespa matches case-insensitively. For example, a search for fieldA="Aabc" also returns documents whose fieldA value is "aabc".

Describe the solution you'd like
We want case-sensitive results only: a search for fieldA="Aabc" should return only documents whose fieldA value is "Aabc".

jobergum commented 7 months ago

Vespa already supports case-sensitive search on attribute fields using match: cased (see the docs).

IMHO, case-sensitive search is not a natural fit for index fields, which use text matching and linguistic processing.
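
For reference, a minimal schema sketch of the attribute approach described above. The schema and document name "doc" are placeholders, and fieldA is the field name from the original report; this is an illustration under those assumptions, not a tested configuration.

```
schema doc {
    document doc {
        # "doc" is a placeholder name; fieldA comes from the report above
        field fieldA type string {
            # stored as an attribute (in memory), not as an index field
            indexing: attribute | summary
            # match case-sensitively, so "Aabc" does not match "aabc"
            match: cased
        }
    }
}
```

If the attribute also uses fast-search, the dictionary casing may need to be configured to agree with the match setting; check the schema reference before relying on this.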

bratseth commented 7 months ago

There are use cases for it though, and what we have done recently is to leave this completely to the discretion of the linguistics module. The linguistics module has to support it, though, and most don't. So I think what's needed is the ability to select a linguistics module per field.

107dipan commented 1 month ago

Hi @bratseth, we are looking to enable case-sensitive search on index fields that have exact match enabled only. Is there a way to achieve case-sensitive search on these fields without using a different linguistics module? We don't need stemming or other parsing for exact field matching.

bratseth commented 1 month ago

To be clear, you want both a) cased exact match with no partial matching, and b) on an index field, not an attribute?

And if so, why can't it be an attribute?

107dipan commented 1 month ago

The only issue with making these fields attributes is that they would be stored in memory, or we would need to enable paged on them as we do for our current attributes. Since we already have a lot of attributes with paged enabled, and these fields are heavily used, this would increase paging in and out. We can definitely try the attribute and linguistics approaches as suggested, but we were wondering if there is any other way to enable this.

bratseth commented 1 month ago

Ok, got it. No other way right now, but it's not a lot of work to add when also doing "exact" - we can consider doing that.

nehajatav commented 7 hours ago

Feature request: Allow option to make index fields with {match: word} case sensitive
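
For context, a sketch of the kind of field this request concerns: an index field using word matching. The field name fieldA is borrowed from the original report and is only illustrative. Per the discussion above, no setting on such a field currently makes matching case-sensitive.

```
field fieldA type string {
    # index field (not an attribute)
    indexing: index | summary
    # match the whole value as a single word, without partial matches
    match: word
    # per this thread, there is no option here yet to make matching cased
}
```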