blugelabs / bluge

indexing library for Go
Apache License 2.0

How to Search For Empty Fields #107

Closed: Aleck-Sun closed this issue 2 years ago

Aleck-Sun commented 2 years ago

Hi, I've been trying to create a query that finds documents with an empty field. Currently, when creating the index, I set a document's empty fields to a specific sentinel keyword like "EMPTYBLUGEFIELD" and then query for that keyword. Is there a more proper way to perform this search?

What I've been doing:

When creating the index:

// Index the real value when present, otherwise a sentinel keyword.
if field != "" {
    doc.AddField(bluge.NewKeywordField("Field", "FieldValue").StoreValue())
} else {
    doc.AddField(bluge.NewKeywordField("Field", "EMPTYBLUGEFIELD").StoreValue())
}

When searching:

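// Match documents whose Field holds either the real value or the sentinel.
query := bluge.NewBooleanQuery()
query.AddShould(bluge.NewTermQuery("FieldValue").SetField("Field"))
query.AddShould(bluge.NewTermQuery("EMPTYBLUGEFIELD").SetField("Field"))
query.SetMinShould(1)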

mschoch commented 2 years ago

Hello,

So, there is nothing wrong with this approach, but it is worth pointing out that when using the keyword analyzer, the empty value is fine for indexing and searching as is. The keyword analyzer does not change the input at all, and the empty string is still a valid term (length 0).
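To make the point about the zero-length term concrete, here is a minimal sketch of a toy inverted index using only the Go standard library. It is not Bluge's implementation: `keywordAnalyze`, `toyIndex`, `add`, and `termSearch` are hypothetical names invented for illustration. It only models the idea that a keyword-style analyzer passes the input through unchanged, so the empty string becomes an ordinary term that can be looked up like any other.

```go
package main

import "fmt"

// keywordAnalyze mimics a keyword-style analyzer (hypothetical helper):
// the whole input becomes a single term, unchanged. An empty input
// therefore yields exactly one term of length 0.
func keywordAnalyze(input string) []string {
	return []string{input}
}

// toyIndex maps field name -> term -> IDs of documents containing the term.
// This is a toy model, not how Bluge stores its index.
type toyIndex struct {
	postings map[string]map[string][]string
}

func newToyIndex() *toyIndex {
	return &toyIndex{postings: make(map[string]map[string][]string)}
}

// add analyzes value with the keyword analyzer and records docID under
// every resulting term for the given field.
func (ix *toyIndex) add(docID, field, value string) {
	terms, ok := ix.postings[field]
	if !ok {
		terms = make(map[string][]string)
		ix.postings[field] = terms
	}
	for _, term := range keywordAnalyze(value) {
		terms[term] = append(terms[term], docID)
	}
}

// termSearch returns the IDs of documents whose field contains exactly term.
// Looking up a missing field or term returns nil.
func (ix *toyIndex) termSearch(field, term string) []string {
	return ix.postings[field][term]
}

func main() {
	ix := newToyIndex()
	ix.add("a", "Field", "FieldValue")
	ix.add("b", "Field", "") // empty value indexed as-is, no sentinel
	ix.add("c", "Field", "")

	fmt.Println(ix.termSearch("Field", ""))           // prints [b c]
	fmt.Println(ix.termSearch("Field", "FieldValue")) // prints [a]
}
```

In Bluge terms, this corresponds to indexing `bluge.NewKeywordField("Field", "")` directly and searching with `bluge.NewTermQuery("").SetField("Field")`, instead of substituting a sentinel such as "EMPTYBLUGEFIELD".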

I've added a test-case to illustrate this here: https://github.com/blugelabs/bluge/pull/110

This should perform the same, so if you already got it working there probably isn't a compelling reason to change, unless you just want to simplify things.

Aleck-Sun commented 2 years ago

I see, thank you so much for the help!