Closed ahcm closed 3 years ago
A Term is the value of a token targeting a specific field. By default, the query parser targets all indexed fields (this is configurable).
You probably have 4 fields that are indexed.
You can get the field via term.field(). The name of the field can then be fetched via schema.get_field_name(field).
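To make the expansion described above concrete, here is a minimal std-only sketch of how one unqualified token can turn into one term per default field, and how a field handle maps back to a name. The `Field`, `Term`, and `Schema` types and the `expand` and `describe` helpers are hypothetical stand-ins written for this illustration; they are not tantivy's types, and only the method name `get_field_name` is borrowed from the comment above.

```rust
// Toy model only: these types are hypothetical stand-ins, not tantivy's API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Field(u32);

#[derive(Debug, PartialEq, Eq)]
struct Term {
    field: Field,
    text: String,
}

struct Schema {
    names: Vec<&'static str>,
}

impl Schema {
    // Map a field handle back to its name (mirrors the call named in the thread).
    fn get_field_name(&self, field: Field) -> &str {
        self.names[field.0 as usize]
    }
}

// Expand one unqualified token into one term per default field.
fn expand(token: &str, default_fields: &[Field]) -> Vec<Term> {
    default_fields
        .iter()
        .map(|&field| Term { field, text: token.to_string() })
        .collect()
}

// Render each term as "field_name:text".
fn describe(schema: &Schema, terms: &[Term]) -> Vec<String> {
    terms
        .iter()
        .map(|t| format!("{}:{}", schema.get_field_name(t.field), t.text))
        .collect()
}

fn main() {
    let schema = Schema { names: vec!["uri", "title", "body", "date"] };
    // Assumption: all four fields are default fields for the query parser.
    let defaults: Vec<Field> = (0..4).map(Field).collect();
    let terms = expand("foobar", &defaults);
    for line in describe(&schema, &terms) {
        println!("{line}");
    }
}
```

With four indexed default fields, the single token foobar yields four terms that differ only in their field.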
Thanks!
That does indeed give me one term per field: uri:foobar title:foobar body:foobar date:foobar
Querying uri or date explicitly, with uri:foobar or date:foobar, gives no matches, though. Why do terms still turn up in the Terms list for these fields?
They should show up, yes. Can you share your schema?
So if I want to get all Terms that are present in a field, do I have to iterate over the Terms checking for a match in that field?
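One way to picture the filtering asked about here is to group the query's terms by field once and then look up the field of interest. The sketch below is std-only and assumes the terms have already been reduced to (field name, term text) pairs; `terms_by_field` is a hypothetical helper, not part of tantivy.

```rust
use std::collections::BTreeMap;

// Group (field name, term text) pairs by field name.
// Hypothetical helper for illustration; not a tantivy function.
fn terms_by_field<'a>(terms: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut grouped: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for &(field, text) in terms {
        grouped.entry(field).or_default().push(text);
    }
    grouped
}

fn main() {
    // The four terms reported earlier in the thread.
    let terms = [
        ("uri", "foobar"),
        ("title", "foobar"),
        ("body", "foobar"),
        ("date", "foobar"),
    ];
    let grouped = terms_by_field(&terms);
    // Look up only the terms that target the "title" field.
    println!("{:?}", grouped.get("title"));
}
```

Grouping once avoids rescanning the whole term list for every field you want to inspect.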
"schema": [
  {
    "name": "uri",
    "type": "text",
    "options": {
      "indexing": {
        "record": "basic",
        "tokenizer": "raw"
      },
      "stored": true
    }
  },
  {
    "name": "title",
    "type": "text",
    "options": {
      "indexing": {
        "record": "position",
        "tokenizer": "en_stem"
      },
      "stored": true
    }
  },
  {
    "name": "body",
    "type": "text",
    "options": {
      "indexing": {
        "record": "position",
        "tokenizer": "en_stem"
      },
      "stored": true
    }
  },
  {
    "name": "date",
    "type": "text",
    "options": {
      "indexing": {
        "record": "basic",
        "tokenizer": "raw"
      },
      "stored": true
    }
  }
],
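The schema mixes two tokenizers: raw for uri and date, and en_stem for title and body. A rough std-only sketch of the difference follows. It assumes raw keeps the whole input as a single unchanged token, and it approximates en_stem by splitting on non-alphanumeric characters and lowercasing; stemming is deliberately left out, so this is not a faithful reimplementation. Both function names are hypothetical.

```rust
// Approximation of a "raw" tokenizer: the whole input is one token, unchanged.
fn raw_tokens(input: &str) -> Vec<String> {
    vec![input.to_string()]
}

// Rough approximation of "en_stem": split into words and lowercase them.
// Assumption: stemming is omitted here to keep the sketch small.
fn lowercased_word_tokens(input: &str) -> Vec<String> {
    input
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

fn main() {
    for query in ["foobar", "FooBar"] {
        println!(
            "{query}: raw={:?} lowercased={:?}",
            raw_tokens(query),
            lowercased_word_tokens(query)
        );
    }
}
```

Under these assumptions an all-lowercase query such as foobar produces the same text for every field, while FooBar would stay as typed on the raw fields and become foobar on the others.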
I opened a bug report in tantivy-cli because the benchmark command printed 4 header columns but only 3 values. The missing column was num_terms, which led me to add code to print the Terms (which, I was told on Gitter, differs from the original meaning). I was surprised to see 4 Terms for my test query "foobar", all printed as the identical "foobar". I expected either only 1 term or four different ones like "Foobar", "FooBar".