Open mfkp opened 2 years ago
I'm guessing somewhere around here, we would need to have the ability to add STORED as an option on text fields:
https://github.com/quickwit-oss/tantivy/blob/main/src/schema/text_options.rs#L27-L32
Here's an example I found of using highlighted snippets:
I haven't thought about this, but it's a valid use case. It's definitely feasible, but it would significantly change the API, so the correct approach requires more thought.
First of all, as you correctly said, for this to work the text fields should be stored. I don't want it to be the default behavior because storing fields in the index takes space and reading stored fields is not free (as the Tantivy documentation puts it: "Reading the stored fields of a document is relatively slow. (100 microsecs)"). So, this should be opt-in, maybe a stored option for text fields.
As for performance, I would also make calculating ranges optional (e.g. index.search(query, with_match_ranges: true)).
Also, note that in your example there is only one text field, but there might be more, so we need to return match_ranges for every stored text field:
    index.search('bersonal coder', fuzzy_distance: 1)
    =>
    [
      {
        id: "tt0118767",
        match_ranges: {
          text_field_1: [[21, 28]],
          text_field_2: [[36, 39]],
        }
      }
    ]
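To illustrate how a caller might consume such a result, here is a minimal sketch of turning per-field ranges into highlighted text. Nothing in it is part of the gem: `highlight`, `stored_fields`, and the `<em>` markers are hypothetical, and it assumes each range is a half-open `[start, end)` pair of character offsets, sorted and non-overlapping (the proposal above does not specify whether the end offset is inclusive).

```ruby
# Hypothetical helper, not part of Tantiny: wraps every matched range
# of `text` in open/close markers.
# Assumption: each range is [start, end) in character offsets, and the
# ranges are sorted and do not overlap.
def highlight(text, ranges, open_tag: "<em>", close_tag: "</em>")
  result = +""
  cursor = 0
  ranges.each do |start, finish|
    # Copy the unmatched text before the range, then the wrapped match.
    result << text[cursor...start] << open_tag << text[start...finish] << close_tag
    cursor = finish
  end
  # Copy whatever follows the last match.
  result << text[cursor..]
end

# Made-up stored field contents and ranges, for illustration only.
stored_fields = { text_field_1: "a personal story" }
match_ranges  = { text_field_1: [[2, 10]] }

highlighted = match_ranges.to_h do |field, ranges|
  [field, highlight(stored_fields.fetch(field), ranges)]
end
```

Because the ranges are keyed by field name, the caller can highlight each stored field independently and skip fields that had no matches.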
And speaking of the search method: since we need to return additional metadata with every document, it would be best to create a new entity (e.g. Tantiny::SearchResult) that would contain this metadata along with document ids, instead of returning an arbitrary hash.
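As a rough sketch of what such an entity could look like, assuming a plain value object is enough: the `TantinySketch` namespace, the `ranges_for` method, and the attribute names below are all hypothetical and only mirror the hash shown earlier; none of this exists in the gem.

```ruby
module TantinySketch
  # Hypothetical result object: a document id plus per-field match ranges.
  SearchResult = Struct.new(:id, :match_ranges, keyword_init: true) do
    # Ranges for one field, or an empty array when the field did not match.
    def ranges_for(field)
      match_ranges.fetch(field, [])
    end
  end
end

# Built by hand here; in the proposal the search method would return these.
results = [
  TantinySketch::SearchResult.new(
    id: "tt0118767",
    match_ranges: { text_field_1: [[21, 28]], text_field_2: [[36, 39]] }
  )
]

# Callers that only need ids can still get them with one map.
ids = results.map(&:id)
```

A dedicated object leaves room for more metadata later (a score, for example) without changing the shape callers rely on.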
That being said, modifying the source in your fork for your specific use case shouldn't be very difficult. You would need to add the STORED index option and modify the search method in both Rust and Ruby.
Hello, first of all, thanks for publishing this, looks to be a very interesting gem.
I'm wondering if it would be possible to return the matched text (or index of matched characters) along with the results? This would be useful in cases where a fuzzy match returns results, and I would want to highlight the matching text (or just show a snippet of text around the matching text for context) in the results.
In the readme it says:
So maybe this is not possible, but I figured I would ask anyway:
From your example:
Right now, this is how the search returns:
It would be great to return something like:
That way I could display the results in my search listing like:
I haven't dug into the source yet to see if this is possible, but I figure you'd know the limitations better and might be able to provide some input on whether this is feasible.