apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Prevent DefaultPassageFormatter from taking shorter overlapping passages #13384

Closed zkendall closed 3 months ago

zkendall commented 4 months ago

Description

There is a bug in the DefaultPassageFormatter. It will take an overlapping passage even if that results in a shorter or more fragmented passage (highlight). The one scenario I have is when a phrase match overlaps with some term matches.

This is demonstrated in the attached unit tests.
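
To make the scenario concrete, here is a minimal standalone sketch of the idea. It is not the Lucene source or the exact change in this PR. The class `OverlapHighlightSketch`, the method `highlight`, and the `<b>` tags are hypothetical names chosen for illustration, and it assumes matches arrive sorted by start offset. When a later match overlaps the current one, the highlighted span is only ever extended, never shortened:

```java
// Hypothetical illustration only; does not use or reproduce Lucene classes.
final class OverlapHighlightSketch {

  // Wraps matched ranges of content in <b>...</b>.
  // Assumption: starts/ends are parallel arrays sorted by start offset,
  // with ends exclusive.
  static String highlight(String content, int[] starts, int[] ends) {
    StringBuilder sb = new StringBuilder();
    int pos = 0; // next offset of content not yet written
    for (int i = 0; i < starts.length; i++) {
      int start = starts[i];
      int end = ends[i];
      // Look ahead over overlapping matches. Taking the max keeps a long
      // phrase match from being cut short by a shorter term match inside it.
      while (i + 1 < starts.length && starts[i + 1] < end) {
        end = Math.max(end, ends[++i]);
      }
      if (start > pos) {
        sb.append(content, pos, start); // plain text before the match
      }
      if (end > pos) {
        sb.append("<b>").append(content, Math.max(pos, start), end).append("</b>");
        pos = end;
      }
    }
    sb.append(content, pos, content.length()); // trailing plain text
    return sb.toString();
  }

  public static void main(String[] args) {
    // Phrase match "quick brown fox" (4-19) plus the three term matches.
    int[] starts = {4, 4, 10, 16};
    int[] ends = {19, 9, 15, 19};
    System.out.println(highlight("the quick brown fox", starts, ends));
    // prints: the <b>quick brown fox</b>
  }
}
```

If the look-ahead assigned the next match's end directly instead of taking the maximum, the same input would highlight only "quick" first and then the remaining terms separately, which is the kind of shorter, fragmented highlight described above.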

github-actions[bot] commented 3 months ago

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

zkendall commented 3 months ago

@stefanvodita Thanks. I had to resolve a conflict after your approval. I see your approval wasn't dismissed, so I guess you don't have "write access" to approve. Do you know who does or how I can get their attention?

stefanvodita commented 3 months ago

@zkendall - I just wanted to wait a couple days to give anyone else who wanted to review a chance to do so. I've merged the change now.