Terms at beginning or end of a snippet are cut off

Original Request

I just noticed an issue with snippets returned by the NLP pipeline where the tagged term gets cut from the snippet. This happens when the term is right at the beginning or the end of a snippet. Take a look at the following snippets for file date 2024-10-08:

PPAIN-361873: term is 'bereavement'. Note starts as "Bereavement: Session 5: Finding Your Guides ...", but the snippet starts as "Session 5: ..."

PPAIN-367227: term is 'insomnia' but snippet ends with "... patient complains of:" instead of "... patient complains of: Insomnia".

If the tagged term is a phrase, only part of the term makes it into the snippet. For example:

PPAIN-362292: term is 'doing good'. Note starts as "Doing good, so you look ...", but the snippet starts as "good, so you look ...".

PPAIN-348703: term is 'in good spirits' but snippet ends with "... in good", instead of "... in good spirits".
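For illustration only, here is a minimal sketch of a snippet window that keeps the whole term when it sits at the start or end of a note. It assumes snippets are cut as a character window around the term's match offsets; the pipeline's actual extraction code is not shown in this issue, and `extract_snippet`, its parameters, and the `window` size are hypothetical names.

```python
def extract_snippet(note: str, term_start: int, term_end: int, window: int = 40) -> str:
    """Return the term plus up to `window` characters of context on each side.

    Hypothetical helper: term_start/term_end are the character offsets of the
    tagged term in the note (end exclusive).
    """
    # Clamp to the note boundaries so a term at the very start or end
    # of the note is kept whole instead of being sliced away.
    left = max(0, term_start - window)
    right = min(len(note), term_end + window)
    return note[left:right]


# Term at the beginning of a note (text fragment taken from this issue).
note_start = "Doing good, so you look"
term_a = "doing good"
pos_a = note_start.lower().find(term_a)
snippet_a = extract_snippet(note_start, pos_a, pos_a + len(term_a), window=5)

# Term at the end of a note (text fragment taken from this issue).
note_end = "patient complains of: Insomnia"
term_b = "insomnia"
pos_b = note_end.lower().find(term_b)
snippet_b = extract_snippet(note_end, pos_b, pos_b + len(term_b), window=5)

print(repr(snippet_a))
print(repr(snippet_b))
```

The point of the sketch is that the window is computed from both ends of the term span, so multi-word terms such as 'in good spirits' are never split.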

This happens across concepts; I just happened to be working with PPAIN. For 2024-10-08 alone, I count 1498 affected snippets for HOUSING, 804 for CGS, and >237 for FALL, to name a few.

I will write test cases to check for this issue.
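A rough sketch of what such a check could look like, assuming each record provides the tagged term and the returned snippet as strings. The helper names `term_is_intact` and `truncated_terms` are hypothetical, and the cases reuse only the text fragments quoted above.

```python
def term_is_intact(snippet: str, term: str) -> bool:
    """True if the whole tagged term appears in the snippet (case-insensitive)."""
    return term.casefold() in snippet.casefold()


# (term, observed snippet fragment, expected snippet fragment), as quoted in this issue.
CASES = [
    ("bereavement", "Session 5: ...", "Bereavement: Session 5: Finding Your Guides ..."),
    ("insomnia", "... patient complains of:", "... patient complains of: Insomnia"),
    ("doing good", "good, so you look ...", "Doing good, so you look ..."),
    ("in good spirits", "... in good", "... in good spirits"),
]


def truncated_terms(cases):
    """Return the terms whose observed snippet does not contain the full term."""
    return [term for term, observed, _ in cases if not term_is_intact(observed, term)]


if __name__ == "__main__":
    for term in truncated_terms(CASES):
        print(f"term cut off: {term!r}")
```

Running the same check over a full day's output would give per-concept counts of affected snippets.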

Action Requested

TBD

Requestor

3ST Team (Esther Meerwijk)

Additional Actions

[ ] #105
[ ] #106

suzytamang / clever-rockies

Terms at beginning or end of a snippet are cut off #104