JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0

Fixes to AhoCorasick implementation in order to pass Suffix Link tests #14189

Open jfernandrezj opened 4 months ago

jfernandrezj commented 4 months ago

Description

Added a fix to correctly include a chunk annotation when jumping from a leaf node through a Suffix Link.
Added a fix to correctly determine the span of a chunk annotation when jumping from a non-leaf node through a Suffix Link.
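
To illustrate the two cases, here is a minimal character-level Aho-Corasick sketch in Python. It is not the Spark NLP implementation (which is written in Scala). The names `build_automaton` and `find_matches` are hypothetical and chosen only for this example. The sketch assumes that every node reached through a suffix link is checked for a terminal pattern, and that a match span is derived from the length of the matched pattern rather than from the depth of the current node.

```python
from collections import deque


def build_automaton(patterns):
    """Build a trie with suffix (failure) links for the given patterns."""
    goto = [{}]      # goto[node][char] -> child node
    fail = [0]       # fail[node] -> node reached through the suffix link
    out = [[]]       # out[node] -> patterns that end exactly at this node

    # Insert every pattern into the trie.
    for pattern in patterns:
        node = 0
        for ch in pattern:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pattern)

    # Breadth-first pass: depth-1 nodes keep a suffix link to the root.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            queue.append(child)
    return goto, fail, out


def find_matches(text, patterns):
    """Return (start, end, pattern) tuples with inclusive character offsets."""
    goto, fail, out = build_automaton(patterns)
    matches = []
    node = 0
    for i, ch in enumerate(text):
        # Follow suffix links until a transition on ch exists (or we hit the root).
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)

        # Visit the current node and every node on its suffix-link chain,
        # whether or not the current node is a leaf.
        hop = node
        while hop:
            for pattern in out[hop]:
                # The span comes from the matched pattern's own length.
                matches.append((i - len(pattern) + 1, i, pattern))
            hop = fail[hop]
    return matches


# Leaf node case: "ab" ends at a leaf whose suffix link leads to "b".
print(find_matches("ab", ["ab", "b"]))
# Non-leaf node case: the node for "abc" is not a leaf, but its suffix link leads to "bc".
print(find_matches("abc", ["abcd", "bc"]))
```

In the first call both `ab` and `b` are reported, because the suffix link from the leaf is still followed. In the second call `bc` is reported with span (1, 2), not a span starting at offset 0, because the start offset is computed from the length of `bc`.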

Motivation and Context

https://github.com/JohnSnowLabs/spark-nlp/issues/14187

How Has This Been Tested?

Added tests that failed before the fix and pass with it.
Environment: Java 11, Spark 3.4, Spark NLP 5.2.2
