alephzed closed this 4 years ago
Hi Adam,
For Dates and Numbers, this feature is supported in the latest version through the neighborhoodRange
attribute of the Element. It defaults to 0.9.
In the test, you can increase it to 0.901 to get the results you are looking for:
List<Document> documentList1 = getTestDocuments(dates, DATE, 0.901);
I noticed a bug where you can't set the value beyond 0.91. I'll fix it in the upcoming release.
Thanks, Manish
Thank you, Manish. Manipulating the neighborhoodRange parameter is what I needed.
I am trying to understand how to change the match threshold for a set of three dates: "07/15/2019", "01/01/2020", and "01/02/2020". The matching score always comes back as 1.0 for all three dates. I have tried changing the threshold at the Element level and at the Document level, and it doesn't make a difference. How can I change the matching so that 07/15/2019 does not match 01/01/2020 with the same score with which 01/01/2020 matches 01/02/2020? Here is my sample code, similar to the JUnit test I found in your project but modified for a Spring Boot application.
The output is always the same, even if I modify the threshold from 0.1 to 0.9:
Data: {[{'Mon Jul 15 00:00:00 MDT 2019'}]} Matched With: {[{'Wed Jan 01 00:00:00 MST 2020'}]} Score: 1.0
Data: {[{'Mon Jul 15 00:00:00 MDT 2019'}]} Matched With: {[{'Thu Jan 02 00:00:00 MST 2020'}]} Score: 1.0
Data: {[{'Wed Jan 01 00:00:00 MST 2020'}]} Matched With: {[{'Mon Jul 15 00:00:00 MDT 2019'}]} Score: 1.0
Data: {[{'Wed Jan 01 00:00:00 MST 2020'}]} Matched With: {[{'Thu Jan 02 00:00:00 MST 2020'}]} Score: 1.0
Data: {[{'Thu Jan 02 00:00:00 MST 2020'}]} Matched With: {[{'Mon Jul 15 00:00:00 MDT 2019'}]} Score: 1.0
Data: {[{'Thu Jan 02 00:00:00 MST 2020'}]} Matched With: {[{'Wed Jan 01 00:00:00 MST 2020'}]} Score: 1.0
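For reference, the behaviour being asked for (a nearby date scoring higher than a distant one) can be illustrated with a small standalone sketch. This is not fuzzy-matcher's API or its internal algorithm; the class DateProximity, its methods, and the 365-day window are all hypothetical names and values chosen for illustration. It only shows one possible distance-based score, using plain java.time.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only: not part of fuzzy-matcher.
final class DateProximity {
    // Same MM/dd/yyyy format as the dates in the question.
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("MM/dd/yyyy");

    // Absolute number of calendar days between two dates.
    static long daysApart(String a, String b) {
        LocalDate first = LocalDate.parse(a, FMT);
        LocalDate second = LocalDate.parse(b, FMT);
        return Math.abs(ChronoUnit.DAYS.between(first, second));
    }

    // Assumed scoring rule: 1.0 for identical dates, falling linearly
    // to 0.0 once the dates are windowDays or more apart.
    static double score(String a, String b, long windowDays) {
        double gap = daysApart(a, b);
        return Math.max(0.0, 1.0 - gap / windowDays);
    }

    public static void main(String[] args) {
        long window = 365; // assumed window; tune to taste
        String[] dates = {"07/15/2019", "01/01/2020", "01/02/2020"};
        // Print the score for every unordered pair of dates.
        for (int i = 0; i < dates.length; i++) {
            for (int j = i + 1; j < dates.length; j++) {
                System.out.printf("%s vs %s: %d days apart, score %.3f%n",
                        dates[i], dates[j],
                        daysApart(dates[i], dates[j]),
                        score(dates[i], dates[j], window));
            }
        }
    }
}
```

With a rule like this, 01/01/2020 and 01/02/2020 are 1 day apart while 07/15/2019 and 01/01/2020 are 170 days apart, so the first pair scores higher. A cutoff applied to such a score would then separate the two cases, which a flat 1.0 for every pair cannot do.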