Open sainyam opened 1 year ago
The current implementation treats dates as strings: hyphens are replaced by spaces and a token-based similarity is computed on the result. Applying a similarity threshold to these tokens generates many false positives, e.g. 13/jan/2023 and 13-jan-2020 are considered a join match even though they are different dates.
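To make the problem concrete, here is a minimal sketch, not the project's actual code. It assumes the similarity is Jaccard over tokens and that both `-` and `/` end up acting as separators (otherwise the two example values would share no tokens). The function names and the day-month-year layout are hypothetical, chosen only to match the example above.

```python
import re
from datetime import date

# Hypothetical lookup for English three-letter month abbreviations.
MONTHS = {m: i for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

SEPARATORS = re.compile(r"[-/\s]+")  # assumption: '-' and '/' both split tokens


def tokens(value):
    """Split a value into a set of lowercase tokens."""
    return set(SEPARATORS.split(value.strip().lower()))


def jaccard(a, b):
    """Token-based Jaccard similarity between two strings."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


def parse_date(value):
    """Parse 'day<sep>mon<sep>year' into a date, or None if it does not fit."""
    parts = SEPARATORS.split(value.strip().lower())
    if len(parts) != 3 or parts[1] not in MONTHS:
        return None
    try:
        return date(int(parts[2]), MONTHS[parts[1]], int(parts[0]))
    except ValueError:
        return None


def dates_match(a, b):
    """Date-aware comparison: match only if both parse to the same day."""
    da, db = parse_date(a), parse_date(b)
    return da is not None and da == db


print(jaccard("13/jan/2023", "13-jan-2020"))      # two of four distinct tokens shared
print(dates_match("13/jan/2023", "13-jan-2020"))  # different years
print(dates_match("13/jan/2023", "13-jan-2023"))  # same date, different separators
```

Under these assumptions the string view scores the pair at 0.5, so any threshold at or below that would join them, while comparing parsed dates rejects the pair and still accepts the same date written with different separators.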