Open lduf opened 4 months ago
AI-Assisted Response:
The issue describes an existing system that compares fields using one of two algorithms: one based on Levenshtein distance, which compares all fields with a weighted comparison, and another based on phonetic comparison, which only compares the Best Descriptive Value (BDV). The algorithm currently has to be selected manually in the code; the desired solution is to let users choose it dynamically through the user interface.
The user wants to enhance the system by allowing end-users to select the comparison algorithm of their choice using a dropdown menu in the interface. This feature would be part of a team's settings and could affect how duplicates are identified within the team's data.
ListControl
model.forms.form
dropdown element.
Algorithms used in this feature should share the same structure (method signatures and return types) so that they are interchangeable.
This is a feature request.
🌍 #479 should implement new tests to check:
* same items have high similarity
* two very different items have low similarity
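The two checks above could be sketched as follows. This is only an illustration, not the project's test suite: the `similarity` function, its [0, 1] scale, the 0.3 threshold, and the sample strings are all assumptions made for the example, and a plain normalized Levenshtein distance stands in for whichever algorithm is actually under test.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical score in [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def test_same_items_have_high_similarity():
    assert similarity("Jean Dupont", "Jean Dupont") == 1.0


def test_very_different_items_have_low_similarity():
    # The 0.3 threshold is an arbitrary choice for this sketch.
    assert similarity("Jean Dupont", "xyz 42") < 0.3
```

Written this way, the same two tests can be run against each algorithm, as long as all of them return a score on the same scale.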
Description of the desired solution
We had a first algo that used the Levenshtein distance between different fields. This method had the advantage of comparing all fields, not just the best descriptive value (BDV), and of applying a weighted comparison. Then we worked on a solution that used phonetics to compare the different items, but this method only compared the BDV. For now, it's up to us to choose (in the code) which algo to run. And once we've chosen an algo, that's the one that's run, period. What would be nice is to let the user choose the algo.
In duplicate, we could imagine a dropdown listing the algos, with the user choosing which algo to run on their team's data.
Resolution path
Off the top of my head:
This is just a resolution idea that might need to be discussed. Duplicate algorithms should have the same structure (same signature, same return type, ...) so that calling one or another is not a problem.
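To make the "same signature, same return type" idea concrete, here is a minimal sketch of two interchangeable algorithms behind a lookup table keyed by the value a dropdown would store. Everything in it is hypothetical: the names (`ALGORITHMS`, `compare`, `weighted_fields`, `phonetic_bdv`), the fields and weights, and the assumption that the BDV is the `name` field. `difflib.SequenceMatcher` stands in for the Levenshtein comparison, and the phonetic key is a crude placeholder, not the project's phonetic method.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Mapping

Item = Mapping[str, str]
# Shared signature: two items in, one similarity score in [0, 1] out.
DuplicateAlgorithm = Callable[[Item, Item], float]


def weighted_fields(a: Item, b: Item) -> float:
    """Stand-in for the first algo: weighted comparison over all fields."""
    weights = {"name": 0.7, "city": 0.3}  # hypothetical fields and weights
    return sum(
        weight * SequenceMatcher(None, a.get(field, ""), b.get(field, "")).ratio()
        for field, weight in weights.items()
    )


def _phonetic_key(text: str) -> str:
    """Crude placeholder key: first letter plus the consonants that follow."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return ""
    return letters[0] + "".join(c for c in letters[1:] if c not in "aeiouyhw")


def phonetic_bdv(a: Item, b: Item) -> float:
    """Stand-in for the second algo: compares only the BDV (assumed: 'name')."""
    same = _phonetic_key(a.get("name", "")) == _phonetic_key(b.get("name", ""))
    return 1.0 if same else 0.0


# The dropdown would list these keys; the team setting stores the chosen one.
ALGORITHMS: Dict[str, DuplicateAlgorithm] = {
    "weighted_fields": weighted_fields,
    "phonetic_bdv": phonetic_bdv,
}


def compare(algo_key: str, a: Item, b: Item) -> float:
    """Run whichever algorithm the team selected."""
    try:
        algorithm = ALGORITHMS[algo_key]
    except KeyError:
        raise ValueError(f"Unknown duplicate algorithm: {algo_key!r}") from None
    return algorithm(a, b)
```

With this shape, the calling code never needs to know which algorithm is active, and adding a third one means adding one function and one dictionary entry.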
If the request is associated with a problem, please specify it.
Additional information