Open rlichainfotel opened 1 year ago
The branch is: https://github.com/infotel4iarc/CanReg5/tree/C202304_duplication_search
A script that generates random records has been created to populate the database with a large number of records. With that many records, differences in execution time become clearly visible, which makes timing comparisons easier.
A check box is added to the search variable panel to lock the variable during duplication search.
The blocking feature is now functional for person search. The search variables can be changed in tools -> database structure; however, a restart is necessary for the modification to take effect. The blocking feature keeps only the records whose value for the blocked variable exactly matches the original record's value.
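A minimal sketch of the exact-match blocking described above. It assumes records are plain maps from variable name to value; `BlockingFilter`, `block`, and the variable names `Sex` and `FamN` are hypothetical and are not taken from the CanReg5 code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper, not CanReg5 code: keeps only the candidates whose
// blocked variables have exactly the same value as the original record.
final class BlockingFilter {

    static List<Map<String, String>> block(Map<String, String> original,
                                           List<Map<String, String>> candidates,
                                           Set<String> blockedVariables) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> candidate : candidates) {
            boolean matches = true;
            for (String variable : blockedVariables) {
                // Objects.equals also handles variables missing on either side.
                if (!Objects.equals(original.get(variable), candidate.get(variable))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> original = Map.of("Sex", "1", "FamN", "Smith");
        List<Map<String, String>> candidates = List.of(
                Map.of("Sex", "1", "FamN", "Smyth"),
                Map.of("Sex", "2", "FamN", "Smith"));
        // Only the first candidate survives when "Sex" is blocked.
        System.out.println(block(original, candidates, Set.of("Sex")));
    }
}
```

In a real setup the same condition would more likely be pushed into the database query, so that non-matching records are never loaded; the in-memory filter only illustrates the rule.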
An inconvenience was noticed during the test:
The issue with the loading modal not being displayed when the user clicks the "person search" button has been fixed. The problem came from triggering both the search and the waitFrame from the same action. The waitFrame could not be displayed before the database search was executed because everything happens within an Action. As "showing the waitFrame" is also an action, it was added to the execution queue after the action that searches for duplicates. As a result, the waitFrame showed up right after the search was completed and disappeared instantly.
Using a SwingWorker solved the issue: a SwingWorker runs its work in a background thread, so the search can proceed in parallel with the "person search" action and the waitFrame can be displayed in the meantime.
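The SwingWorker approach can be sketched roughly as follows. This is an illustration only, not the code from the branch: `PersonSearchWorkerSketch`, `searchDuplicates`, and `newSearchWorker` are hypothetical names, and the database search is replaced by a short sleep.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.function.Consumer;
import javax.swing.JDialog;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;
import javax.swing.SwingWorker;

final class PersonSearchWorkerSketch {

    // Hypothetical stand-in for the slow database search.
    static List<String> searchDuplicates() throws InterruptedException {
        Thread.sleep(500);
        return List.of("record 1", "record 2");
    }

    static SwingWorker<List<String>, Void> newSearchWorker(
            Runnable closeWaitFrame, Consumer<List<String>> showResults) {
        return new SwingWorker<List<String>, Void>() {
            @Override
            protected List<String> doInBackground() throws Exception {
                // Runs on a worker thread, so the event dispatch thread
                // stays free to paint the wait frame.
                return searchDuplicates();
            }

            @Override
            protected void done() {
                // Runs on the event dispatch thread once the search is over.
                closeWaitFrame.run();
                try {
                    showResults.accept(get());
                } catch (InterruptedException | ExecutionException e) {
                    e.printStackTrace();
                }
            }
        };
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JDialog waitFrame = new JDialog(); // non-modal by default
            waitFrame.add(new JLabel("Searching..."));
            waitFrame.pack();
            newSearchWorker(waitFrame::dispose,
                    results -> System.out.println(results)).execute();
            waitFrame.setVisible(true);
        });
    }
}
```

Closing the wait frame inside `done()` before handing over the results keeps the ordering explicit: the frame is visible for the whole search and is gone by the time the results are shown.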
Update: The loading frame is now closed before the results of the PersonSearch are displayed. This required some refactoring: the runPersonSearch() method was split in two, and the Java documentation was updated.
Update: It is now possible to specify a margin of error, in years, on the date of birth. Instead of a default range of 1 year on unblocked personSearch, the user can select the margin of error themselves. If the date has been blocked, only records with dates around the selected date will be fetched from the database.
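One possible reading of the year margin, as a hedged sketch: a candidate is kept when its birth year lies within the chosen number of years of the reference birth year. Comparing only the year component is an assumption, and `BirthYearMargin` and `withinMargin` are hypothetical names.

```java
import java.time.LocalDate;

// Hypothetical helper, not CanReg5 code.
final class BirthYearMargin {

    // True when the candidate's birth year is within +/- marginYears
    // of the reference birth year (month and day are ignored here).
    static boolean withinMargin(LocalDate reference, LocalDate candidate, int marginYears) {
        return Math.abs(reference.getYear() - candidate.getYear()) <= marginYears;
    }

    public static void main(String[] args) {
        LocalDate reference = LocalDate.of(1980, 6, 15);
        LocalDate candidate = LocalDate.of(1982, 1, 1);
        System.out.println(withinMargin(reference, candidate, 2)); // prints true
        System.out.println(withinMargin(reference, candidate, 1)); // prints false
    }
}
```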
Update:
A matching algorithm already exists. It uses weights on different variables to compute a matching score between records that are possible (not necessarily exact) duplicates. For example, Soundex is used for name variables.
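To make the idea concrete, here is a small sketch of a weighted score in which name variables are compared by Soundex code and the other variables by exact equality. The scoring formula (sum of the weights of agreeing variables divided by the total weight), the class `WeightedMatchSketch`, and the variable names are assumptions for illustration; the actual CanReg5 scoring may differ. The `soundex` method follows the standard American Soundex rules.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the CanReg5 implementation.
final class WeightedMatchSketch {

    // Standard American Soundex: first letter plus three digits.
    static String soundex(String name) {
        String codes = "01230120022455012623010202"; // codes for A..Z
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (char ch : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (ch < 'A' || ch > 'Z') continue;        // ignore non-letters
            char code = codes.charAt(ch - 'A');
            if (out.length() == 0) {
                out.append(ch);                        // keep the first letter
                last = code;
            } else if (ch != 'H' && ch != 'W') {       // H and W do not separate codes
                if (code != '0' && code != last) out.append(code);
                last = code;                           // vowels reset the last code
            }
            if (out.length() == 4) break;
        }
        while (out.length() < 4) out.append('0');      // pad; an empty name gives "0000"
        return out.toString();
    }

    // Assumed formula: weight of agreeing variables / total weight.
    static double score(Map<String, String> a, Map<String, String> b,
                        Map<String, Double> weights, Set<String> soundexVariables) {
        double total = 0.0;
        double matched = 0.0;
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            String variable = entry.getKey();
            String va = a.getOrDefault(variable, "");
            String vb = b.getOrDefault(variable, "");
            boolean agree = soundexVariables.contains(variable)
                    ? soundex(va).equals(soundex(vb))
                    : va.equals(vb);
            total += entry.getValue();
            if (agree) matched += entry.getValue();
        }
        return total == 0.0 ? 0.0 : matched / total;
    }

    public static void main(String[] args) {
        Map<String, String> a = Map.of("FamN", "Smith", "Sex", "1", "BirthYear", "1980");
        Map<String, String> b = Map.of("FamN", "Smyth", "Sex", "1", "BirthYear", "1981");
        Map<String, Double> weights = Map.of("FamN", 5.0, "Sex", 1.0, "BirthYear", 4.0);
        // FamN agrees by Soundex (S530), Sex agrees exactly, BirthYear differs.
        System.out.println(score(a, b, weights, Set.of("FamN"))); // prints 0.6
    }
}
```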
The objective is to improve the algorithm's performance:
Make sure to take measurements before and after each improvement to demonstrate the progress.