Describe the bug
Duplicates are not found when running on certain test data supplied by Octavian Chiorcea [coctavius@mdinteractive.com](mailto:coctavius@mdinteractive.com).

To Reproduce
Steps to reproduce the behavior:
poetry run python ecqm_dedupe.py dedupe-data --fmt CSV /tmp/test/4/small.csv /tmp/test/

Expected behavior
The script should output an Excel file with all of the duplicates identified.

Actual behavior
In the results xlsx file, only the records with wrong names are detected as duplicates; the records with the correct names appear to be assigned a different cluster ID. Perhaps this happens because birth_date is not used to detect duplicates. I tried changing deduplifhirLib/settings.py, but changing the configuration there did not seem to have any effect.
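
To help narrow this down, here is a minimal sketch of a sanity check on the input file. It is not part of ecqm_dedupe.py. It lists the records that share a birth date, so they can be compared against the cluster IDs in the results file. It assumes the CSV has a header row with a column named `birth_date`, which may not match the actual test data. The function names `load_rows` and `group_by_birth_date` are hypothetical.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def group_by_birth_date(rows, key="birth_date"):
    """Group rows by birth date and keep only dates shared by 2+ rows.

    The column name "birth_date" is an assumption about the input file.
    Rows with a missing or empty birth date are skipped.
    """
    groups = defaultdict(list)
    for row in rows:
        value = (row.get(key) or "").strip()
        if value:
            groups[value].append(row)
    return {date: members for date, members in groups.items() if len(members) > 1}
```

Calling `group_by_birth_date(load_rows("/tmp/test/4/small.csv"))` would return the candidate groups. If records in the same group end up with different cluster IDs in the xlsx output, that would support the guess that birth_date is not contributing to the match.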