Closed ghost closed 6 years ago
@syk0saje Will look into this tomorrow!
@syk0saje, you will get more than one row in the results. The matcher returns each interlevel match as a separate row. So in this case, if every column has a match, it returns three rows (more if a column has multiple matches): one each for barangay, municity, and province.
For your test dataset, the entries are all pretty close and every field gets a match, so we'll get four sets of rows like this one (one set per input row):
index  code       score  location      interlevel  province_code  city_municipality_code
0      012800000  100    ILOCOS NORTE  Prov        012800000      012800000
0      012801000  100    ADAMS         Mun         012800000      012801000
0      012801001  100    ADAMS (POB.)  Bgy         012800000      012801000
After 0.2, I'm planning to refactor the code to decouple the matching algorithm from the app itself, so it'll be easier for us to test other matching algorithms that we can think of.
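To make the decoupling idea concrete, here is a rough sketch of what a pluggable scorer could look like. Everything in it is an assumption for illustration: the `Scorer` type, `exact_scorer`, `ratio_scorer`, and `best_match` are hypothetical names, not the project's actual code, and `difflib` is only a stand-in for whatever matching algorithm ends up being used.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# Hypothetical interface: a scorer takes (query, candidate) and returns 0-100.
Scorer = Callable[[str, str], int]


def exact_scorer(query: str, candidate: str) -> int:
    """Score 100 only when the strings are equal ignoring case and padding."""
    return 100 if query.strip().upper() == candidate.strip().upper() else 0


def ratio_scorer(query: str, candidate: str) -> int:
    """Fuzzy score based on difflib's similarity ratio, scaled to 0-100."""
    ratio = SequenceMatcher(None, query.upper(), candidate.upper()).ratio()
    return round(100 * ratio)


def best_match(query: str, candidates: List[str], scorer: Scorer) -> Tuple[str, int]:
    """Return the (candidate, score) pair with the highest score.

    The calling code never needs to know which scorer is in use.
    """
    return max(((c, scorer(query, c)) for c in candidates), key=lambda pair: pair[1])


candidates = ["ADAMS", "BACARRA"]
exact_result = best_match("adams", candidates, exact_scorer)
fuzzy_result = best_match("adams", candidates, ratio_scorer)
```

With this shape, trying a new algorithm would only mean writing another function with the same signature and passing it to `best_match`.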
I see. This does not fulfill the following spec though: "If it's an exact string match, do not include near matches". How do we fix this so that it only returns the 3 exact matches?
There are four rows in the test dataset, so it returns matches for all of those rows: three result rows per input row, since we were able to match them all with no multiple matches. If the test dataset had just the first row, we'd get just the three rows above.
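For reference, the "exact matches only" rule from the spec could be checked with something like the sketch below. It assumes a score of 100 means an exact string match (as in the rows above) and uses plain dicts rather than the app's real result type. The function name and the fourth, lower-scoring row are made up for illustration.

```python
from collections import defaultdict

EXACT_SCORE = 100  # assumption: 100 denotes an exact string match


def keep_exact_when_available(rows):
    """For each (index, interlevel) group, keep only the exact matches if any
    exist; otherwise keep all of that group's near matches."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["index"], row["interlevel"])].append(row)

    kept = []
    for candidates in groups.values():
        exact = [r for r in candidates if r["score"] == EXACT_SCORE]
        kept.extend(exact if exact else candidates)
    return kept


rows = [
    {"index": 0, "code": "012800000", "score": 100, "location": "ILOCOS NORTE", "interlevel": "Prov"},
    {"index": 0, "code": "012801000", "score": 100, "location": "ADAMS", "interlevel": "Mun"},
    {"index": 0, "code": "012801001", "score": 100, "location": "ADAMS (POB.)", "interlevel": "Bgy"},
    # Hypothetical near match, not from the real dataset; it should be dropped.
    {"index": 0, "code": "000000000", "score": 90, "location": "PLACEHOLDER", "interlevel": "Bgy"},
]

result = keep_exact_when_available(rows)
```

For a single input row with exact matches at all three interlevels, this leaves exactly three rows, which is what the test for one row should expect. With four such input rows it would leave twelve.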
Ahh, got it. Thanks. Will revise and make a new pull request.
Need feedback on why the test is failing.