maxachis closed this issue 3 years ago
Went ahead and looked at the map. Of the two issues, "Bailey Food Mart" appears only once (though misspelled as "Bailie"), so we could maybe remove that test case, although it might still be worth keeping regardless of whether the duplicate currently appears. The other, "Bellevue Farmer's Market", appears twice on the map.
@cgmoreno proposed that we maybe split the tests into two areas: the ones we know pass, and the ones that don't currently pass. The tests that don't pass could be commented out and replaced with an assert true for now, to cut down on noise. Something to consider!
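One possible way to get that split without commenting anything out is to mark the known failures as expected failures, so they still run but don't turn the check red. A minimal sketch using the standard library's unittest: `find_duplicate_names` is a hypothetical stand-in for the deduper (not the project's actual function), the class names are made up, and the store names are only illustrative inputs. Note this swaps in `unittest.expectedFailure` for the comment-out-and-assert-true idea.

```python
import unittest


def find_duplicate_names(names):
    """Hypothetical stand-in for the deduper: exact match after lowercasing only."""
    seen, dupes = set(), set()
    for name in names:
        key = name.strip().lower()
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes


class PassingTests(unittest.TestCase):
    """Tests we know pass today."""

    def test_exact_duplicate_is_flagged(self):
        names = ["Example Market", "Example Market"]
        self.assertEqual(find_duplicate_names(names), {"example market"})


class KnownFailures(unittest.TestCase):
    """Tests that don't pass yet; kept running but not counted as failures."""

    @unittest.expectedFailure
    def test_misspelled_duplicate_is_flagged(self):
        # The exact-match stand-in cannot catch a misspelling, so this fails as expected.
        names = ["Bailey Food Mart", "Bailie Food Mart"]
        self.assertTrue(find_duplicate_names(names))


def run_suite():
    """Run both groups and return the unittest.TestResult."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(PassingTests))
    suite.addTests(loader.loadTestsFromTestCase(KnownFailures))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The upside over assert true is that if the deduper later starts catching one of these cases, the run reports an unexpected success, which is the signal to move that test into the passing group.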
Current status: the new deduper is working well in testing, but for whatever reason data isn't being merged when I run the merged_dataset action, at least not in the existing branch. I need to figure that out before I merge it with everything else.
After looking at it more closely, data is being merged, just not all the rows I was expecting. That's more a matter for additional training than a bug in the system, so I'll move ahead with merging it.
Completed pull request merging branches. Case closed.
Added two tests to test_id_duplicates.py. Both failed.
https://github.com/CodeForPittsburgh/food-access-map-data/runs/2937197991?check_suite_focus=true
One caveat: test_id_duplicates.py was updated by Will in his short yet productive time here, so it's possible I'm applying his tests incorrectly. That would need to be examined.
But if not, we probably want to either fix these or decide these are acceptable issues for us to live with. We should also check the map to confirm that these duplicates do indeed appear on the map.