As mentioned in #358, searching playerid_lookup("tatis", "fernando", fuzzy=True) currently returns duplicate rows for Fernando Tatís Jr and Sr. The search falls through to fuzzy matching because there is no exact match: the correct spelling is Tatís, with an accented í, not Tatis. Since the Chadwick names for Tatís Jr and Sr are identical, 'Fernando Tatís' takes up 2 of the 5 names in fuzzy_matches when it is merged with the player table in get_closest_names(). Each copy of the name matches the table rows for both Tatís Jr and Sr, so each player appears twice.
The change I made drops the duplicate name before the merge (so fuzzy_matches has 4 names instead of 5), and the single remaining copy of the name matches the rows for both Jr and Sr. Because that one name matches both players, the merge still returns 5 players as expected. The same behavior can be seen with a fuzzy search for Vladimir Guerrero Jr and Sr, such as playerid_lookup("guerrero", "vladimi", fuzzy=True).
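To illustrate the effect, here is a minimal pandas sketch of the merge behavior. It is not the actual code in get_closest_names(): the column names (chadwick_name, player_id), the IDs, and the placeholder player names are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical stand-in for the player table: two players share one name.
players = pd.DataFrame({
    "player_id": [1, 2, 3, 4, 5],  # made-up IDs, for illustration only
    "chadwick_name": [
        "fernando tatís",  # Jr
        "fernando tatís",  # Sr
        "other player a",
        "other player b",
        "other player c",
    ],
})

# Hypothetical top-5 fuzzy matches: the shared name appears twice.
fuzzy_matches = pd.DataFrame({
    "chadwick_name": [
        "fernando tatís",
        "fernando tatís",
        "other player a",
        "other player b",
        "other player c",
    ],
})

# Before: each copy of the name matches both players, so Jr and Sr appear twice.
before = fuzzy_matches.merge(players, on="chadwick_name")

# After: drop the repeated name first, so the one copy matches each player once.
deduped = fuzzy_matches.drop_duplicates(subset="chadwick_name")
after = deduped.merge(players, on="chadwick_name")

print(len(before), len(after))
```

In this sketch the merge without deduplication yields 7 rows (2 copies of the name times 2 players, plus 3 others), while deduplicating first yields 5 rows with each player listed once.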