The best approach is generally to have a canonical name for each team, and a lookup table (probably stored in a .csv file) that maps alternate spellings to your canonical names. Then you use that table to link datasets.
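Here is a minimal sketch of that lookup-table idea in plain Python. The alias rows, the column names `alias` and `canonical`, and the helper names `load_aliases` and `canonical_name` are all made up for illustration. In practice the table would sit in a .csv file you maintain yourself, and the rows would come from the spellings that actually differ between your two datasets.

```python
import csv
import io

# Hypothetical lookup table. In practice this would be a .csv file on disk;
# it is inlined here so the example is self-contained.
ALIASES_CSV = """alias,canonical
Man United,Manchester United
Manchester Utd,Manchester United
Bayern Munich,Bayern München
"""


def load_aliases(fileobj):
    """Return a dict mapping each case-folded alias to its canonical name."""
    reader = csv.DictReader(fileobj)
    return {row["alias"].strip().casefold(): row["canonical"].strip() for row in reader}


def canonical_name(name, aliases):
    """Return the canonical name, or the stripped input if no alias matches."""
    cleaned = name.strip()
    return aliases.get(cleaned.casefold(), cleaned)


aliases = load_aliases(io.StringIO(ALIASES_CSV))

# Illustrative team lists standing in for the two datasets.
dataset1_teams = ["Manchester United", "Bayern Munich"]
dataset2_teams = ["Man United", "Bayern München"]

norm1 = {canonical_name(t, aliases) for t in dataset1_teams}
norm2 = {canonical_name(t, aliases) for t in dataset2_teams}

# Names that appear in only one dataset after normalization still need
# a row in the lookup table (or are genuinely missing from the other set).
unmatched = norm1 ^ norm2
print(sorted(unmatched))
```

Checking `unmatched` after each run tells you which spellings still need a row in the table, so you only look at the names that actually fail to link instead of comparing everything by hand.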
Hey folks,
So my main question is basically: how can I normalize the team names in the two datasets easily, without having to analyze all the differences "by hand" and hardcode "replace" operations on one of them?
Dataset1 is downloadable here: https://data.fivethirtyeight.com/#soccer-spi
Dataset2 is not available freely, but it looks like this: