JoGall / soccermatics

Tools for visualisation and analysis of soccer tracking and event data

Parse non-ASCII characters in StatsBomb player names #12

Open JoGall opened 6 years ago

JoGall commented 6 years ago

Parse non-ASCII characters in StatsBomb (and other) data for use with soccerPosition and other future plotting functions. For example, Kylian Mbappé currently renders as Kylian MbappÃ©.
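
A minimal base-R sketch of how this kind of garbling typically arises and how it can be reversed. It assumes the cause is UTF-8 bytes being read as Latin-1, which is the usual culprit but is not confirmed for the StatsBomb data in this thread. The variable names are illustrative only.

```r
# A correctly encoded UTF-8 name (\u00e9 is e-acute)
name <- "Kylian Mbapp\u00e9"

# Simulate the bug: take the UTF-8 bytes and wrongly declare them Latin-1
garbled <- rawToChar(charToRaw(enc2utf8(name)))
Encoding(garbled) <- "latin1"
garbled <- enc2utf8(garbled)   # the two bytes of e-acute are now two characters

# Reverse it: map each character back to a single byte, then mark as UTF-8
repaired <- iconv(garbled, from = "UTF-8", to = "latin1")
Encoding(repaired) <- "UTF-8"

print(garbled)
print(repaired)
```

If the data really is mis-declared in this way, the repair step keeps the accent intact, which may be preferable to stripping it when the names are used as plot labels.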

Ryo-N7 commented 5 years ago

hey! was doing something similar at work so thought I might throw in a few functions that might help:

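library(dplyr)  # provides %>% and mutate()

x <- c(
    "Hello World", "6 Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher",
    'This is a \xA9 but not a \xAE', '6 \xF7 2 = 3',
    'fractions \xBC, \xBD, \xBE', 'cows go \xB5', '30\xA2')
# the \x escapes above are Latin-1 bytes, so declare that encoding explicitly
Encoding(x) <- "latin1"

# transliterate accented/special characters to their closest ASCII equivalents
data.frame(thingy = x, stringsAsFactors = FALSE) %>%
    mutate(thingy2 = stringi::stri_trans_general(thingy, "latin-ascii"))
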
JoGall commented 5 years ago

Thanks for this Ryo, super useful! Hopefully finally get round to fixing this today.