beanumber closed this 4 years ago
For NCAA names I use the names used on ncaa.com. In the dict dataset of ncaahoopR, I have alternate spellings for several other sites. With that, plus a little bit of manual work, I should be able to help get most of the missing ones. I'm planning on adding some missing teams to the NCAA color palettes, though that will require more manual work with Chrome developer tools to get the hex codes from logos. In any case, it's all on my to-do list for the upcoming NCAA hoops season.
Ah, OK. I was using this list:
https://en.wikipedia.org/wiki/List_of_NCAA_Division_I_institutions
Check out what I did here for MLB teams. I'd recommend adding something to reduce an arbitrary string to a canonical version of the school's name. So that, for example,
standardize_name(c(
"UNC", "University of North Carolina", "North Carolina",
"University of North Carolina-Chapel Hill", "University of North Carolina, Chapel Hill"
))
would always return north-carolina tar-heels. That way, you have some hope of matching the names from any source.
Not saying it will be easy...
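To make the suggestion above concrete, here is a rough sketch in R of one way such a function could work: normalize each input string to a comparison key, then look it up in an alias table. Everything here is an assumption for illustration only. The names aliases, normalize_key, lookup, and standardize_name_sketch are hypothetical, the alias table holds just the spellings from the example above, and this is not how any existing package does it.

```r
# Hypothetical alias table: canonical name -> known alternate spellings.
# Only the spellings from the example in this thread are included; a real
# table would have to be assembled by hand or from other sources.
aliases <- list(
  "north-carolina tar-heels" = c(
    "UNC", "University of North Carolina", "North Carolina",
    "University of North Carolina-Chapel Hill",
    "University of North Carolina, Chapel Hill"
  )
)

# Reduce a string to a comparison key: lowercase, replace every run of
# non-alphanumeric characters with a single space, trim the ends.
normalize_key <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", " ", x)
  trimws(x)
}

# Flatten the alias table into a named character vector (key -> canonical).
# The canonical name itself is also registered as a key.
lookup <- unlist(lapply(names(aliases), function(canon) {
  keys <- normalize_key(c(canon, aliases[[canon]]))
  setNames(rep(canon, length(keys)), keys)
}))

# Map arbitrary strings to canonical names; unknown strings become NA.
standardize_name_sketch <- function(x) {
  unname(lookup[normalize_key(x)])
}

standardize_name_sketch(c(
  "UNC", "University of North Carolina", "North Carolina",
  "University of North Carolina-Chapel Hill",
  "University of North Carolina, Chapel Hill"
))
```

Returning NA for anything not in the table is a deliberate choice in this sketch: it makes the unmatched schools easy to list, so the manual work can focus on them. The hard part, as noted, is filling in the alias table itself.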
Thanks, can try and do something like that.
Ugh, this is going to require some name standardization. Maybe IPEDS can help?