For example, in the XML:
<gco:CharacterString>TC - Canada’s
National Highway System</gco:CharacterString>
I used this sort of thing to clean it up:
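# Python 3; on Python 2 use: from HTMLParser import HTMLParser; unescape = HTMLParser().unescape
from html import unescape
confused = '''TC - Canada’s
National Highway System'''
# Resolve any HTML entities, then reverse the mis-decoding: the text is UTF-8
# bytes that were read as cp1252, so encode back to cp1252 and decode as UTF-8.
# Finally strip each line and join the lines with single spaces.
print(' '.join(p.strip() for p in unescape(confused).encode('cp1252').decode('utf8').split('\n')))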
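The same round trip can be wrapped in a small helper so that values which are not mojibake pass through untouched. This is only a sketch, assuming Python 3; `fix_mojibake` is a hypothetical name, and the fallback to the original text on an encoding error is my own addition rather than part of the snippet above.

```python
from html import unescape


def fix_mojibake(text):
    """Repair UTF-8 text that was wrongly decoded as cp1252 (hypothetical helper)."""
    # Turn entities such as &acirc;&euro;&trade; into the characters they name.
    text = unescape(text)
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really are.
        repaired = text.encode('cp1252').decode('utf8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Assumption: if the round trip fails, the text was not cp1252 mojibake,
        # so return it unchanged instead of raising.
        repaired = text
    # Strip each line and join the lines with single spaces, as in the one-liner.
    return ' '.join(p.strip() for p in repaired.split('\n'))
```

Called on the title above, `fix_mojibake('TC - Canada’s\nNational Highway System')` returns `'TC - Canada’s National Highway System'`. A string that is already clean, such as `'Café'`, comes back unchanged because the UTF-8 decode fails and the helper falls back to the input.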