Three of the inaugural files have unanticipated encodings:
2005-Bush.txt - uses Macintosh Chinese Traditional encoding.
2013-Obama.txt - uses UTF-8 encoding.
2017-Trump.txt - uses UTF-8 encoding.
These files will produce character and word errors if read in as ISOLatin1.
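As a rough illustration of the kind of error involved, the Python sketch below decodes the same UTF-8 bytes once as Latin-1 and once as UTF-8. The helper name decode_speech and the sample string are assumptions made for this example; they do not come from the data set, and 2005-Bush.txt needs the separate byte-level handling described in this note.

```python
# Files listed above as UTF-8; everything else is assumed to be ISO Latin-1.
# (2005-Bush.txt is a special case and is not handled by this helper.)
UTF8_FILES = {"2013-Obama.txt", "2017-Trump.txt"}


def decode_speech(filename: str, raw: bytes) -> str:
    """Decode raw file bytes, using UTF-8 only for the files listed above."""
    encoding = "utf-8" if filename in UTF8_FILES else "latin-1"
    return raw.decode(encoding)


# Made-up sample text containing a curly apostrophe (U+2019).
raw = "America’s".encode("utf-8")

wrong = raw.decode("latin-1")                 # one character becomes three
right = decode_speech("2013-Obama.txt", raw)  # curly apostrophe preserved
```

Here the single apostrophe is stored as three bytes, so a Latin-1 reader sees three separate characters in the middle of the word, which is how both character and word errors arise.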
In Macintosh Chinese Traditional encoding, a byte with decimal value 161 acts as an escape character that introduces a two-byte sequence. The following two-byte translations are needed for the 2005-Bush.txt file:
{161,88} to -
{161,166} to '
{161,75} to ...
{161,167} to "
{161,168} to "
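The translations above could be applied directly to the raw bytes before any further decoding. The following is a minimal Python sketch, not the method used to prepare the files; the names TRANSLATIONS and translate_escapes are hypothetical, and the table contains only the five pairs listed above.

```python
# Two-byte sequences (decimal values) and their ASCII replacements,
# copied from the list above.
TRANSLATIONS = {
    bytes([161, 88]): b"-",
    bytes([161, 166]): b"'",
    bytes([161, 75]): b"...",
    bytes([161, 167]): b'"',
    bytes([161, 168]): b'"',
}


def translate_escapes(raw: bytes) -> bytes:
    """Replace the known two-byte sequences; copy all other bytes unchanged."""
    out = bytearray()
    i = 0
    while i < len(raw):
        pair = raw[i:i + 2]
        if pair in TRANSLATIONS:
            out += TRANSLATIONS[pair]
            i += 2  # skip both bytes of the sequence
        else:
            out.append(raw[i])
            i += 1
    return bytes(out)


# Made-up sample bytes, not taken from the speech itself.
sample = b"freedom" + bytes([161, 88]) + b"and liberty" + bytes([161, 75])
cleaned = translate_escapes(sample)
```

Scanning left to right, rather than calling bytes.replace once per pair, keeps each replacement aligned to the start of a two-byte sequence. Any 161-prefixed pair not in the table is passed through untouched, so unexpected sequences remain visible for inspection.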