I was seeing some interesting behavior when Python 2 had only UCS-2 Unicode support: it was taking the UTF-16 hex codes (`\uD859` and `\uDFCC`) and converting them to the UTF-32 hex code (`\U000267cc`) behind the scenes. I have methods like `repr_string` and `repr_bytes`, and I might want to add some UTF-8 (bytes), UTF-16 (the `\u` values), and UTF-32 (the `\U` values) methods so you can get more information about the character. To see how all these come together, you can use fileformat.info; these are some pages I had open:
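For anyone who wants to check the surrogate-pair arithmetic by hand, here is a minimal sketch, assuming Python 3. The names `combine_surrogates` and `describe_char` are hypothetical, made up for this illustration; they are not the existing `repr_string` / `repr_bytes` methods, just a rough idea of what UTF-8 / UTF-16 / UTF-32 helpers could look like.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single Unicode code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def describe_char(char):
    """Return the UTF-8, UTF-16, and UTF-32 forms of one character.

    Hypothetical helper for illustration only.
    """
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    utf16 = char.encode("utf-16-be")
    utf32 = char.encode("utf-32-be")
    return {
        "utf8": char.encode("utf-8"),
        # One \uXXXX escape per 16-bit code unit.
        "utf16": "".join(
            "\\u{:04x}".format(int.from_bytes(utf16[i:i + 2], "big"))
            for i in range(0, len(utf16), 2)
        ),
        # One \UXXXXXXXX escape for the single 32-bit code unit.
        "utf32": "\\U{:08x}".format(int.from_bytes(utf32, "big")),
    }


code_point = combine_surrogates(0xD859, 0xDFCC)
info = describe_char(chr(code_point))

print(hex(code_point))  # 0x267cc
print(info["utf8"])     # b'\xf0\xa6\x9f\x8c'
print(info["utf16"])    # \ud859\udfcc
print(info["utf32"])    # \U000267cc
```

The two `\u` values and the single `\U` value describe the same character; only the size of the code unit changes.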