bbcmicrobit / micropython

Port of MicroPython for the BBC micro:bit
https://microbit-micropython.readthedocs.io

degree symbol missing from version 2 #774

Closed rhubarbdog closed 1 year ago

rhubarbdog commented 2 years ago

When I print a temperature or angle I like to use the degree symbol °. I use u'\xb0', which works with version 1 micro:bits.

Whilst on version 2 micro:bits, print("21%sC" % u'\xb0') yields 21�c

microbit-carlos commented 1 year ago

Thanks for the report @rhubarbdog!

I think the first thing to cover is that the UTF-8 encoding of ° is the two bytes 0xC2 0xB0, while its Unicode code point is U+00B0, so the \xb0 escape code in a string also produces ° (so basically ord("°") == 0xb0 and "°".encode("utf-8") == b'\xc2\xb0').
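To make the code point versus encoding distinction concrete, here is a minimal sketch for desktop Python 3 (not run on a micro:bit; the variable names are only illustrative). It assumes a full Unicode `str` type, which, per this thread, is not how the V2 build behaved at the time.

```python
# Sketch for desktop Python 3; names are illustrative, not from the issue.
DEGREE = "\u00b0"  # the same character as "\xb0" inside a str literal

code_point = ord(DEGREE)             # integer Unicode code point
utf8_bytes = DEGREE.encode("utf-8")  # two-byte UTF-8 encoding

print(hex(code_point))      # 0xb0
print(utf8_bytes)           # b'\xc2\xb0'
print("\xb0" == DEGREE)     # True: the escape names the code point, not a byte
```

The escape `\xb0` in a `str` literal selects the character U+00B0, whereas `\xb0` in a `bytes` literal is a single raw byte, which on its own is not valid UTF-8.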

Using the script below I've checked the output in the micro:bit Python Editor (https://python.microbit.org) with the WebUSB serial terminal, and it seems to work more or less well:

import microbit

def is_v2():
    return hasattr(microbit, "microphone")

print("String with escape sequence:")
if is_v2():
    # Currently an issue in V2
    # https://github.com/bbcmicrobit/micropython/issues/775
    print("code point: V2 has an issue with ord()")
else:
    print("code point: {}".format(hex(ord("°"))))
print("21{}C".format(u"\xb0"))
print("22\xb0C")

print()

print("Bytes with UTF-8 encoding:")
if is_v2():
    # V1 doesn't have the encode/decode methods in the string/bytes class
    print("UTF-8 encoded value: {}".format("°".encode("utf-8")))
    print(b"23\xc2\xb0C".decode("utf-8"))
    print("24{}C".format(b"\xc2\xb0".decode("utf-8")))
    print("25\xc2\xb0C")  # This isn't correct, it shouldn't work, but it does on the micro:bit
else:
    print("V1 doesn't have encode/decode methods\n\n\n")

print()

print("Editor encoding value into source code")
print("26°C")

[Images: Output V1, Output V2]

And to confirm that for print("26°C") the editor stores the character UTF-8 encoded in the source code, we can see the 0xC2B0 value when reading the data back from flash:

image
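The same check can be sketched off-device in desktop Python 3. This assumes the editor saves the source as UTF-8 (which the flash read-back above suggests); `source_line`, `stored` and `hex_dump` are hypothetical names, and the bytes are produced locally rather than read from a real micro:bit.

```python
# Sketch: what the script line looks like as UTF-8 bytes.
# Assumption: the editor stores source files UTF-8 encoded.
source_line = 'print("26\u00b0C")'
stored = source_line.encode("utf-8")

# Hex dump of the stored bytes, one pair of hex digits per byte.
hex_dump = " ".join("{:02x}".format(b) for b in stored)
print(hex_dump)

# Byte offset of the two-byte degree sequence inside the line.
offset = stored.find(b"\xc2\xb0")
print(offset)
```

The single ° character takes two bytes, so the encoded line is one byte longer than the number of characters in it.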

On the other hand, a serial terminal on macOS that lets me set the text encoding shows this:

[Images: V1 terminal ASCII, V1 terminal UTF-8, V2 terminal ASCII, V2 terminal UTF-8]

It looks like maybe V1 is encoding the output in UTF-8 and V2 isn't?
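That hypothesis can be illustrated with a small desktop Python 3 sketch. It does not capture real serial traffic: `raw_bytes` stands in for a device that sends one byte per character, `utf8_bytes` for a device that sends UTF-8, and Latin-1 is used as an assumed stand-in for a single-byte terminal setting.

```python
# Sketch of the hypothesis only; the byte strings are constructed locally.
text = "21\u00b0C"

# One byte per character (possible here because every code point is < 256).
raw_bytes = bytes(ord(ch) for ch in text)
# Proper UTF-8 output: the degree sign becomes 0xC2 0xB0.
utf8_bytes = text.encode("utf-8")

def show(data, terminal_encoding):
    """Decode bytes the way a terminal with that encoding setting would."""
    return data.decode(terminal_encoding, errors="replace")

for label, data in (("raw", raw_bytes), ("utf-8", utf8_bytes)):
    for encoding in ("utf-8", "latin-1"):
        print(label, encoding, repr(show(data, encoding)))
```

A lone 0xB0 byte is invalid UTF-8, so a UTF-8 terminal renders it as the replacement character, which is consistent with the 21� output in the original report.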


Edit: Got a bit confused with the encodings, I've updated the example code to fix that and better show the differences between V1 and V2.

dpgeorge commented 1 year ago

This should be fixed now that Unicode is enabled on V2.

microbit-carlos commented 1 year ago

Thanks Damien!

@rhubarbdog I'll close this as resolved in the V2 MicroPython codebase, but it will take a bit of time before we update MicroPython in the Python Editor.

If you have been using MicroPython with local tools like microFS you can download and test the latest CI build: https://github.com/microbit-foundation/micropython-microbit-v2/actions/runs/3312585708