microbit-foundation / microbit-fs

A TypeScript library to manipulate the micro:bit MicroPython filesystem.
https://microbit-foundation.github.io/microbit-fs/
MIT License
12 stars 1 forks source link

Support characters >1 byte #5

Closed microbit-rosslowe closed 5 years ago

microbit-rosslowe commented 5 years ago

Create a new test for issue #1 , testing if strToBytes() supports characters longer than 1 byte.

microbit-carlos commented 5 years ago

Great, thanks Ross! I normally like follow the AAA pattern (Arrange Act Assert), which you are pretty much following already here, where the only difference is having a blank line between sections. https://jamescooke.info/arrange-act-assert-pattern-for-python-developers.html

Will you continue working on the fix in this branch/PR?

microbit-rosslowe commented 5 years ago

Latest commit adds a new test for bytesToStr(), tests for 3 byte characters, and adds a fix for #1

microbit-carlos commented 5 years ago

That's a nice polyfill! It makes me a bit uneasy that it is marked as deprecated, but if it works well I think it's fine. https://www.npmjs.com/package/text-encoding

Did you have a look or try other ones as well?

microbit-rosslowe commented 5 years ago

@microbit-carlos I looked at using https://github.com/mathiasbynens/utf8.js but the way it worked with strings was a little confusing.

microbit-carlos commented 5 years ago

Yeah, I agree, the \x notation from the utf-8 library is not easy to work with, very glad to see that TextEncoder takes and returns Uint8Arrays. I was wondering if there were other TextEncoder polyfills that were still maintained, just to check if this one is the best option.

microbit-rosslowe commented 5 years ago

I've had a look at using https://github.com/solderjs/TextEncoderLite but it looks like it isn't being actively maintained, and didn't seem to work properly (it was also last published 1 year ago whereas the one used here was 5 months ago). TextEncoding has 1.2 million weekly downloads, so it seems pretty popular. I also looked at fast-text-encoding which had similar issues and was lasted updated September 2017.

microbit-carlos commented 5 years ago

@microbit-rosslowe Could we record here the licenses for the new dependencies (the polyfill and the types).

microbit-rosslowe commented 5 years ago

New dependency: https://www.npmjs.com/package/text-encoding We can choose from unlicensed or Apache: https://github.com/inexorabletash/text-encoding/blob/master/LICENSE.md

microbit-carlos commented 5 years ago

Great, and the @types/text-encoding package is MIT licensed: https://www.npmjs.com/package/@types/text-encoding https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/LICENSE.

It all looks good, if you are happy with the PR you can change the draft state and I can merge.

microbit-carlos commented 5 years ago

@microbit-rosslowe This dependency increased the built file from 80Kbs to 700Kbs.

Most of it comes from this file and looks like rollup tree-shaking cannot really remove any of it (since it only uses a small part of the object during runtime, but rollup has no way of knowing that).

Could we find a lighter polyfill? Ideally that only does UTF-8? Or that can easily be trimmed by rollup.

microbit-rosslowe commented 5 years ago

We can use text-encoding-lite, which reduces build size to around 88Kbs. Version 1.0.2 hasn't been published to npm but works nicely. See #6