Open beverloo opened 9 years ago
Minor correction:
methods for converting between a unicode string to a base64-encoded representation of it
btoa and atob act on “bit strings”. To convert a Unicode string to base64, encode it first using an encoding of your choice, e.g. UTF-8:
const unicodeString = 'foo𝌆bar';
const textEncoder = new TextEncoder(); // always UTF-8; the constructor takes no encoding label
const bytes = textEncoder.encode(unicodeString);
const bitString = String.fromCodePoint(...bytes);
// Well, that was awkward. But now we can finally base64-encode!
const encoded = btoa(bitString);
// → 'Zm9v8J2MhmJhcg=='
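Going the other way reverses each step: atob yields the string of bytes, its code units are copied into a Uint8Array, and the bytes are then decoded as UTF-8. A minimal sketch, assuming atob and TextDecoder are available as globals; the variable names are only illustrative:

```javascript
// Reverse of the steps above: base64 -> byte-valued string -> bytes -> Unicode string.
const encodedInput = 'Zm9v8J2MhmJhcg==';

// atob returns a string in which each code unit holds one byte (0 to 255).
const binaryString = atob(encodedInput);

// Copy each code unit into a byte array.
const decodedBytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));

// Interpret the bytes as UTF-8 to recover the original text.
const decodedString = new TextDecoder('utf-8').decode(decodedBytes);
// decodedString is 'foo𝌆bar'
```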
If btoa/atob were to be designed today, they’d probably accept/produce Uint8Arrays of bytes (which is what TextEncoder outputs). The new methods should probably do this, unless consistency with btoa/atob is more important.
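To make that concrete, here is a rough userland sketch of what a Uint8Array-based pair could look like. The names bytesToBase64 and base64ToBytes are hypothetical and not part of any specification; they only wrap the existing btoa/atob:

```javascript
// Hypothetical helpers (not specified anywhere) that wrap btoa/atob so that
// callers deal in Uint8Array instead of byte-valued strings.
function bytesToBase64(bytes) {
  let binary = '';
  // Build the string in chunks; spreading a very large array into
  // String.fromCharCode in one call can exceed the engine's argument limit.
  const chunkSize = 0x8000;
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Usage: TextEncoder output goes straight in, with no intermediate string
// visible to the caller.
const utf8Bytes = new TextEncoder().encode('foo𝌆bar');
const base64Text = bytesToBase64(utf8Bytes);
const roundTripped = new TextDecoder().decode(base64ToBytes(base64Text));
```

With this shape the awkward String.fromCodePoint step disappears from calling code, at the cost of no longer matching the string-in/string-out signature of btoa/atob.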
This seems like a proposal that's more appropriate for the Encoding Standard, in any case? Unless I am misunderstanding.
@domenic I’m not sure. There are two layers of encoding here:
@mathiasbynens yeah, that's true. I guess I was reacting to how the Encoding Standard properly separates out byte inputs/outputs from string inputs/outputs. But I see that it doesn't have any methods that are bytes -> bytes, like this would be.
A base64 encoder is similar to a text decoder, seems like. Should we just introduce a Base64Encoder/Base64Decoder pair that has a similar design to the classes from the Encoding Standard?
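For illustration only, such a pair might look roughly like the sketch below. The class names come from the suggestion above; the method names encode/decode are an assumption borrowed from TextEncoder/TextDecoder, and none of this is specified anywhere:

```javascript
// Hypothetical classes; nothing like this is specified. The method names
// mirror TextEncoder/TextDecoder and are an assumption.
class Base64Encoder {
  // bytes (Uint8Array) -> base64 string, the same direction TextDecoder works in.
  encode(bytes) {
    const binary = Array.from(bytes, (b) => String.fromCharCode(b)).join('');
    return btoa(binary);
  }
}

class Base64Decoder {
  // base64 string -> bytes (Uint8Array), the same direction TextEncoder works in.
  decode(base64) {
    return Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  }
}

// Usage: chain with the Encoding Standard classes to go text -> bytes -> base64 and back.
const b64Encoder = new Base64Encoder();
const b64Decoder = new Base64Decoder();
const helloBase64 = b64Encoder.encode(new TextEncoder().encode('hello'));
const helloText = new TextDecoder().decode(b64Decoder.decode(helloBase64));
```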
The HTML specification has two methods for converting between a unicode string to a base64-encoded representation of it, and vice versa.
https://html.spec.whatwg.org/multipage/webappapis.html#dom-windowbase64-btoa
The URL-safe base64 encoding (base64url), also defined in RFC 4648, has been adopted by a few specifications recently. Examples include the Push API (PushSubscription serialization) and various parameters of JWK objects (EME, Web Crypto).
https://tools.ietf.org/html/rfc4648#section-5
While the contents aren't immediately intended for consumption by the web app, apps that do want to read them now need their own conversion methods. (As trivial as that may be.)
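For reference, the conversion apps end up writing today looks roughly like this. The helper names are hypothetical; the alphabet mapping ('+' to '-', '/' to '_') follows RFC 4648 section 5, and dropping the '=' padding is an assumption that matches common base64url usage:

```javascript
// Hypothetical helper names; only the character mapping comes from RFC 4648, section 5.
function base64ToBase64Url(base64) {
  return base64
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, ''); // assumption: padding is omitted in the URL-safe form
}

function base64UrlToBase64(base64url) {
  const base64 = base64url.replace(/-/g, '+').replace(/_/g, '/');
  // Restore padding so the length is a multiple of four, which some decoders require.
  const padLength = (4 - (base64.length % 4)) % 4;
  return base64 + '='.repeat(padLength);
}

// Usage: the bytes 0xFB 0xFF encode to characters that differ between the two alphabets.
const standardForm = btoa('\xfb\xff');
const urlSafeForm = base64ToBase64Url(standardForm);
const restoredForm = base64UrlToBase64(urlSafeForm);
```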
The naming of btoa/atob doesn't make them very extensible. We could either add an argument (optional boolean urlsafe = false), or introduce analogous methods for the different encoding, e.g. urlbtoa/urlatob. I prefer the former.
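A userland sketch of how the two shapes would compare. The wrapper names btoaWithOption/atobWithOption are hypothetical stand-ins, since the actual proposal would extend btoa/atob themselves, and omitting padding in the URL-safe mode is an assumption:

```javascript
// First option: an optional boolean selects the URL-safe alphabet.
// btoaWithOption/atobWithOption are hypothetical stand-ins for an extended btoa/atob.
function btoaWithOption(data, urlsafe = false) {
  const base64 = btoa(data);
  if (!urlsafe) return base64;
  // Assumption: the URL-safe output drops '=' padding.
  return base64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

function atobWithOption(data, urlsafe = false) {
  if (!urlsafe) return atob(data);
  const base64 = data.replace(/-/g, '+').replace(/_/g, '/');
  // Re-add padding before handing the string to atob.
  return atob(base64 + '='.repeat((4 - (base64.length % 4)) % 4));
}

// Second option: separate names, here simply fixing urlsafe = true.
const urlbtoa = (data) => btoaWithOption(data, true);
const urlatob = (data) => atobWithOption(data, true);
```

Existing callers are unaffected by the first form because the argument defaults to false, which is part of why it extends more gracefully than a second pair of names.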
I'd be happy to generate a pull request if you think adding these makes sense.