paulmillr / scure-base

Secure, audited & 0-deps implementation of bech32, base64, base32, base16 & base58
https://paulmillr.com/noble/#scure
MIT License

Optional Padding #4

Closed cinolt closed 1 year ago

cinolt commented 2 years ago

RFC4648 states:

In some circumstances, the use of padding ("=") in base-encoded data is not required or used.

An example of a specification that prohibits padding characters is RFC8555:

... the base64url alphabet and MUST NOT include base64 padding characters ("=").

Therefore, it would be nice if the encoding/decoding functions provided an option to omit padding characters.
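To make the request concrete, here is a minimal sketch of what omitting padding amounts to at the string level. The helper names `stripPadding` and `restorePadding` are hypothetical and are not part of @scure/base; they only illustrate that padding can be derived from the length of the encoded text.

```typescript
// Hypothetical helpers for illustration only; not part of @scure/base.

// Remove any trailing "=" characters from padded base64/base64url text.
function stripPadding(encoded: string): string {
  return encoded.replace(/=+$/, '');
}

// Re-add the "=" characters an RFC 4648 padded decoder expects.
// Encoded length mod 4 can only be 0, 2, or 3 for valid input.
function restorePadding(unpadded: string): string {
  const remainder = unpadded.length % 4;
  if (remainder === 1) throw new Error('invalid base64 length');
  if (remainder === 2) return unpadded + '==';
  if (remainder === 3) return unpadded + '=';
  return unpadded;
}
```

A caller with only a padded encoder could wrap it with `stripPadding` on output and `restorePadding` on input, at the cost of an extra string pass.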

paulmillr commented 1 year ago

@ukstv suggested using multibase table for naming.

That would likely break backwards compatibility and will require 2.0.

The question is whether it's optimal to have non-padded defaults (base64url). One argument is that multibase uses it but multibase is not really an authority (rfcs are)

ukstv commented 1 year ago

multibase is not really an authority (rfcs are)

I would like to provide two arguments in favour of multibase names.

  1. The RFC 4648 text tells us: "This encoding may be referred to as "base64url"." I would say this is an informal name. Its predecessor, RFC 3548, describes itself in its introduction as a clarification of the previously widely used "base64" encoding scheme. If we go with the spirit of the RFCs, I'd consider clarity and unambiguity to be the end goal. The multibase table achieves that goal just perfectly, even if it is not blessed by a traditional standards body yet.
  2. multibase is used quite extensively in the DID realm. When you see base64url, it is most probably a reference to multibase base64url, not the RFC version. It would be really nice not to surprise people working in the field with different names.

There is an alternative approach, though, to changing names in this library (I was about to post it in the did-jwt PR, but got unicorned by GitHub). One could write, say, a multibases library (a successor to the now-deprecated multibase) built on top of @scure/base. It would export all the right names, paddings and casings.

AlexErrant commented 1 year ago

The question is whether it's optimal to have non-padded defaults

One can optimize for different objectives, so this is ambiguous. We can optimize for community interoperability or backward compatibility... or both (by releasing a patch version and a major version bump). Or a new library as suggested. Or we can optimize for minimal effort and do none of the above :)

As a simple user, I'd be pleased with any movement.

paulmillr commented 1 year ago

I agree that the issue will need to be solved at some point.

The next step for me or anyone else who's willing to dive into this would be to analyze what's being done in competing base64url (etc.) libraries and what their default/non-default behavior is.

AlexErrant commented 1 year ago
| Package | Downloads | Base64Url default has padding? | Option for padding? | Option for no padding? | Comment |
| --- | --- | --- | --- | --- | --- |
| https://www.npmjs.com/package/crypto-js | 5,516,870 | | | | |
| https://www.npmjs.com/package/pvtsutils | 2,111,504 | | | | |
| https://www.npmjs.com/package/base64url | 1,955,103 | | | | |
| https://www.npmjs.com/package/base64-url | 499,979 | | | | |
| https://www.npmjs.com/package/rfc4648 | 482,945 | | | | The stringify function's options are the same among all encodings, so padding is on by default |
| https://www.npmjs.com/package/b64u-lite | 69,732 | | | | |
| https://www.npmjs.com/package/@github/webauthn-json | 29,788 | | | | Not really an encoder/decoder but whatever |
| https://www.npmjs.com/package/js-encoding-utils | 15,906 | | | | |
| https://www.npmjs.com/package/base64url-universal | 7,343 | | | | |
| https://www.npmjs.com/package/@juanelas/base64 | 7,294 | | | | Is the same function as the base64 encoder |
| https://www.npmjs.com/package/b2a | 6,933 | | | | |
| https://www.npmjs.com/package/safe-base64 | 6,637 | | | | |
| https://www.npmjs.com/package/@hexagon/base64 | 4,659 | | | | |
| https://www.npmjs.com/package/@47ng/codec | 4,060 | | | | |
| https://www.npmjs.com/package/universal-base64url | 2,917 | | | | |
| https://www.npmjs.com/package/compact-base64 | 2,269 | | | | |

Methodology: Went here, and either read the docs, read the source code, or simply ran the code to figure out what it did. I ignored libs with fewer than 1,000 downloads/week, and only looked at the first page.

paulmillr commented 1 year ago

Thanks Alex, that's very helpful!

paulmillr commented 1 year ago

So I guess we should make 1.2 with non-padded versions of the functions to mitigate the issue for existing users.

Any ideas on the naming? base64urlnopad or something? @ukstv @cinolt which formats would you need? Only base64url for now?

The multiformats table seems nice and comprehensive, and the fact most libraries are using non-padded defaults also matters, so for some future 2.0 we should probably switch to the naming from the table.

AlexErrant commented 1 year ago

For 1.2, we could avoid the naming problem by following rfc4648.js's style and adding an optional opts param. And FWIW I'm only interested in base64url.

paulmillr commented 1 year ago

Opts will be a problem, since our other encoders are not really using them.
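The two API shapes under discussion can be sketched side by side. Everything below is an assumption made for illustration: the encoder is a hand-written stand-in rather than the @scure/base implementation, and the names `encodeBase64Url`, `encodeWithOpts`, `base64urlSketch` and `base64urlnopadSketch` are hypothetical.

```typescript
const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Hypothetical stand-in encoder; not the @scure/base implementation.
function encodeBase64Url(bytes: Uint8Array, pad: boolean): string {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // bytes left in this group: 1, 2, or >= 3
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0;
    const b2 = remaining > 2 ? bytes[i + 2] : 0;
    out += ALPHABET.charAt(b0 >> 2);
    out += ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    if (remaining > 1) out += ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6));
    else if (pad) out += '=';
    if (remaining > 2) out += ALPHABET.charAt(b2 & 0x3f);
    else if (pad) out += '=';
  }
  return out;
}

// Shape A: one function with an optional opts parameter (padding on by default).
function encodeWithOpts(bytes: Uint8Array, opts: { pad?: boolean } = {}): string {
  return encodeBase64Url(bytes, opts.pad !== false);
}

// Shape B: one named coder per variant, with no options to thread through.
const base64urlSketch = { encode: (bytes: Uint8Array) => encodeBase64Url(bytes, true) };
const base64urlnopadSketch = { encode: (bytes: Uint8Array) => encodeBase64Url(bytes, false) };
```

Shape A keeps the export list short but adds a parameter that other coders would also have to accept for consistency; shape B adds an export per variant and leaves every existing signature unchanged.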

paulmillr commented 1 year ago

Done, added as base64urlnopad
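For readers wondering what a separate unpadded coder implies on the decoding side, here is a minimal sketch of a strict decoder. It assumes that an unpadded variant should reject "=" outright, in line with the RFC 8555 wording quoted at the top of the thread. The function name `decodeBase64UrlNoPad` is hypothetical, and this is not the code that was added to the library.

```typescript
const URL_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Hypothetical strict decoder for unpadded base64url; not the @scure/base implementation.
function decodeBase64UrlNoPad(text: string): Uint8Array {
  // A single leftover character can never encode a whole byte.
  if (text.length % 4 === 1) throw new Error('invalid length for unpadded base64url');
  const out: number[] = [];
  let buffer = 0; // leftover bits not yet emitted as a byte
  let bits = 0;   // number of valid bits currently held in buffer
  for (let i = 0; i < text.length; i++) {
    const value = URL_ALPHABET.indexOf(text.charAt(i));
    // "=" is not in the alphabet, so padded input is rejected here.
    if (value === -1) throw new Error('invalid character: ' + text.charAt(i));
    buffer = (buffer << 6) | value;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out.push((buffer >> bits) & 0xff);
      buffer &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  // Any leftover bits must be zero for a canonical encoding.
  if (buffer !== 0) throw new Error('non-zero trailing bits');
  return new Uint8Array(out);
}
```

Rejecting padded input rather than silently accepting it keeps the unpadded coder a strict counterpart to the padded one, so a given byte string has exactly one accepted encoding per coder.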

mistermoe commented 1 year ago

@paulmillr can you publish an npm release so that base64urlnopad can be used? thanks so much

paulmillr commented 1 year ago

will publish the new version soon.

paulmillr commented 1 year ago

1.1.2 is out folks