Proposal: Use padding free Base64 encoding/decoding

baranyildirim commented 7 months ago

Hey folks,

I think I've found an issue with how base64 encoding/decoding works with Biscuit tokens. Currently, the encode/decode config used for Base64 operations is invoked as follows:

base64::encode_config(slice, base64::URL_SAFE) in biscuit-auth/src/token
base64::decode_config(slice, base64::URL_SAFE) in biscuit-auth/src/token

The problem here is that the default encode/decode configs use padding. The padding character for base64 is the = character, which is not url-safe.

This means that any token that requires padding cannot be correctly used in contexts with URLs.

For example, take the token on the biscuitsec front page (== at the end):

En0KEwoEMTIzNBgDIgkKBwgKEgMYgAgSJAgAEiAs2CFWr5WyHHWEiMhTXxVNw4gP7PlADPaGfr_AQk9WohpA6LZTjFfFhcFQrMsp2O7bOI9BOzP-jIE5PGhha62HDfX4t5FLQivX5rUhH5iTv2c-rd0kDSazrww4cD1UCeytDSIiCiCfMgpVPOuqq371l1wHVhCXoIscKW-wrwiKN80vR_Rfzg==

When used with URL encoding, this token becomes:

En0KEwoEMTIzNBgDIgkKBwgKEgMYgAgSJAgAEiAs2CFWr5WyHHWEiMhTXxVNw4gP7PlADPaGfr_AQk9WohpA6LZTjFfFhcFQrMsp2O7bOI9BOzP-jIE5PGhha62HDfX4t5FLQivX5rUhH5iTv2c-rd0kDSazrww4cD1UCeytDSIiCiCfMgpVPOuqq371l1wHVhCXoIscKW-wrwiKN80vR_Rfzg%3D%3D

Now, this token can no longer be decoded using the biscuit-rust library. For example:

use biscuit_auth::Biscuit;
biscuit_auth::UnverifiedBiscuit::from_base64("En0KEwoEMTIzNBgDIgkKBwgKEgMYgAgSJAgAEiAs2CFWr5WyHHWEiMhTXxVNw4gP7PlADPaGfr_AQk9WohpA6LZTjFfFhcFQrMsp2O7bOI9BOzP-jIE5PGhha62HDfX4t5FLQivX5rUhH5iTv2c-rd0kDSazrww4cD1UCeytDSIiCiCfMgpVPOuqq371l1wHVhCXoIscKW-wrwiKN80vR_Rfzg%3D%3D").unwrap();

called `Result::unwrap()` on an `Err` value: Base64(InvalidByte(218, 37))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

If we used the non-URL encoded version, the token is correctly parsed.

Any token that has been padded and is used with a URL (e.g. from a web-browser) cannot be parsed by the library.

To resolve this, we could:

Change the existing base64 encoding function to produce un-padded output and the base64 decoding function to accept both padded and un-padded inputs. This can be accomplished by switching to
- base64::encode_config(slice, base64::URL_SAFE_NO_PAD) in biscuit-auth/src/token
- base64::decode_config(slice, base64::URL_SAFE_NO_PAD) in biscuit-auth/src/token
Create a new set of functions Biscuit::to_base64_urlsafe() and Biscuit::from_base64_urlsafe().
Add a flag to Biscuit::to_base64 and Biscuit::from_base64 to control padding behavior.

Changing the default behavior might be risky, so 2 or 3 might be easier to implement.

Personally, I think that padding should not be used at all. See below from the creator of the rust-base64 crate:

The = pad bytes, on the other hand, are entirely a self-own by the Base64 standard. They do not affect decoding other than to provide an opportunity to say "that padding is incorrect". Exabytes of storage and transfer have no doubt been wasted on pointless = bytes https://github.com/marshallpierce/rust-base64

If we don't mind getting rid of the padding completely, option 1 would be the simplest solution. Otherwise, I have a PR for option 3. Option 2 is also viable if we don't want to change the interface of the existing functions.

divarvel commented 7 months ago

Thanks for the report. The spec indeed mentions the preferred encoding to be url-safe base64, which refers to base64 with an URL-safe alphabet, according to the relevant RFC. This base64-variant still uses = as padding char, which is indeed a self-own from the base64 spec, no doubt about it.

So, while not being a bug, I think the current situation could be improved. My suggestion:

make serialization not use padding in biscuit rust
continue accepting padding when deserializing in biscuit rust
modify the spec accordingly (libs should not use padding when serializing, but must accept both padded and unpadded values)

Optionally we could add a padded variant for encoding but i don't think it's warranted.

wdyt?

@Geal

baranyildirim commented 7 months ago

I changed the title to reflect the nature of the issue. For my use-cases, I've solved this issue by discarding all the = characters doing the issuance process. I think anyone who uses biscuits to authenticate browser based clients will have to do the same. This usability improvement would be nice.

modify the spec accordingly (libs should not use padding when serializing, but must accept both padded and unpadded values)

Yeah, I think the spec change would be great. My only concern would be maintaining backwards compatibility for applications that are already deployed - I guess accepting both padded and un-padded values should take care of that?

Geal commented 6 months ago

accepting padded and non padded in deserialization, but only producing non padded base64 should be fine. The option would only add unneeded complexity and require a new major version since it changes the public API, so let's not add an option for that

baranyildirim commented 6 months ago

accepting padded and non padded in deserialization, but only producing non padded base64 should be fine.

This is only fine if you are assuming people are only using Rust right? For example, if we are using a Rust token issuer that starts to not pad base64, will the clients/consumers from other languages be able to parse it? I'm fine with making this change, but I just want to make sure we understand that it can break backwards compatibility.

biscuit-auth / biscuit-rust

Proposal: Use padding free Base64 encoding/decoding #211