jedisct1 / libsodium.js

libsodium compiled to WebAssembly and pure JavaScript, with convenient wrappers.

Missing sodium_base64_decoded_len #274

Closed lhunath closed 3 years ago

lhunath commented 3 years ago

There is an asymmetry in the API for Base64 encoding/decoding.

The API provides a way to find the safe upper bound for encoding binary data into Base64 through sodium_bin2base64, by using sodium_base64_ENCODED_LEN or sodium_base64_encoded_len. This is a vital tool for determining the safe minimum value for const size_t b64_maxlen.

However, the API fails to provide a similar function to find the safe upper bound for decoding a Base64 string into binary data through sodium_base642bin. Consequently, there is no way to determine the safe minimum value for const size_t bin_maxlen.

This forces the user to fall back on the length of the Base64 string itself as the minimum buffer size, which over-allocates: the decoded output can never exceed three quarters of the input length. Furthermore, this asymmetry in the API is confusing, and the documentation is silent on the topic, offering no help.
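The over-allocation can be sketched with standard Base64 length math (the helper names below are illustrative, not part of libsodium): every 3 input bytes become 4 encoded characters, so the decoded result of b64_len characters can never exceed (b64_len / 4) * 3 + 2 bytes, yet without a decoded-length helper the caller must size bin_maxlen by b64_len itself.

```c
#include <stddef.h>

/* Encoded length in characters for the padded (original) variant,
 * excluding the trailing NUL. */
static size_t b64_chars_padded(size_t bin_len)
{
    return ((bin_len + 2) / 3) * 4;
}

/* Tight upper bound on the decoded size of b64_len Base64 characters:
 * full 4-character groups decode to 3 bytes, and a trailing partial
 * group of 2 or 3 characters decodes to at most 2 bytes. */
static size_t b64_decoded_bound(size_t b64_len)
{
    return (b64_len / 4) * 3 + ((b64_len % 4 != 0) ? 2 : 0);
}
```

For a 32-byte key, b64_chars_padded(32) is 44, so sizing the output buffer by the input length reserves 44 bytes, while b64_decoded_bound(44) gives 33, within one byte of the true decoded size of 32.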

lhunath commented 3 years ago

The implementation could either perform basic math on the length of the Base64 string, or inspect the string itself to tighten the bound and ensure a degree of validity.

https://gitlab.com/spectre.app/api/-/blob/29fa4a58dcbd1cd6e94939166c39695fef95c4d1/c/base64.c#L81
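A minimal sketch of the string-inspecting option could look like the following. Note that this function does NOT exist in libsodium; the name and signature are hypothetical, chosen to mirror sodium_base64_encoded_len:

```c
#include <stddef.h>

/* Hypothetical counterpart to sodium_base64_encoded_len(): a safe
 * bin_maxlen for sodium_base642bin(), given the Base64 input.  Beyond
 * pure length math, it inspects trailing '=' padding to make the
 * result exact for the padded variant. */
static size_t sodium_base64_decoded_len(const char *b64, size_t b64_len)
{
    size_t pad = 0;

    /* Count trailing '=' characters (at most 2 in valid Base64). */
    while (pad < 2 && b64_len > pad && b64[b64_len - 1 - pad] == '=') {
        pad++;
    }
    if (b64_len % 4 == 0) {
        /* Padded variant: the decoded size is exact. */
        return (b64_len / 4) * 3 - pad;
    }
    /* Unpadded variant: upper bound from the trailing partial group. */
    return (b64_len / 4) * 3 + ((b64_len % 4 != 0) ? 2 : 0);
}
```

For example, the padded string "aGVsbG8=" (8 characters, one '=') yields exactly 5, while the unpadded "aGVsbG8" (7 characters) yields the bound 5 as well.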

jedisct1 commented 3 years ago

This has been added to the documentation. Thank you!

lhunath commented 3 years ago

Hi @jedisct1, thanks for your quick response. Where can I review the change to the docs? For posterity, could you document the reasoning behind opting for a documentation change over making the API symmetric?

lhunath commented 3 years ago

Hmm, looks like I accidentally posted this to the wrong repo. May need to be moved to https://github.com/jedisct1/libsodium

lhunath commented 3 years ago

For future reference: https://github.com/jedisct1/libsodium-doc/commit/1fec458534167f0d6cb8f9a8227b729f448772b2

jedisct1 commented 3 years ago

The purpose of this function is to decode a secret key.

The key size is generally constant for a given algorithm, so it is known prior to decryption.

The logic is thus usually the opposite of the one you describe: the required encoded input size can be computed from the expected key size. If we don't have enough bytes yet, we need to read more; and if the length can never match, it's not even worth attempting to decode. I never had any use case for doing it the other way round.

Doing it the other way round and allowing arbitrarily long input is asking for trouble. Beyond requiring dynamic memory allocation, if the application uses the decoded key without checking that its length matches the expected one, we instantly get a vulnerability.
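The reverse logic described above can be sketched as follows. The helper names are illustrative and the 32-byte size is an assumption (it matches sizes such as crypto_secretbox_KEYBYTES); no actual libsodium call is made here:

```c
#include <stddef.h>

#define KEY_BYTES 32U /* assumed fixed key size for the algorithm */

/* Characters produced by the padded (original) Base64 variant for a
 * given binary size, excluding the trailing NUL that libsodium's
 * sodium_base64_ENCODED_LEN macro accounts for. */
static size_t expected_b64_chars(size_t key_bytes)
{
    return ((key_bytes + 2) / 3) * 4;
}

/* Cheap pre-check: reject input whose length can never decode to a
 * key of the expected size, before calling sodium_base642bin(). */
static int looks_like_encoded_key(size_t b64_len)
{
    return b64_len == expected_b64_chars(KEY_BYTES);
}
```

Under this pattern a 32-byte key must arrive as exactly 44 padded Base64 characters, so the decode buffer is simply a fixed 32-byte array and any other input length is rejected up front, which is the point being made: no decoded-length computation is ever needed.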

lhunath commented 3 years ago

That sounds like an issue external to the contract of this function, which is limited to decoding an arbitrary buffer of Base64 characters; but I understand if you prefer to tie it implicitly to key-specific usage. From my view, exposing this function in your public interface invites clients to use it for broader applications. That said, I am satisfied with simply documenting the contract of the bin_maxlen parameter.