XMSS/EdDSA: Streamed Verification of Large Messages

The public Botan::PK_Verifier API currently requires that the signature buffer be provided only after the entire signed message has been consumed. For XMSS and some other algorithms, however, the signature contains a parameter that is required for hashing the signed message.

Currently, this forces XMSS and EdDSA (in "Pure" mode) to buffer the entire signed message as it comes in via ::update(). For large messages and/or constrained target platforms, this might be a show-stopper.
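To make the problem concrete, here is a minimal sketch of why the buffering happens. For Ed25519, for example, the verifier hashes R || A || M, where R is the first half of the signature, so no hashing can start before the signature is known. The sketch below is toy code, not Botan's implementation: ToyHash (an FNV-1a checksum, not cryptographic) and BufferingVerifier are hypothetical names, and the hash input is simplified to R || M.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Toy stand-in for a real hash function (64-bit FNV-1a). NOT cryptographic.
class ToyHash {
public:
    void update(const std::string& data) {
        for (unsigned char c : data) {
            m_state ^= c;
            m_state *= 1099511628211ULL;  // FNV-1a 64-bit prime
        }
    }
    std::uint64_t digest() const { return m_state; }

private:
    std::uint64_t m_state = 14695981039346656037ULL;  // FNV-1a offset basis
};

// Hypothetical verifier for a scheme whose message hash is H(R || M),
// where R is only known once the signature is available.
class BufferingVerifier {
public:
    // Without R, the hash cannot be started, so every chunk must be kept.
    void update(const std::string& chunk) { m_buffer += chunk; }

    // Memory held grows linearly with the message size.
    std::size_t buffered_bytes() const { return m_buffer.size(); }

    // Only now, with R taken from the signature, can hashing begin.
    std::uint64_t digest_with(const std::string& r) const {
        ToyHash h;
        h.update(r);
        h.update(m_buffer);
        return h.digest();
    }

private:
    std::string m_buffer;
};
```

The point of the sketch is only that buffered_bytes() equals the full message length at the time the signature arrives.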

It seems that we cannot fix this without changing the public API of PK_Verifier. With Botan 3.0 on the horizon, now might be a good moment to do so, @randombit?

Arguably, providing the signature to PK_Verifier up-front should be optional, though. For conventional algorithms (e.g. RSA, ECDSA), it is beneficial to be able to pass the signature at the very end.

I suggest adding PK_Verifier::set_signature() that can be called by an application whenever convenient. Then, another new method like PK_Verifier::check() can perform the validation once the message is slurped using one or more calls to ::update(). The convenience methods ::verify_message() and ::check_signature() could orchestrate the low-level methods accordingly.

This would improve the support for EdDSA and XMSS at the expense of a more complicated PK_Verifier API and additional state management. Also, library users would need to be educated that calling ::set_signature() as early as possible has performance implications for certain algorithms. Hopefully, though, most users don't use the streaming interface nowadays anyway, but rely on ::verify_message(), which would just continue to "do the right thing"™.

At a low level, this would force all specific PK_Ops::Verification implementations to be able to deal with the signature coming in at any time. Conventional algorithms need to store it for later validation; EdDSA and XMSS need to be prepared to buffer ::update() calls until the signature is there and then switch to a streaming mode. Both of which might be abstracted away in intermediary PK_Ops::Verification sub-classes, though.
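A rough sketch of the "buffer until the signature is there, then switch to streaming" behaviour could look like the following. This is an illustration under simplifying assumptions, not a proposal for the actual PK_Ops::Verification interface: the class name LateBindingHashOp and its methods are hypothetical, the hash is a toy FNV-1a checksum, and the hash input is simplified to R || M with R passed in directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical operation that hashes R || M, where R may arrive at any time.
class LateBindingHashOp {
public:
    // Called whenever the application has the signature (here: just R).
    void set_signature_prefix(const std::string& r) {
        if (m_have_r) {
            throw std::logic_error("signature already set");
        }
        m_have_r = true;
        absorb(r);          // R goes into the hash first
        absorb(m_pending);  // then replay everything buffered so far
        m_pending.clear();
        m_pending.shrink_to_fit();  // request that the buffer's memory be released
    }

    void update(const std::string& chunk) {
        if (m_have_r) {
            absorb(chunk);       // streaming mode: constant memory
        } else {
            m_pending += chunk;  // buffering mode: memory grows
        }
    }

    std::size_t pending_bytes() const { return m_pending.size(); }

    std::uint64_t digest() const {
        if (!m_have_r) {
            throw std::logic_error("signature not set");
        }
        return m_state;
    }

private:
    // Toy 64-bit FNV-1a update. NOT cryptographic.
    void absorb(const std::string& data) {
        for (unsigned char c : data) {
            m_state ^= c;
            m_state *= 1099511628211ULL;
        }
    }

    bool m_have_r = false;
    std::string m_pending;
    std::uint64_t m_state = 14695981039346656037ULL;
};
```

Both call orders yield the same digest; the only difference is how many bytes are held in memory before the signature arrives.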

But is there a better alternative?

Arguably, providing the signature to PK_Verifier up-front should be optional, though.

It really has to stay optional: there are probably all kinds of applications where the signature isn't available (at least not logically, within the program control flow) until later on, and changing this would be very disruptive.

EdDSA and XMSS need to be prepared to buffer ::update() calls until the signature is there and then switch to a streaming mode. Both of which might be abstracted away in intermediary PK_Ops::Verification sub-classes, though.

This seems really complicated tbh. We'd also then have to account for the user setting the signature twice (possibly with mismatching values), and the performance implications of providing it late would not be so clear to users.

I think we should give the user the opportunity to provide the signature, but only at one point: prior to providing any message bytes.

void start(); // basically a no-op
void start(const uint8_t signature[], size_t signature_len); // this is the key API

// various update calls as they exist today

bool check_signature(const uint8_t sig[], size_t length); // current API
bool check_signature(); // takes implicit signature, or else throws? Returns false?

The start call would have to remain optional because otherwise it breaks application flows.
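As a sketch of how these call-order rules might play out, here is a toy verifier with the same method shapes. Everything in it is an assumption for illustration: ToyVerifier and toy_sign are hypothetical, the "signature" is a one-character nonce followed by a non-cryptographic FNV-1a checksum of nonce || message, and the open question of what check_signature() does without a signature is answered here by throwing.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

constexpr std::uint64_t kFnvInit = 14695981039346656037ULL;

// Toy 64-bit FNV-1a update. NOT cryptographic.
inline std::uint64_t fnv1a(std::uint64_t state, const std::string& data) {
    for (unsigned char c : data) { state ^= c; state *= 1099511628211ULL; }
    return state;
}

// Hypothetical "signer": nonce followed by the decimal checksum of nonce || msg.
inline std::string toy_sign(char nonce, const std::string& msg) {
    const std::string n(1, nonce);
    return n + std::to_string(fnv1a(fnv1a(kFnvInit, n), msg));
}

class ToyVerifier {
public:
    void start() {}  // basically a no-op

    // Signature up-front: only allowed before any message bytes.
    void start(const std::string& signature) {
        if (m_seen_message) throw std::logic_error("start(signature) must precede update()");
        if (m_have_sig) throw std::logic_error("signature already set");
        if (signature.empty()) throw std::invalid_argument("empty signature");
        m_sig = signature;
        m_have_sig = true;
        m_state = fnv1a(kFnvInit, signature.substr(0, 1));  // absorb the nonce first
    }

    void update(const std::string& chunk) {
        m_seen_message = true;
        if (m_have_sig) m_state = fnv1a(m_state, chunk);  // stream
        else m_buffer += chunk;                           // must buffer
    }

    // Current-style API: signature arrives at the very end.
    bool check_signature(const std::string& signature) {
        if (m_have_sig) throw std::logic_error("signature was already given to start()");
        bool ok = false;
        if (!signature.empty()) {
            const auto s = fnv1a(fnv1a(kFnvInit, signature.substr(0, 1)), m_buffer);
            ok = signature.substr(1) == std::to_string(s);
        }
        reset();
        return ok;
    }

    // Uses the signature given to start(); throwing is an assumption of this sketch.
    bool check_signature() {
        if (!m_have_sig) throw std::logic_error("no signature provided via start()");
        const bool ok = m_sig.substr(1) == std::to_string(m_state);
        reset();
        return ok;
    }

    std::size_t buffered_bytes() const { return m_buffer.size(); }

private:
    void reset() { m_sig.clear(); m_buffer.clear(); m_have_sig = m_seen_message = false; m_state = kFnvInit; }

    std::string m_sig, m_buffer;
    bool m_have_sig = false, m_seen_message = false;
    std::uint64_t m_state = kFnvInit;
};
```

An application that has the signature early calls start(sig), then update() repeatedly, then check_signature(); one that gets it late skips start() and calls check_signature(sig) as today, at the cost of buffering.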

This seems really complicated tbh.

Yeah, I felt the same way. Your suggestion to allow providing the signature either at the very beginning or the very end makes sense and should simplify the implementation somewhat. Though, for Dilithium or Ed25519 we'd still need to buffer the update calls when the user failed to provide the signature up-front. Enforcing up-front signatures for such algorithms (e.g. by throwing) is not an option either, given that signatures might not be available beforehand for some applications.

I'm mostly concerned about usability/discoverability of this API. Users will need to read and understand the difference and then do the right thing for the right algorithm. :(

One more wild idea: Provide two flavours of PK_Verifier -- PK_Verifier_Prefix_Signature and PK_Verifier_Postfix_Signature (names TBD), like so:

class PK_Verifier_Prefix_Signature
{
public:
  PK_Verifier_Prefix_Signature(const Public_Key& pub_key,
                               const std::vector<uint8_t>& signature, // <= signature is required for construction
                               const std::string& emsa,
                               Signature_Format format = IEEE_1363,
                               const std::string& provider = "");

  void update(...);
  bool check_signature(); // <= no more methods taking a signature in the end

  // ...
};

class PK_Verifier_Postfix_Signature
{
public:
  // basically same API as the current PK_Verifier

  PK_Verifier_Postfix_Signature(const Public_Key& pub_key,
                                const std::string& emsa,
                                Signature_Format format = IEEE_1363,
                                const std::string& provider = "");

  void update(...);
  bool check_signature(const uint8_t sig[], size_t length);

  // ...
};

That won't make the implementation of the internal PK_Ops::Verification any easier (to buffer, or not to buffer), but at least the user-facing API is clearly split into the two possible strategies and hopefully inspires people to read the docs.

randombit / botan

XMSS/EdDSA: Streamed Verification of Large Messages #3039