golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: hash: add XOF interface #69518

Open FiloSottile opened 2 months ago

FiloSottile commented 2 months ago

Background

An extendable output function (XOF) is a hash function with arbitrary or unlimited output length. XOFs are very useful for tasks like key derivation, random number generation, and even encryption.

We already have two (or rather three) XOFs in x/crypto: SHAKE in x/crypto/sha3, and BLAKE2X in x/crypto/blake2b and x/crypto/blake2s. Among third-party modules, at least KangarooTwelve in github.com/cloudflare/circl/xof/k12 and BLAKE3 in lukechampine.com/blake3 and github.com/zeebo/blake3 see some use.

The SHAKE XOFs return a ShakeHash interface.

type ShakeHash interface {
    hash.Hash

    // Read reads more output from the hash; reading affects the hash's
    // state. (ShakeHash.Read is thus very different from Hash.Sum)
    // It never returns an error, but subsequent calls to Write or Sum
    // will panic.
    io.Reader

    // Clone returns a copy of the ShakeHash in its current state.
    Clone() ShakeHash
}

The BLAKE2X XOFs return a blake2[bs].XOF interface.

type XOF interface {
    // Write absorbs more data into the hash's state. It may panic if called
    // after Read.
    io.Writer

    // Read reads more output from the hash. It returns io.EOF if the limit
    // has been reached.
    io.Reader

    // Clone returns a copy of the XOF in its current state.
    Clone() XOF

    // Reset resets the XOF to its initial state.
    Reset()
}

Proposal

Important: the current proposal is at https://github.com/golang/go/issues/69518#issuecomment-2429048538.

Having a standard library interface for XOFs would help prevent fragmentation and help building modular higher-level implementations (although deployments should generally select one concrete implementation).

package hash

type XOF interface {
    // Write absorbs more data into the XOF's state. It panics if called
    // after Read.
    io.Writer

    // Read reads more output from the XOF. It may return io.EOF if there
    // is a limit to the XOF output length.
    io.Reader

    // Reset resets the XOF to its initial state.
    Reset()
}
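
To make the "modular higher-level implementations" point concrete, here is a minimal sketch of a consumer written only against the proposed interface. Everything in it is an assumption for illustration: XOF is a local copy of the interface above, toyXOF is a hypothetical SHA-256 counter-mode stand-in (not a vetted construction and not any package's API), and deriveKeys is an invented helper.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"io"
)

// XOF is a local copy of the interface proposed above, declared here so the
// sketch compiles without the proposal being accepted.
type XOF interface {
	io.Writer
	io.Reader
	Reset()
}

// toyXOF is a hypothetical stand-in for a real XOF such as SHAKE. It hashes
// the absorbed input with SHA-256 and expands the result in counter mode.
// Illustration only: this is NOT a vetted construction.
type toyXOF struct {
	in      []byte   // input absorbed so far
	seed    [32]byte // SHA-256 of in, fixed at the first Read
	counter uint64   // index of the next output block
	buf     []byte   // unread bytes of the current output block
	reading bool     // set once Read has been called
}

func (x *toyXOF) Write(p []byte) (int, error) {
	if x.reading {
		panic("toyXOF: Write after Read")
	}
	x.in = append(x.in, p...)
	return len(p), nil
}

func (x *toyXOF) Read(p []byte) (int, error) {
	if !x.reading {
		x.seed = sha256.Sum256(x.in)
		x.reading = true
	}
	n := 0
	for n < len(p) {
		if len(x.buf) == 0 {
			// The next output block is SHA-256(seed || counter).
			var block [40]byte
			copy(block[:], x.seed[:])
			binary.BigEndian.PutUint64(block[32:], x.counter)
			sum := sha256.Sum256(block[:])
			x.buf = sum[:]
			x.counter++
		}
		c := copy(p[n:], x.buf)
		x.buf = x.buf[c:]
		n += c
	}
	return n, nil
}

func (x *toyXOF) Reset() { *x = toyXOF{} }

// deriveKeys is a hypothetical consumer that depends only on the XOF
// interface: it absorbs a secret and reads back keys of the requested sizes.
func deriveKeys(x XOF, secret []byte, sizes ...int) ([][]byte, error) {
	x.Reset()
	if _, err := x.Write(secret); err != nil {
		return nil, err
	}
	keys := make([][]byte, len(sizes))
	for i, size := range sizes {
		keys[i] = make([]byte, size)
		// ReadFull reports an error if a bounded XOF runs out of output.
		if _, err := io.ReadFull(x, keys[i]); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

func main() {
	keys, err := deriveKeys(&toyXOF{}, []byte("secret"), 16, 32, 64)
	if err != nil {
		panic(err)
	}
	for _, k := range keys {
		fmt.Printf("%d-byte key starting %x\n", len(k), k[:4])
	}
}
```

Because deriveKeys only sees the interface, swapping toyXOF for a SHAKE or BLAKE2X value should require no change to the consumer.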

Notes

The proposed interface is a subset of the two existing ones, so values from those packages can be reused. It is also compatible with the K12 implementation. https://go.dev/play/p/AtvfO8Tkbgp
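
The subset claim can be checked at compile time. The sketch below is not the linked playground; it uses local copies of the three interfaces quoted in this issue (the names Blake2XOF, fromShake, and fromBlake2 are made up here) so that it builds with only the standard library.

```go
package main

import (
	"fmt"
	"hash"
	"io"
)

// XOF is a local copy of the proposed hash.XOF.
type XOF interface {
	io.Writer
	io.Reader
	Reset()
}

// ShakeHash is a local copy of the x/crypto/sha3 interface quoted above.
type ShakeHash interface {
	hash.Hash
	io.Reader
	Clone() ShakeHash
}

// Blake2XOF is a local copy of the blake2[bs].XOF interface quoted above,
// renamed to avoid clashing with the proposed XOF.
type Blake2XOF interface {
	io.Writer
	io.Reader
	Clone() Blake2XOF
	Reset()
}

// Compile-time checks: both existing interfaces have Write, Read, and Reset,
// so their values are assignable to XOF without a type assertion.
var (
	_ XOF = ShakeHash(nil)
	_ XOF = Blake2XOF(nil)
)

// fromShake and fromBlake2 are hypothetical helpers showing the implicit
// interface-to-interface conversion.
func fromShake(s ShakeHash) XOF   { return s }
func fromBlake2(b Blake2XOF) XOF { return b }

func main() {
	fmt.Println("ShakeHash and blake2 XOF values convert to XOF implicitly")
}
```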

Sum and Size (from ShakeHash) are not included because XOFs don't necessarily have a "default" output size. BlockSize might potentially be useful but depends on the implementation anyway, and is not worth breaking compatibility with blake2[bs].XOF.

Clone is not included because the existing interfaces return an interface type from it. (Maybe this would have been doable with generics if x/crypto/sha3 and x/crypto/blake2[bs] returned concrete implementations rather than interfaces, but we don't want to make every use of hash.XOF generic anyway.) I will file a separate proposal to add hash.Clone and hash.CloneXOF as helper functions.

Note however that the BLAKE3 implementations differ in that they return the Reader from a method on the Writer. This is probably to allow interleaving Write and Read calls.

h := blake3.New()
h.Write([]byte("foo"))
d := h.Digest()
h.Write([]byte("bar"))
d.Read(...) // won't include bar

As long as we add hash.CloneXOF or expose Clone on the underlying XOF implementations (which both ShakeHash and blake2[bs].XOF do), cloning can be used to the same effect (with a little less compile-time safety).

h := spiffyxof.New()
h.Write([]byte("foo"))
d := h.Clone()
h.Write([]byte("bar"))
d.Read(...)
// careful not to call d.Write

/cc @golang/security @cpu

cpu commented 2 months ago

Having a standard library interface for XOFs would help prevent fragmentation and help building modular higher-level implementations

+1

Sum and Size (from ShakeHash) are not included because XOFs don't necessarily have a "default" output size. BlockSize might potentially be useful but depends on the implementation anyway, and is not worth breaking compatibility with blake2[bs].XOF.

This makes sense to me as an argument for why both should be avoided here.

Note however that the BLAKE3 implementations differ in that they return the Reader from a method on the Writer. This is probably to allow interleaving Write and Read calls.

It took me a minute to map this description to the concrete APIs. For lukechampine.com/blake3, the Hasher returned from New() is used for writing and has an XOF() method that yields an OutputReader. For zeebo/blake3, the Hasher has a Digest() method that yields a Digest.

As long as we add hash.CloneXOF or expose Clone on the underlying XOF implementations (which both ShakeHash and blake2[bs].XOF do), cloning can be used to the same effect (with a little less compile-time safety).

Also sounds reasonable to me but is it necessary/preferred to split the proposal to add the XOF interface from the proposal to support cloning helpers? Perhaps there's a policy reason for doing this I'm overlooking? From my perspective it seems like the ability to clone intermediate state ends up being important to certain use-cases for the XOF interface.

For instance in the proposed docstring for the embedded io.Writer interface you wrote:

// It panics if called after Read.

I think this would be a place where it would be beneficial to point to the clone helper to support the interleaved read/write use-case workaround from the spiffyxof example.

FiloSottile commented 2 months ago

Also sounds reasonable to me but is it necessary/preferred to split the proposal to add the XOF interface from the proposal to support cloning helpers?

I didn't want to condition hash.XOF on hash.Clone and vice-versa. If hash.XOF is accepted but hash.Clone is not, having hash.CloneXOF would be weird, so I tied hash.CloneXOF to hash.Clone. XOFs can still be cloned by dropping down to the underlying implementation, like in the spiffyxof example. (Actually, calling the underlying Clone reads better because it involves no error handling.)

seankhliao commented 2 months ago

The interface is fairly generic, I think bytes.Buffer would also satisfy it?

FiloSottile commented 2 months ago

The interface is fairly generic, I think bytes.Buffer would also satisfy it?

Yes, agreed. I'm afraid we have to either make it generic like this, or lose ShakeHash and blake2.XOF compatibility, since we can't expand those because they are interfaces, not concrete types.

FiloSottile commented 1 month ago

Had some time to stew over this. New proposal, with a BlockSize method to make it more specific.

package hash

type XOF interface {
    // Write absorbs more data into the XOF's state. It panics if called
    // after Read.
    io.Writer

    // Read reads more output from the XOF. It may return io.EOF if there
    // is a limit to the XOF output length.
    io.Reader

    // Reset resets the XOF to its initial state.
    Reset()

    // BlockSize returns the XOF's underlying block size.
    // The Write method must be able to accept any amount
    // of data, but it may operate more efficiently if all writes
    // are a multiple of the block size.
    BlockSize() int
}

This is a superset of blake2.XOF, but we can add a BlockSize method to the underlying implementation, and document that blake2.XOFs can be safely interface-upgraded to hash.XOF.
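
A small sketch of what the revision changes, under stated assumptions: XOFv1 and XOFv2 are local copies of the first and revised proposals, legacyXOF is a local copy of blake2[bs].XOF, and fakeXOF is a hypothetical implementation (backed by bytes.Buffer, so it just echoes its input and is not a real XOF) that has gained a BlockSize method. It shows that bytes.Buffer satisfies the first interface but not the revised one, and that a value held as the older interface can be upgraded with a type assertion.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// XOFv1 is a local copy of the first proposed interface.
type XOFv1 interface {
	io.Writer
	io.Reader
	Reset()
}

// XOFv2 is a local copy of the revised interface with BlockSize.
type XOFv2 interface {
	XOFv1
	BlockSize() int
}

// legacyXOF is a local copy of blake2[bs].XOF, which has no BlockSize.
type legacyXOF interface {
	io.Writer
	io.Reader
	Clone() legacyXOF
	Reset()
}

// bytes.Buffer has Write, Read, and Reset, so it satisfies the first proposal.
var _ XOFv1 = (*bytes.Buffer)(nil)

// fakeXOF is a hypothetical implementation whose concrete type has BlockSize.
// Write, Read, and Reset are promoted from the embedded buffer.
type fakeXOF struct{ bytes.Buffer }

func (f *fakeXOF) Clone() legacyXOF {
	c := &fakeXOF{}
	c.Write(f.Bytes())
	return c
}

// BlockSize returns an arbitrary value chosen for this sketch.
func (f *fakeXOF) BlockSize() int { return 64 }

// upgrade attempts the interface upgrade described above.
func upgrade(x any) (XOFv2, bool) {
	v, ok := x.(XOFv2)
	return v, ok
}

// demo reports whether a bytes.Buffer and a legacyXOF value upgrade to XOFv2.
func demo() (bufferOK, legacyOK bool) {
	_, bufferOK = upgrade(&bytes.Buffer{})
	var old legacyXOF = &fakeXOF{}
	_, legacyOK = upgrade(old)
	return bufferOK, legacyOK
}

func main() {
	bufferOK, legacyOK := demo()
	fmt.Println("bytes.Buffer satisfies revised interface:", bufferOK)
	fmt.Println("legacy XOF value upgrades to revised interface:", legacyOK)
}
```

The upgrade succeeds only because the dynamic type behind the old interface has the extra method, which is why the comment above pairs it with adding BlockSize to the underlying implementation.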

aclements commented 1 month ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings.

FiloSottile commented 1 month ago

Depending on how #69521 goes, we might decide to add a Clone method too. It would break compatibility with x/crypto/sha3 and x/crypto/blake2[bs], but at the end of the day we can say "use the new stdlib packages if you need the interface" given #65269. Like #69521, this is not urgent for Go 1.24 as long as it doesn't influence #69982.

aclements commented 3 weeks ago

Is XOF a well-known term? Could we use a more readable name like hash.Extendable?

rsc commented 2 weeks ago

Talked to @FiloSottile about this and we agreed to leave this for Go 1.25.

aclements commented 5 days ago

Is XOF a well-known term? Could we use a more readable name like hash.Extendable?

Talked with @rolandshoemaker and it sounds like "XOF" is the industry-standard term for this, so we should probably stick with that.

Another question is, do we actually need a defined interface for this? It would be nice if we had some concrete consumers of this interface. On the other hand, this clearly parallels hash.Hash, which has many uses, so I don't think this is a big concern.