golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.58k stars 17.61k forks source link

proposal: crypto: support 3rd-party hash algorithms #69892

Open cs-bic opened 3 days ago

cs-bic commented 3 days ago

Proposal Details

crypto/rsa.SignPSS() and similar packages take a crypto.Hash, which is hard-coded to only support known-to-Go hash algorithms. This is a total show-stopper for me, because I need to be able to support new, not-yet-existing hash algorithms. I propose a new implementation for src/crypto/crypto.go:

// ...

type customImplementation struct {
    name string

    // int is used rather than uint8 of digestSizes in order to support future
    // hash functions with large digests.
    digestSize int

    f func() hash.Hash
}

var customImplementations = make(map[Hash]customImplementation)

func (h Hash) String() string {
    switch h {
    // ...
    default:
        if implementation, exists := customImplementations[h]; exists {
            return implementation.name
        }
        return "unknown hash value " + strconv.Itoa(int(h))
    }
}

// ...

// Defines the limit of 1st-party hash functions.
const boundaryFirstParty Hash = 101

// ...

// Size returns the length, in bytes, of a digest resulting from the given hash
// function. It doesn't require that the hash function in question be linked
// into the program for hash functions known to Go authors. It does require
// that the hash function in question be linked into the program for 3rd-party
// hash functions.
func (h Hash) Size() int {
    if h < 1 || h == maxHash {
        panic("crypto: Size of unknown hash function")
    }
    if h < maxHash {
        return int(digestSizes[h])
    }
    if implementation, exists := customImplementations[h]; exists {
        return implementation.digestSize
    }
    panic("crypto: Size of unknown hash function")
}

// ...

// New returns a new hash.Hash calculating the given hash function. New panics
// if the hash function is not linked into the binary.
func (h Hash) New() hash.Hash {
    if h < 1 || h == maxHash {
        panic("crypto: requested hash function #" + strconv.Itoa(int(h)) + " is unavailable")
    }
    if h < maxHash {
        f := hashes[h]
        if f != nil {
            return f()
        }
    }
    if implementation, exists := customImplementations[h]; exists {
        return implementation.f()
    }
    panic("crypto: requested hash function #" + strconv.Itoa(int(h)) + " is unavailable")
}

// Available reports whether the given hash function is linked into the binary.
func (h Hash) Available() bool {
    if h < maxHash && hashes[h] != nil {
        return true
    }
    if _, exists := customImplementations[h]; exists {
        return true
    }
    return false
}

// ...

// RegisterCustomHash registers a function that returns a new instance of the given
// 3rd-party hash function. This is intended to be called from the init function in
// packages that implement hash functions. Registering hash functions outside of an
// init function can cause thread-unsafe behaviors.
func RegisterCustomHash(h Hash, name string, digestSize int, f func() hash.Hash) error {
    if h <= boundaryFirstParty {
        return errors.New("crypto: RegisterCustomHash within 1st-party region")
    }
    if len(name) == 0 {
        return errors.New("crypto: RegisterCustomHash of hash function without a name")
    }
    if digestSize < 1 {
        return errors.New("crypto: RegisterCustomHash of hash function with empty digests")
    }
    if f == nil {
        return errors.New("crypto: RegisterCustomHash of hash function without an allocator")
    }
    if _, exists := customImplementations[h]; exists {
        return errors.New("crypto: RegisterCustomHash of an existing hash function")
    }
    customImplementations[h] = customImplementation{
        name: name,
        digestSize: digestSize,
        f: f,
    }
    return nil
}

// ...

(note that I would submit a pull request, but I am currently on a very shoddy hotspot internet connection that prevents me from performing the necessary tasks to do so.)

There is of course an issue with this proposal: 3rd-party hash algorithms might use the same value for crypto.Hash, which would lead to errors. Ideally we would change the type definition of crypto.Hash to something more sensible than a uint, but this would break packages that use the uint to identify hash algorithms. Perhaps a checksum of the hash algorithm's name to produce a uint would be better?

gabyhelp commented 3 days ago

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

ianlancetaylor commented 3 days ago

CC @golang/security

mateusz834 commented 3 days ago

From your docs it seems like the RegisterCustomHash accepts a Hash parametr, so the caller has to decide on the value. What if two packages use the same one? I don't think that returning an error is okay, we should return a new Hash instead, from an auto-incrementing global variable.

cs-bic commented 3 days ago

From your docs it seems like the RegisterCustomHash accepts a Hash parametr, so the caller has to decide on the value. What if two packages use the same one? I don't think that returning an error is okay, we should return a new Hash instead, from an auto-incrementing global variable.

I mentioned the possibility of this issue at the end of my proposal. An auto-incrementing counter would guarantee uniqueness but would cause issues across machines with differing loaded 3rd-party hash algorithms when using the uint as an identifier.

cs-bic commented 3 days ago

Perhaps crypto.Hash could be made into a uint64 and 3rd-party hash algorithms would have their names be checked to be ASCII-encoded; the first seven bytes of the name could be used to create a uint64 (with zerod-bytes being used if all seven bytes are not available), with one byte being reserved for 1st-party hash algorithms. I do not know how much this would affect backwards-compatibility.

seankhliao commented 3 days ago

Can you remove the parts that are unchanged? it makes it difficult to identify the actual proposed changes.

cs-bic commented 3 days ago

Can you remove the parts that are unchanged? it makes it difficult to identify the actual proposed changes.

Done, // ... is used to indicate unchanged content.

Jorropo commented 3 days ago

Note, I edited the original post:

-```
+```go

This adds syntax highlighting.

rolandshoemaker commented 3 days ago

This is an extremely powerful feature, which seems extremely easy to misuse if you aren't really sure what you're doing, and seems likely to be of value to a very small number of people. As such I'm not sure it's something we should consider adding.

SignPSS is generally hash agnostic, but there are a large number of places where we do more complex things based on which hash was used, and expect it to be one of our predefined hashes. Documenting where this would work, and wouldn't, seems like it'd be rather complicated, and add additional overhead, with questionable value to the vast majority of Go users.

You say you "need to be able to support new, not-yet-existing hash algorithms", can you provide some more context for this? Are you experimenting with proposed hashes, or is this necessary for production systems? It seems like this is likely to be a case where simply working off a fork, with your custom hashes implemented, is probably the best option.

cs-bic commented 3 days ago

which seems extremely easy to misuse if you aren't really sure what you're doing

crypto.RegisterHash() is also easy to misuse (the documentation does not mention that it is only intended to grant access to the defined hash algorithms).

seems likely to be of value to a very small number of people

I agree, but Go has been regarded as a crypto-friendly language, and having limitations like this can be off-putting, even if done with good intentions.

but there are a large number of places where we do more complex things based on which hash was used, and expect it to be one of our predefined hashes

I understand there are places where custom hash algorithms cannot work, but there are also places where they can, and not supporting those using standardized interfaces raises the difficulty of building certain things in Go.

can you provide some more context for this?

Mostly experimentation with a virtual machine that can receive crypto-related algorithm updates validated via already-known signature algorithms. The idea is to support extreme longevity when traditional system updates are not possible. I understand this is very exotic and that a custom implementation of RSA or whatever could be written off-the-bat, but bootstrapping with crypto/rsa is a bit easier; if a new hash algorithm comes out and using it with RSA is desired, then interaction with the bootstrapped crypto/rsa demands being able to customize crypto.Hash.

It seems like this is likely to be a case where simply working off a fork, with your custom hashes implemented, is probably the best option.

I guess this makes sense.

seankhliao commented 2 days ago

but Go has been regarded as a crypto-friendly language

I believe the reputation comes from making it difficult to misuse, not being from extensible.