tink-crypto / tink

Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.
https://developers.google.com/tink
Apache License 2.0
13.47k stars 1.18k forks source link

Vault Regexp does not allow for dashes (or leading underscore). #531

Closed OriPekelman closed 1 year ago

OriPekelman commented 3 years ago

Help us help you

Describe the bug Using Tink with Hashicorp Vault (hcvault) would fail if the vault host name contains dashes.

To Reproduce

Given

VAULT_ADDR= hcvault://cuddly-fish-61.example.com

Expected behavior

Client works

Error messages, stack traces, etc.

keyset.Handle: encrypted failed: malformed keyURL

The problem is at https://github.com/google/tink/blob/dca70f855d483d4350e222e2754db2f9e63c9ecb/go/integration/hcvault/hcvault_aead.go#L118

with:

var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:]+)/(.*)$", vaultPrefix))

Which is too restrictive and does not allow for dash correcting it to:

var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:-]+)/(.*)$", vaultPrefix))

Should solve the issue. And for extra points in case someone wants to use an RFC compliant host name with a leading underscore...

var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/ ?_*([a-zA-Z0-9.:-]+)/(.*)$", vaultPrefix))

Should cover the case.

Version information

Additional comment

I am not sure it is a good idea to have a custom scheme here ... but at any rate it would probably be better to import net/url and use that for validation rather than to rely on an ad-hoc regex. I'll be happy to open a PR for either option (just correcting the Regex, or changing the implementation to use net/url.

theory commented 2 years ago

Patch:

--- a/go/integration/hcvault/hcvault_aead.go
+++ b/go/integration/hcvault/hcvault_aead.go
@@ -115,7 +115,7 @@ func (a *vaultAEAD) getDecryptionPath(keyURL string) (string, error) {
    return strings.Join(parts, "/"), nil
 }

-var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:]+)/(.*)$", vaultPrefix))
+var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:-]+)/(.*)$", vaultPrefix))

 func (a *vaultAEAD) extractKey(keyURL string) (string, error) {
    m := vaultKeyRegex.FindAllStringSubmatch(keyURL, -1)
theory commented 1 year ago

Took some time to implement three different options, including tests. Which one should we use? I'd be happy to submit a PR with the consensus choice.

import (
    "errors"
    "net/url"
    "regexp"
    "testing"
)

const (
    vaultScheme = "hcvault"
    vaultPrefix = vaultScheme + "://"
)

// extractKeyURI uses the URI package to extract the path.
func extractKeyURI(keyURL string) (string, error) {
    u, err := url.Parse(keyURL)
    if err != nil || u.Scheme != "hcvault" || len(u.Path) == 0 {
        return "", errors.New("malformed keyURL")
    }
    return u.EscapedPath()[1:], nil
}

// extractKeyRegex uses a comprehensive host regular expression to extract the
// path. Regex from https://www.regextester.com/23, as found in
// https://stackoverflow.com/a/56351327/79202.
var vaultURLRegex = regexp.MustCompile("^" + vaultPrefix + `(?:(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9])\.)*(?:[A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]))?/(.*)$`)

func extractKeyRegex(keyURL string) (string, error) {
    m := vaultURLRegex.FindAllStringSubmatch(keyURL, -1)
    if m == nil {
        return "", errors.New("malformed keyURL")
    }
    return m[0][1], nil
}

// extractKeyDumbRegex uses the simplest possible host regex to extract the key,
// based on the assumption that the host name is validated elsewhere (e.g., when
// the client actually makes a call to the vault service).
var vaultKeyDumbRegex = regexp.MustCompile("^" + vaultPrefix + `[^/]*/(.*)$`)

func extractKeyDumbRegex(keyURL string) (string, error) {
    m := vaultKeyDumbRegex.FindAllStringSubmatch(keyURL, -1)
    if m == nil {
        return "", errors.New("malformed keyURL")
    }
    return m[0][1], nil
}

func TestExtractKey(t *testing.T) {
    for _, tc := range []struct {
        desc string
        uri  string
        key  string
        err  string
    }{
        {
            desc: "simple",
            uri:  "hcvault://vault.example.com/hi",
            key:  "hi",
        },
        {
            desc: "path",
            uri:  "hcvault://vault.example.com/foo/bar/baz",
            key:  "foo/bar/baz",
        },
        {
            desc: "hyphen host",
            uri:  "hcvault://vault-prd.example.com/coyote",
            key:  "coyote",
        },
        {
            desc: "empty string",
            uri:  "hcvault://example.com/",
            key:  "",
        },
        {
            desc: "escaped",
            uri:  "hcvault://vault.example.com/this%2Band+that",
            key:  "this%2Band+that",
        },
        {
            desc: "no host",
            uri:  "hcvault:///hi",
            key:  "hi",
        },
        {
            desc: "http",
            uri:  "http://vault.com/hi",
            err:  "malformed keyURL",
        },
        {
            desc: "no path",
            uri:  "hcvault://vault.com",
            err:  "malformed keyURL",
        },
        {
            desc: "slash only",
            uri:  "hcvault://vault.com/",
            key:  "",
        },
        {
            desc: "",
            uri:  "hcvault://vault.com/",
            key:  "",
        },
    } {
        for name, code := range map[string]func(keyURL string) (string, error){
            "url parse":  extractKeyURI,
            "regex":      extractKeyRegex,
            "dumb regex": extractKeyDumbRegex,
        } {
            t.Run(name+"/"+tc.desc, func(t *testing.T) {
                key, err := code(tc.uri)
                if err == nil {
                    if tc.err != "" {
                        t.Fatalf("Missing error, want=%s", tc.err)
                    }
                } else {
                    if tc.err != err.Error() {
                        t.Fatalf("Incorrect error, want=%s;got=%s", tc.err, err)
                    }
                }
                if key != tc.key {
                    t.Fatalf("Incorrect key, want=%s;got=%s", tc.key, key)
                }
            })
        }
    }
}
juergw commented 1 year ago

I think it would be best to use "net/url" instead of a regex. We already use net/url in hcvault_client.go anyways.

juergw commented 1 year ago

Could you update the pull request? Thanks.

theory commented 1 year ago

I submitted the regex check because a full URI parse seems redundant, since the URL will actually be used in an HTTP request and either be accepted by vault or not. You sure you want that instead? If so I will update the PR, yes.

chuckx commented 1 year ago

On top of being easier to read/maintain, using net/url seems to be a bit more performant. So we should definitely go with it.

$ go test -bench=.
goos: darwin
goarch: arm64
pkg: urlparsetest
BenchmarkExtractKeyURI-10            5756019           188.7 ns/op
BenchmarkExtractKeyRegex-10          1845735           649.7 ns/op
BenchmarkExtractKeyDumbRegex-10      2879150           417.1 ns/op
PASS
ok      urlparsetest    5.165s

This output is from adding the following benchmark functions alongside your code provided above:

const testURI = "hcvault://vault.example.com/hi"

func BenchmarkExtractKeyURI(b *testing.B) {
    for i := 0; i <= b.N; i++ {
        extractKeyURI(testURI)
    }
}

func BenchmarkExtractKeyRegex(b *testing.B) {
    for i := 0; i <= b.N; i++ {
        extractKeyRegex(testURI)
    }
}

func BenchmarkExtractKeyDumbRegex(b *testing.B) {
    for i := 0; i <= b.N; i++ {
        extractKeyDumbRegex(testURI)
    }
}
theory commented 1 year ago

Works for me, updated in 52515c2.

theory commented 1 year ago

PR submitted in #640.