smallstep / certificates

🛡️ A private certificate authority (X.509 & SSH) & ACME server for secure automated certificate management, so you can use TLS everywhere & SSO for SSH.
https://smallstep.com/certificates
Apache License 2.0
6.8k stars 445 forks source link

[Bug]: Step CA connects to IPs instead of KMS URL #2001

Open GBBx opened 2 months ago

GBBx commented 2 months ago

Steps to Reproduce

  1. start step-ca

    step-ca ...ca.json --password-file ...password
  2. get error message

    context deadline exceeded
    cloudKMS GetPublicKey failed
    go.step.sm/crypto/kms/cloudkms.(*Signer).preloadKey
        go.step.sm/crypto@v0.51.2/kms/cloudkms/signer.go:46
    go.step.sm/crypto/kms/cloudkms.NewSigner
        go.step.sm/crypto@v0.51.2/kms/cloudkms/signer.go:31
    go.step.sm/crypto/kms/cloudkms.(*CloudKMS).CreateSigner
        go.step.sm/crypto@v0.51.2/kms/cloudkms/cloudkms.go:162
    github.com/smallstep/certificates/authority.(*instrumentedKeyManager).CreateSigner
        github.com/smallstep/certificates/authority/meter.go:86
    github.com/smallstep/certificates/authority.(*Authority).init
        github.com/smallstep/certificates/authority/authority.go:426
    github.com/smallstep/certificates/authority.New
        github.com/smallstep/certificates/authority/authority.go:149
    github.com/smallstep/certificates/ca.(*CA).Init
        github.com/smallstep/certificates/ca/ca.go:200
    github.com/smallstep/certificates/ca.New
        github.com/smallstep/certificates/ca/ca.go:162
    github.com/smallstep/certificates/commands.appAction
        github.com/smallstep/certificates/commands/app.go:254
    github.com/urfave/cli.HandleAction
        github.com/urfave/cli@v1.22.15/app.go:524
    github.com/urfave/cli.Command.Run
        github.com/urfave/cli@v1.22.15/command.go:175
    main.main.func3
        ./main.go:202
    github.com/urfave/cli.HandleAction
        github.com/urfave/cli@v1.22.15/app.go:524
    github.com/urfave/cli.(*App).Run
        github.com/urfave/cli@v1.22.15/app.go:286
    main.main
        ./main.go:205
    runtime.main
        runtime/proc.go:272
    runtime.goexit
        runtime/asm_amd64.s:1700

Your Environment

Expected Behavior

step-ca should start

Actual Behavior

step-ca crashes

Additional Context

I use a corporate proxy. Step-ca runs as a systemd service and the HTTPS_PROXY variable is set in the unit file. On the proxy, the necessary KMS URLs are whitelisted, e.g. cloudkms.googleapis.com.

Since an update from 0.27.2 to 0.27.4 I see in the proxy logs that step-ca is trying to connect to IPs instead of the URL. I.e. if I do nslookup cloudkms.googleapis.com - these are the IPs I see in the proxy logs.

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

GBBx commented 2 months ago

My kms plugin was also updated from 0.11.4 to 0.11.5 at the same time as step-ca.

Edit: kms plugin 0.11.5 with step-ca 0.27.2 works.

hslatman commented 2 months ago

Hey @GBBx, thank you for opening the issue.

Do you have example logs of your proxy? Based on some quick triage our verdict was that we didn't change code related to handling HTTP with or without proxies directly, at least not knowingly. It's possible that a dependency upgrade in a lower level has changed the behavior, though.

I did a quick test with both the CA as well as the KMS plugin, but I didn't hit your problem. Note that the KMS plugin isn't used by the CA, but it does use shared code from our crypto package for communicating with GCP Cloud KMS, so they should behave fairly similar.

Is it possible your usage of the KMS plugin doesn't go through the proxy? Are you executing the CA and the KMS from the same machine, using the same user and the same GCP credentials?

GBBx commented 2 months ago

Hi @hslatman,

thanks for your questions.

Do you have example logs of your proxy?

Yes. Here's what I see on the proxy using step-ca 0.27.2 - a single line when step-ca is started:

1726603815.040 118996 X.X.X.X TCP_TUNNEL/200 5831 CONNECT cloudkms.googleapis.com:443 - HIER_DIRECT/216.58.212.138 -

And here's what I get with step-ca 0.27.4:

...
1726603992.985    390 X.X.X.X TCP_DENIED/403 3877 CONNECT 142.250.186.74:443 - HIER_NONE/- text/html
1726603993.149    162 X.X.X.X TCP_DENIED/403 3877 CONNECT 172.217.18.106:443 - HIER_NONE/- text/html
1726603993.232     82 X.X.X.X TCP_DENIED/403 3877 CONNECT 142.250.185.74:443 - HIER_NONE/- text/html
1726603993.272     39 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.106:443 - HIER_NONE/- text/html
1726603993.347     74 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.138:443 - HIER_NONE/- text/html
1726603993.393     44 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.170:443 - HIER_NONE/- text/html
1726603995.330   1936 X.X.X.X TCP_DENIED/403 3874 CONNECT 216.58.206.42:443 - HIER_NONE/- text/html
...

Is it possible your usage of the KMS plugin doesn't go through the proxy? Are you executing the CA and the KMS from the same machine, using the same user and the same GCP credentials?

It definitely goes through the proxy. I cannot connect to GCP if it is not through the proxy. I think I have a quite simple setup: step-ca and the kms plugin installed on the same machine, there's only one service account in use. I just followed https://smallstep.com/docs/step-ca/cryptographic-protection/#google-cloud-kms step by step and didn't add anything. Note, the same setup works if I downgrade to 0.27.2.

hslatman commented 2 months ago

Can you run the CA and KMS with GODEBUG=http2debug=2 set?

E.g. for KMS: GODEBUG=http2debug=2 step kms key --kms cloudkms <key>

GBBx commented 2 months ago

@hslatman

GODEBUG=http2debug=2 /usr/bin/step-ca /........ --password-file /.....
badger 2024/09/18 12:52:04 INFO: All 3 tables opened in 10ms
badger 2024/09/18 12:52:04 INFO: Replaying file id: 1 at offset: 336061984
badger 2024/09/18 12:52:04 INFO: Replay took: 332.121µs
cloudKMS GetPublicKey failed: context deadline exceeded

I guess the badger... lines are not relevant.

If I add STEPDEBUG=1 it says:

context deadline exceeded
cloudKMS GetPublicKey failed
go.step.sm/crypto/kms/cloudkms.(*Signer).preloadKey
        go.step.sm/crypto@v0.51.2/kms/cloudkms/signer.go:46
go.step.sm/crypto/kms/cloudkms.NewSigner
        go.step.sm/crypto@v0.51.2/kms/cloudkms/signer.go:31
go.step.sm/crypto/kms/cloudkms.(*CloudKMS).CreateSigner
        go.step.sm/crypto@v0.51.2/kms/cloudkms/cloudkms.go:162
github.com/smallstep/certificates/authority.(*instrumentedKeyManager).CreateSigner
        github.com/smallstep/certificates/authority/meter.go:86
github.com/smallstep/certificates/authority.(*Authority).init
        github.com/smallstep/certificates/authority/authority.go:426
github.com/smallstep/certificates/authority.New
        github.com/smallstep/certificates/authority/authority.go:149
github.com/smallstep/certificates/ca.(*CA).Init
        github.com/smallstep/certificates/ca/ca.go:200
github.com/smallstep/certificates/ca.New
        github.com/smallstep/certificates/ca/ca.go:162
github.com/smallstep/certificates/commands.appAction
        github.com/smallstep/certificates/commands/app.go:254
github.com/urfave/cli.HandleAction
        github.com/urfave/cli@v1.22.15/app.go:524
github.com/urfave/cli.Command.Run
        github.com/urfave/cli@v1.22.15/command.go:175
main.main.func3
        ./main.go:202
github.com/urfave/cli.HandleAction
        github.com/urfave/cli@v1.22.15/app.go:524
github.com/urfave/cli.(*App).Run
        github.com/urfave/cli@v1.22.15/app.go:286
main.main
        ./main.go:205
runtime.main
        runtime/proc.go:272
runtime.goexit
        runtime/asm_amd64.s:1700

The KMS command works:

GODEBUG=http2debug=2 step kms key --kms cloudkms <key>
2024/09/18 12:03:25 http2: Framer 0x.........: wrote SETTINGS len=0
2024/09/18 12:03:25 http2: Framer 0x.........: read SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=100, INITIAL_WINDOW_SIZE=1048576, MAX_HEADER_LIST_SIZE=65536
2024/09/18 12:03:25 http2: Framer 0x.........: read WINDOW_UPDATE len=4 (conn) incr=983041
2024/09/18 12:03:25 http2: Framer 0x.........: wrote SETTINGS flags=ACK len=0
2024/09/18 12:03:25 http2: Framer 0x.........: read SETTINGS flags=ACK len=0
2024/09/18 12:03:25 http2: Framer 0x.........: wrote HEADERS flags=END_HEADERS stream=1 len=1038
2024/09/18 12:03:25 http2: Framer 0x.........: wrote DATA flags=END_STREAM stream=1 len=158 data=".........................."
2024/09/18 12:03:26 http2: Framer 0x.........: read HEADERS flags=END_HEADERS stream=1 len=126
2024/09/18 12:03:26 http2: decoded hpack field header field ":status" = "200"
2024/09/18 12:03:26 http2: decoded hpack field header field "content-type" = "application/grpc"
2024/09/18 12:03:26 http2: decoded hpack field header field "grpc-accept-encoding" = "identity, deflate, gzip"
2024/09/18 12:03:26 http2: decoded hpack field header field "content-disposition" = "attachment"
2024/09/18 12:03:26 http2: decoded hpack field header field "date" = "Wed, 18 Sep 2024 10:03:25 GMT"
2024/09/18 12:03:26 http2: decoded hpack field header field "alt-svc" = "h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000"
2024/09/18 12:03:26 http2: Framer 0x.........: read DATA stream=1 len=351 data="\x........-----BEGIN PUBLIC KEY-----......-----END PUBLIC KEY-----........." (95 bytes omitted)
2024/09/18 12:03:26 http2: Framer 0x.........: read HEADERS flags=END_STREAM|END_HEADERS stream=1 len=113
2024/09/18 12:03:26 http2: decoded hpack field header field "grpc-status" = "0"
2024/09/18 12:03:26 http2: decoded hpack field header field "endpoint-load-metrics-bin" = "..........."
2024/09/18 12:03:26 http2: decoded hpack field header field "grpc-server-stats-bin" = "..........."
2024/09/18 12:03:26 http2: decoded hpack field header field "pc-high-bwd-bin" = "................"
2024/09/18 12:03:26 http2: Framer 0x.........: read PING len=8 ping="\x00\x00\x00\x00\x00\x00\x021"
2024/09/18 12:03:26 http2: Framer 0x.........: wrote WINDOW_UPDATE len=4 (conn) incr=351
2024/09/18 12:03:26 http2: Framer 0x.........: wrote PING len=8 ping="\x.........."
2024/09/18 12:03:26 http2: Framer 0x.........: wrote PING flags=ACK len=8 ping="\x........"
2024/09/18 12:03:26 http2: Framer 0x.........: wrote GOAWAY len=33 LastStreamID=1 ErrCode=NO_ERROR Debug="client transport shutdown"
-----BEGIN PUBLIC KEY-----
.............
.............
-----END PUBLIC KEY-----
GBBx commented 1 month ago

The issue persists with Step CA 0.27.5.

GBBx commented 3 weeks ago

This is still an issue on 0.28.0. @hslatman , anything else I could check?

GBBx commented 2 weeks ago

Hi, I realized I have another problem. I can sign certificates with kms if the step-kms-plugin is 0.11.4 or 0.11.5 but it fails if I update to 0.11.6 (and change nothing else).

Here's how I do it:

step certificate sign \
  --kms "cloudkms:" \
  --not-after 1h \
  ${CN}.csr \
  /etc/step-ca/certs/intermediate_ca.crt \
  "projects/${PROJECT}/locations/${LOCATION}/keyRings/${KEYRING}/cryptoKeys/${INTERMEDIATE_KEY}/cryptoKeyVersions/1" > ${CN}.crt

Error message:

Error: open projects/... cloudKMS GetPublicKey failed: context deadline exceeded

Proxy logs:

...
1730993133.287      0 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.186.138:443 - HIER_NONE/- text/html
1730993133.288      0 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.106:443 - HIER_NONE/- text/html
1730993133.288      0 X.X.X.X TCP_DENIED/403 3874 CONNECT 216.58.206.74:443 - HIER_NONE/- text/html
1730993133.289      0 X.X.X.X TCP_DENIED/403 3877 CONNECT 216.58.212.170:443 - HIER_NONE/- text/html
1730993133.289      0 X.X.X.X TCP_DENIED/403 3877 CONNECT 142.250.74.202:443 - HIER_NONE/- text/html
1730993133.290      0 X.X.X.X TCP_DENIED/403 3877 CONNECT 142.250.185.74:443 - HIER_NONE/- text/html
1730993133.291      0 X.X.X.X TCP_DENIED/403 3877 CONNECT 142.250.186.74:443 - HIER_NONE/- text/html
1730993133.291      0 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.138:443 - HIER_NONE/- text/html
1730993133.292      0 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.170:443 - HIER_NONE/- text/html
1730993133.292      0 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.202:443 - HIER_NONE/- text/html
1730993133.293      0 X.X.X.X TCP_DENIED/403 3880 CONNECT 142.250.185.234:443 - HIER_NONE/- text/html
1730993133.293      0 X.X.X.X TCP_DENIED/403 3927 CONNECT [2a00:1450:4001:828::200a]:443 - HIER_NONE/- text/html
1730993133.294      0 X.X.X.X TCP_DENIED/403 3927 CONNECT [2a00:1450:4001:810::200a]:443 - HIER_NONE/- text/html
1730993133.294      0 X.X.X.X TCP_DENIED/403 3927 CONNECT [2a00:1450:4001:811::200a]:443 - HIER_NONE/- text/html
1730993133.295      0 X.X.X.X TCP_DENIED/403 3927 CONNECT [2a00:1450:4001:80e::200a]:443 - HIER_NONE/- text/html
...

These addresses correspond to nslookup cloudkms.googleapis.com:

nslookup cloudkms.googleapis.com
Server:         ***
Address:        ***

Non-authoritative answer:
Name:   cloudkms.googleapis.com
Address: 216.58.206.42
Name:   cloudkms.googleapis.com
Address: 142.250.185.106
Name:   cloudkms.googleapis.com
Address: 142.250.185.170
Name:   cloudkms.googleapis.com
Address: 142.250.185.202
Name:   cloudkms.googleapis.com
Address: 142.250.185.234
Name:   cloudkms.googleapis.com
Address: 142.250.186.170
Name:   cloudkms.googleapis.com
Address: 172.217.18.106
Name:   cloudkms.googleapis.com
Address: 142.250.181.234
Name:   cloudkms.googleapis.com
Address: 142.250.186.42
Name:   cloudkms.googleapis.com
Address: 142.250.184.202
Name:   cloudkms.googleapis.com
Address: 142.250.186.138
Name:   cloudkms.googleapis.com
Address: 216.58.212.138
Name:   cloudkms.googleapis.com
Address: 172.217.18.10
Name:   cloudkms.googleapis.com
Address: 142.250.186.106
Name:   cloudkms.googleapis.com
Address: 172.217.16.202
Name:   cloudkms.googleapis.com
Address: 142.250.184.234
Name:   cloudkms.googleapis.com
Address: 2a00:1450:4001:810::200a
Name:   cloudkms.googleapis.com
Address: 2a00:1450:4001:811::200a
Name:   cloudkms.googleapis.com
Address: 2a00:1450:4001:812::200a
Name:   cloudkms.googleapis.com
Address: 2a00:1450:4001:813::200a

Maybe a shared dependency was updated in step-ca between 0.27.2 and 0.27.4 and in step-kms-plugin between 0.11.5 and 0.11.6?