gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

Teleport can exit on startup while DynamoDB tables are being created #6356

Open webvictim opened 3 years ago

webvictim commented 3 years ago

Description

What happened: When Teleport is configured to use DynamoDB for storage and creates the tables itself, it can exit with an error while the tables are still being created:

~ » kubectl logs deploy/teleport -f
INFO             Using license from /var/lib/license/license.pem expires at 2121-02-26 20:47:46.730444928 +0000 UTC,supports kubernetes,supports application access,supports database access. process/process.go:64
DEBU [SQLITE]    Connected to: file:/var/lib/teleport/proc/sqlite.db?_busy_timeout=10000&_sync=OFF, poll stream period: 1s lite/lite.go:172
DEBU [SQLITE]    Synchronous: 0, busy timeout: 10000 lite/lite.go:217
DEBU [PROC:1]    Adding service to supervisor. service:readyz.monitor service/supervisor.go:184
INFO [PROC:1]    Service diag is creating new listener on 0.0.0.0:3000. service/signals.go:213
INFO [DIAG:1]    Starting diagnostic service on 0.0.0.0:3000. service/service.go:2049
DEBU [PROC:1]    Adding service to supervisor. service:diagnostic.service service/supervisor.go:184
DEBU [PROC:1]    Adding service to supervisor. service:diagnostic.shutdown service/supervisor.go:184
DEBU [KEYGEN]    SSH cert authority is going to pre-compute 25 keys. native/native.go:99
DEBU [PROC:1]    Using dynamodb backend. service/service.go:3087
INFO [DYNAMODB]  Initializing backend. Table: "gus-helm-teleport-aws-backend", poll streams every 0s. dynamo/dynamodbbk.go:207
DEBU [DYNAMODB]  AWS session is created. dynamo/dynamodbbk.go:322
INFO [S3]        Setting up bucket "gus-helm-teleport-aws-bucket", sessions path "" in region "us-east-1". s3sessions/s3handler.go:142
DEBU [DYNAMODB]  Found latest event stream arn:aws:dynamodb:us-east-1:278576220453:table/gus-helm-teleport-aws-backend/stream/2021-04-08T14:21:00.744. dynamo/shards.go:80
INFO [S3]        Setup bucket "gus-helm-teleport-aws-bucket" completed. duration:39.813108ms s3sessions/s3handler.go:146
INFO [DYNAMODB]  Initializing event backend. dynamoevents/dynamoevents.go:180

ERROR REPORT:
Original Error: *awserr.requestError ValidationException: Cannot describe time to live while table is in CREATING state: Current table state is CREATING
    status code: 400, request id: C5LTMG3SKAGOPHT7CMRULVE15RVV4KQNSO5AEMVJF66Q9ASUAAJG
Stack Trace:
    /go/src/github.com/gravitational/teleport/lib/events/dynamoevents/dynamoevents.go:586 github.com/gravitational/teleport/lib/events/dynamoevents.(*Log).turnOnTimeToLive
    /go/src/github.com/gravitational/teleport/lib/events/dynamoevents/dynamoevents.go:229 github.com/gravitational/teleport/lib/events/dynamoevents.New
    /go/src/github.com/gravitational/teleport/lib/service/service.go:963 github.com/gravitational/teleport/lib/service.initExternalLog
    /go/src/github.com/gravitational/teleport/lib/service/service.go:1068 github.com/gravitational/teleport/lib/service.(*TeleportProcess).initAuthService
    /go/src/github.com/gravitational/teleport/lib/service/service.go:703 github.com/gravitational/teleport/lib/service.NewTeleport
    /go/src/github.com/gravitational/teleport/e/tool/teleport/process/process.go:67 github.com/gravitational/teleport/e/tool/teleport/process.NewTeleport
    /go/src/github.com/gravitational/teleport/lib/service/service.go:446 github.com/gravitational/teleport/lib/service.Run
    /go/src/github.com/gravitational/teleport/e/tool/teleport/main.go:19 main.main
    /opt/go/src/runtime/proc.go:204 runtime.main
    /opt/go/src/runtime/asm_amd64.s:1374 runtime.goexit
User Message: initialization failed
    ValidationException: Cannot describe time to live while table is in CREATING state: Current table state is CREATING
    status code: 400, request id: C5LTMG3SKAGOPHT7CMRULVE15RVV4KQNSO5AEMVJF66Q9ASUAAJG

What you expected to happen: The process should wait for the table to be created before attempting any operations on it. If there is an error, it should not cause the Teleport process to exit.

Reproduction Steps

Deploy Teleport into AWS using a config similar to the one below, with the DynamoDB tables and S3 bucket not created ahead of time:

teleport:
  log:
    severity: DEBUG
    output: stderr
  storage:
    type: dynamodb
    region: us-east-1
    table_name: gus-helm-teleport-aws-backend
    audit_events_uri: ['dynamodb://gus-helm-teleport-aws-events']
    audit_sessions_uri: s3://gus-helm-teleport-aws-bucket
auth_service:
  enabled: true
  cluster_name: eks.teleportdemo.net
  license_file: '/var/lib/license/license.pem'
kubernetes_service:
  enabled: true
  listen_addr: 0.0.0.0:3027
  labels:
    env: aws
proxy_service:
  public_addr: 'eks.teleportdemo.net:443'
  kube_listen_addr: 0.0.0.0:3026
  enabled: true
ssh_service:
  enabled: false

evanfreed commented 1 year ago

I don't have the log anymore, but I can confirm I saw this on 11.x versions. Fortunately I was running this in Kubernetes, so the pod restarted and carried on. Still, it's not ideal either way.

evanfreed commented 1 year ago

I didn't have debug logs on, but I can confirm this on Teleport Enterprise v11.2.3 git:api/v11.2.3-0-g73240bcab9 go1.19.5:

ERROR: initialization failed
    ValidationException: Cannot describe time to live while table is in CREATING state: Current table state is CREATING
    status code: 400, request id: XXXX