Teleport 6.2 Test Plan

russjones commented 3 years ago

Manual Testing Plan

Below are the items that should be manually tested with each release of Teleport. These tests should be run on both a fresh install of the version to be released as well as an upgrade of the previous version of Teleport.

[x] Adding nodes to a cluster @webvictim @tcsc
- [x] Adding Nodes via Valid Static Token
- [x] Adding Nodes via Valid Short-lived Tokens
- [x] Adding Nodes via Invalid Token Fails
- [x] Adding Nodes via Expired Token Fails
- [x] Adding Nodes with No Token Fails
- [x] Adding Nodes with Invalid Roles Fails
- [x] Revoking Node Invitation
[x] Trusted Clusters @nklaassen @awly
- [x] Adding Trusted Cluster Valid Static Token
- [x] Adding Trusted Cluster Valid Short-lived Token
- [x] Adding Trusted Cluster Invalid Token
- [x] Removing Trusted Cluster
[x] RBAC @Joerger @andrejtokarcik

Make sure that invalid and valid attempts are reflected in audit log.
- [x] Successfully connect to node with correct role
- [x] Unsuccessfully connect to a node in a role restricting access by label
- [x] Unsuccessfully connect to a node in a role restricting access by invalid SSH login
- [x] Allow/deny role option: SSH agent forwarding
- [x] Allow/deny role option: Port forwarding
[x] Users @fspmarshall @quinqu

With every user combination, try to log in and sign up with an invalid second factor and an invalid password to see how the system reacts.
- [x] Adding Users Password Only
- [x] Adding Users OTP
- [x] Adding Users U2F
- [x] Managing MFA devices
- [x] Add an OTP device with tsh mfa add
- [x] Add a U2F device with tsh mfa add
- [x] List MFA devices with tsh mfa ls
- [x] Remove an OTP device with tsh mfa rm
- [x] Remove a U2F device with tsh mfa rm
- [x] Attempt removing the last MFA device on the user
  - [x] with second_factor: on in auth_service, should fail
  - [x] with second_factor: optional in auth_service, should succeed
- [x] Login Password Only
- [x] Login with MFA
- [x] Add 2 OTP and 2 U2F devices with tsh mfa add
- [x] Login via OTP
- [x] Login via U2F
- [x] Login OIDC
- [x] Login SAML
- [x] Login GitHub
- [x] Deleting Users
[x] Audit Log @r0mant @xacrimon
- [x] Failed login attempts are recorded
- [x] Interactive sessions have the correct Server ID
- [x] Server ID is the ID of the node in regular mode
- [x] Server ID is randomly generated for proxy node
- [x] Exec commands are recorded
- [x] scp commands are recorded
- [x] Subsystem results are recorded
[x] Interact with a cluster using tsh @webvictim @tcsc

These commands should ideally be tested in both recording and non-recording modes, as they are implemented in different ways.
- [x] tsh ssh <regular-node>
- [x] tsh ssh <node-remote-cluster>
- [x] tsh ssh -A <regular-node>
- [x] tsh ssh -A <node-remote-cluster>
- [x] tsh ssh <regular-node> ls
- [x] tsh ssh <node-remote-cluster> ls
- [x] tsh join <regular-node>
- [x] tsh join <node-remote-cluster>
- [x] tsh play <regular-node>
- [x] tsh play <node-remote-cluster>
- [x] tsh scp <regular-node>
- [x] tsh scp <node-remote-cluster>
- [x] tsh ssh -L <regular-node>
- [x] tsh ssh -L <node-remote-cluster>
- [x] tsh ls
- [x] tsh clusters
[x] Interact with a cluster using ssh @nklaassen @awly

Make sure to test both recording and regular proxy modes.
- [x] ssh <regular-node>
- [x] ssh <node-remote-cluster>
- [x] ssh -A <regular-node>
- [x] ssh -A <node-remote-cluster>
- [x] ssh <regular-node> ls
- [x] ssh <node-remote-cluster> ls
- [x] scp <regular-node>
- [x] scp <node-remote-cluster>
- [x] ssh -L <regular-node>
- [x] ssh -L <node-remote-cluster>
[x] Interact with a cluster using the Web UI @Joerger @andrejtokarcik
- [x] Connect to a Teleport node
- [x] Connect to an OpenSSH node
- [x] Check agent forwarding is correct based on role and proxy mode.

Combinations @fspmarshall @quinqu

For some manual testing, many combinations need to be tested. For example, for interactive sessions the 12 combinations are below.

[x] Connect to an OpenSSH node in a local cluster using OpenSSH.
[x] Connect to an OpenSSH node in a local cluster using Teleport.
[x] Connect to an OpenSSH node in a local cluster using the Web UI.
[x] Connect to a Teleport node in a local cluster using OpenSSH.
[x] Connect to a Teleport node in a local cluster using Teleport.
[x] Connect to a Teleport node in a local cluster using the Web UI.
[x] Connect to an OpenSSH node in a remote cluster using OpenSSH.
[x] Connect to an OpenSSH node in a remote cluster using Teleport.
[x] Connect to an OpenSSH node in a remote cluster using the Web UI.
[x] Connect to a Teleport node in a remote cluster using OpenSSH.
[x] Connect to a Teleport node in a remote cluster using Teleport.
[x] Connect to a Teleport node in a remote cluster using the Web UI.
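
The 12 cases above are the product of two node types, two cluster locations, and three clients (2 x 2 x 3). As a rough sketch, the same checklist can be generated rather than typed by hand, which makes it harder to skip a combination. The function name below is made up for illustration and is not part of any Teleport tooling.

```python
from itertools import product

# Axis values taken from the checklist above.
NODE_TYPES = ["OpenSSH", "Teleport"]
CLUSTERS = ["local", "remote"]
CLIENTS = ["OpenSSH", "Teleport", "the Web UI"]


def interactive_session_combinations():
    """Return every (cluster, node type, client) test case as a sentence."""
    cases = []
    for cluster, node, client in product(CLUSTERS, NODE_TYPES, CLIENTS):
        article = "an" if node == "OpenSSH" else "a"
        cases.append(
            f"Connect to {article} {node} node in a {cluster} cluster using {client}."
        )
    return cases


if __name__ == "__main__":
    # Print an unchecked checklist, one line per combination.
    for case in interactive_session_combinations():
        print(f"[ ] {case}")
```

Adding a fourth client or a third node type only means extending one of the lists.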

Teleport with multiple Kubernetes clusters @xacrimon @webvictim

Note: you can use GKE or EKS or minikube to run Kubernetes clusters. Minikube is the only caveat - it's not reachable publicly so don't run a proxy there.

[x] Deploy combo auth/proxy/kubernetes_service outside of a Kubernetes cluster, using a kubeconfig
- [x] Login with tsh login, check that tsh kube ls has your cluster
- [x] Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh
- [x] Verify that the audit log recorded the above request and session
[ ] Deploy combo auth/proxy/kubernetes_service inside of a Kubernetes cluster
- [x] Login with tsh login, check that tsh kube ls has your cluster
- [x] Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh
- [ ] Verify that the audit log recorded the above request and session
[ ] Deploy combo auth/proxy_service outside of the Kubernetes cluster and kubernetes_service inside of a Kubernetes cluster, connected over a reverse tunnel
- [x] Login with tsh login, check that tsh kube ls has your cluster
- [x] Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh
- [ ] Verify that the audit log recorded the above request and session
[ ] Deploy a second kubernetes_service inside of another Kubernetes cluster, connected over a reverse tunnel
- [x] Login with tsh login, check that tsh kube ls has both clusters
- [x] Switch to a second cluster using tsh kube login
- [x] Run kubectl get nodes, kubectl exec -it $SOME_POD -- sh on the new cluster
- [ ] Verify that the audit log recorded the above request and session
[x] Deploy combo auth/proxy/kubernetes_service outside of a Kubernetes cluster, using a kubeconfig with multiple clusters in it
- [x] Login with tsh login, check that tsh kube ls has all clusters
[x] Test Kubernetes screen in the web UI (tab is located on left side nav on dashboard):
- [x] Verify that all kubes registered are shown with correct name and labels
- [x] Verify that clicking on a row's connect button renders a dialogue with manual instructions, with the Step 2 login value matching the row's name column
- [x] Verify searching for name or labels in the search bar works
- [x] Verify you can sort by name column

Helm charts

[ ] Deploy teleport-cluster Helm chart to an EKS cluster in HA mode by following the AWS guide
- [ ] Verify that web UI works with no TLS warnings and you can create a user with tctl users add
- [ ] Log in with tsh login
- [ ] Display Kubernetes clusters with tsh kube ls, log in with tsh kube login
- [ ] Run kubectl get nodes and kubectl -n kube-system get pods
[ ] Deploy teleport-cluster Helm chart to a GKE cluster in HA mode by following the GKE guide
- [ ] Verify that web UI works with no TLS warnings and you can create a user with tctl users add
- [ ] Log in with tsh login
- [ ] Display Kubernetes clusters with tsh kube ls, log in with tsh kube login
- [ ] Run kubectl get nodes and kubectl -n kube-system get pods
[ ] Deploy teleport-kube-agent Helm chart to an EKS cluster following instructions in the README
- [ ] Verify that the remote Kubernetes cluster appears in tsh kube ls, log in with tsh kube login
- [ ] Run kubectl get nodes and kubectl get pods, verify no errors
[ ] Deploy teleport-kube-agent Helm chart to a GKE cluster following instructions in the README
- [ ] Verify that the remote Kubernetes cluster appears in tsh kube ls, log in with tsh kube login
- [ ] Run kubectl get nodes and kubectl get pods, verify no errors

Migrations @tcsc @nklaassen

[x] Migrate trusted clusters from 6.1.0 to 6.2.0
- [x] Migrate auth server on main cluster, then rest of the servers on main cluster; SSH should work for both main and old clusters
- [x] Migrate auth server on remote cluster, then rest of the remote cluster; SSH should work

Command Templates

When interacting with a cluster, the following command templates are useful:

OpenSSH

# when connecting to the recording proxy, `-o 'ForwardAgent yes'` is required.
ssh -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %r@proxy.example.com -s proxy:%h:%p" \
  node.example.com

# the above command only forwards the agent to the proxy, to forward the agent
# to the target node, `-o 'ForwardAgent yes'` needs to be passed twice.
ssh -o "ForwardAgent yes" \
  -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %r@proxy.example.com -s proxy:%h:%p" \
  node.example.com

# when connecting to a remote cluster using OpenSSH, the subsystem request is
# updated with the name of the remote cluster.
ssh -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %r@proxy.example.com -s proxy:%h:%p@foo.com" \
  node.foo.com
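
The three OpenSSH templates differ only in whether a second `ForwardAgent yes` is passed and whether `@<cluster>` is appended to the subsystem request. A minimal sketch of that structure follows; the helper name and its parameters are hypothetical, and it only assembles the argument list without running anything.

```python
import shlex


def openssh_via_proxy(node, proxy, cluster=None, forward_to_node=False, proxy_port=3023):
    """Build the ssh argv for reaching `node` through a Teleport proxy.

    Mirrors the command templates above; it runs nothing.
    """
    subsystem = "proxy:%h:%p"
    if cluster:
        # A remote cluster is selected by appending @<cluster> to the subsystem request.
        subsystem += f"@{cluster}"
    proxy_command = (
        f"ssh -o 'ForwardAgent yes' -p {proxy_port} %r@{proxy} -s {subsystem}"
    )
    argv = ["ssh"]
    if forward_to_node:
        # Second ForwardAgent option: forwards the agent on to the target node too.
        argv += ["-o", "ForwardAgent yes"]
    argv += ["-o", f"ProxyCommand {proxy_command}", node]
    return argv


if __name__ == "__main__":
    argv = openssh_via_proxy("node.foo.com", "proxy.example.com", cluster="foo.com")
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```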

Teleport

# when connecting to an OpenSSH node, remember `-p 22` needs to be passed.
tsh --proxy=proxy.example.com --user=<username> --insecure ssh -p 22 node.example.com

# an agent can be forwarded to the target node with `-A`
tsh --proxy=proxy.example.com --user=<username> --insecure ssh -A -p 22 node.example.com

# the --cluster flag is used to connect to a node in a remote cluster.
tsh --proxy=proxy.example.com --user=<username> --insecure ssh --cluster=foo.com -p 22 node.foo.com

Teleport Plugins @awly @Joerger

[x] Test receiving a message via Teleport Slackbot
[x] Test receiving a new Jira Ticket via Teleport Jira

WEB UI @kimlisa @alex-kovoy

Main

For main, test with admin role that has access to all resources.

Top Nav

[x] Verify that cluster selector displays all (root + leaf) clusters
[x] Verify that user name is displayed
[x] Verify that user menu shows logout, help&support, and account settings (for local users)

Side Nav

[x] Verify that each item has an icon
[x] Verify that Collapse/Expand works and collapsed has icon >, and expand has icon v
[x] Verify that it automatically expands and highlights the item on page refresh

Servers aka Nodes

[x] Verify that "Servers" table shows all joined nodes
[x] Verify that "Connect" button shows a list of available logins
[x] Verify that "Hostname", "Address" and "Labels" columns show the current values
[x] Verify that "Search" by hostname, address, labels works
[x] Verify that terminal opens when clicking on one of the available logins
[x] Verify that clicking on Add Server button renders dialogue set to Automatically view
- [x] Verify clicking on Regenerate Script regenerates token value in the bash command
- [x] Verify using the bash command successfully adds the server (refresh server list)
- [x] Verify that clicking on Manually tab renders manual steps
- [x] Verify that clicking back to Automatically tab renders bash command

Applications

[x] Verify that clicking on Add Application button renders dialogue
- [x] Verify input validation (prevent empty value and invalid url)
- [x] Verify after input and clicking on Generate Script, bash command is rendered
- [x] Verify clicking on Regenerate button regenerates token value in bash command

Databases

[x] Verify that clicking on Add Database button renders dialogue for manual instructions:
- [x] Verify selecting different options on Step 4 changes Step 5 commands

Active Sessions

[x] Verify that "empty" state is handled
[x] Verify that it displays the session when session is active
[x] Verify that "Description", "Session ID", "Users", "Nodes" and "Duration" columns show correct values
[x] Verify that "OPTIONS" button allows to join a session

Audit log

[x] Verify that time range button is shown and works
[x] Verify that clicking on Session Ended event icon, takes user to session player
[x] Verify event detail dialogue renders when clicking on events details button
[x] Verify searching by type, description, created works

Users

[x] Verify that users are shown
[x] Verify that creating a new user works
[x] Verify that editing user roles works
[x] Verify that removing a user works
[x] Verify resetting a user's password works
[x] Verify search by username, roles, and type works

Auth Connectors

[x] Verify that creating OIDC/SAML/GITHUB connectors works
[x] Verify that editing OIDC/SAML/GITHUB connectors works
[x] Verify that error is shown when saving an invalid YAML
[x] Verify that correct hint text is shown on the right side
[x] Verify that encrypted SAML assertions work with an identity provider that supports it (Azure).

Auth Connectors Card Icons

[x] Verify that GITHUB card has github icon
[x] Verify that SAML card has SAML icon
[x] Verify that OIDC card has OIDC icon
[x] Verify when there are no connectors, empty state renders

Roles

[x] Verify that roles are shown
[x] Verify that "Create New Role" dialog works
[x] Verify that deleting and editing works
[x] Verify that error is shown when saving an invalid YAML
[x] Verify that correct hint text is shown on the right side

Managed Clusters

[x] Verify that it displays a list of clusters (root + leaf)
[x] Verify that every menu item works: nodes, apps, audit events, session recordings.

Help & Support

[x] Verify that all URLs work and are correct (no 404)

Access Requests

Creating Access Requests

Create a role with limited permissions (defined below as allow-roles). This role allows you to see the Role screen and ssh into all nodes.
Create another role with limited permissions (defined below as allow-users). This role's session expires in 4 minutes; it allows you to see the Users screen and denies access to all nodes.
Create another role with no permissions other than being able to create requests (defined below as default)
Create a user with role default assigned

Create a few requests under this user to test pending/approved/denied state.

kind: role
metadata:
  name: allow-roles
spec:
  allow:
    logins:
    - root
    node_labels:
      '*': '*'
    rules:
    - resources:
      - role
      verbs:
      - list
      - read
  options:
    max_session_ttl: 8h0m0s
version: v3

kind: role
metadata:
  name: allow-users
spec:
  allow:
    rules:
    - resources:
      - user
      verbs:
      - list
      - read
  deny:
    node_labels:
      '*': '*'
  options:
    max_session_ttl: 4m0s
version: v3

kind: role
metadata:
  name: default
spec:
  allow:
    request:
      roles:
      - allow-roles
      - allow-users
      suggested_reviewers:
      - random-user-1
      - random-user-2
  options:
    max_session_ttl: 8h0m0s
version: v3

[x] Verify that creating a new request works
[x] Verify that under requestable roles, only allow-roles and allow-users are listed
[x] Verify input validation requires at least one role to be selected
[x] Verify you can select/input/modify reviewers
[x] Verify after creating, requests are listed in pending states
[x] Verify you can't review own requests

Viewing & Approving/Denying Requests

Create a user with the role reviewer that allows you to review all requests, and delete them.

kind: role
version: v3
metadata:
  name: reviewer
spec:
  allow:
    review_requests:
      roles: ['*']

[x] Verify you can view access request from request list
[x] Verify there is a list of the reviewers you selected (empty list if none were selected AND none were defined in roles)
[x] Verify threshold name is there (it will be default if thresholds weren't defined in role, or blank if not named)
[x] Verify you can approve a request with message, and immediately see updated state with your review stamp (green checkmark) and message box
[x] Verify you can deny a request, and immediately see updated state with your review stamp (red cross)
[x] Verify that deleting the denied request removes it from the list

Assuming Approved Requests

[x] Verify assume buttons are only present for approved request and for logged in user
[x] Verify that assuming allow-roles allows you to see roles screen and ssh into nodes
[x] Verify that after clicking on the assume button, it is disabled in both the list and in viewing
[x] After assuming allow-roles, verify that assuming allow-users allows you to see users screen, and denies access to nodes
- [x] Verify a switchback banner is rendered with roles assumed, and count down of when it expires
- [x] Verify switching back goes back to your default static role
- [x] Verify after re-assuming this role, the user is automatically logged out after the expiry is met (4 minutes)
[x] Verify that after logging out (or getting logged out automatically) and relogging in, permissions are reset to default, and requests that are not expired and are approved are assumable again

Access Request Waiting Room

Strategy Reason

Create the following role:

kind: role
metadata:
  name: restrict
spec:
  allow:
    request:
      roles:
      - <some other role to assign user after approval>
  options:
    max_session_ttl: 8h0m0s
    request_access: reason
    request_prompt: <some custom prompt to show in reason dialogue>
version: v3

[x] Verify after login, reason dialogue is rendered with prompt set to request_prompt setting
[x] Verify after clicking send request, pending dialogue renders
[x] Verify after approving a request, dashboard is rendered
[x] Verify the correct role was assigned

Strategy Always

With the previous role you created from Strategy Reason, change request_access to always:

[x] Verify after login, pending dialogue is rendered
[x] Verify after approving a request, dashboard is rendered
[x] Verify after denying a request, access denied dialogue is rendered

Strategy Optional

With the previous role you created from Strategy Reason, change request_access to optional:

[x] Verify after login, dashboard is rendered
[x] Verify a switchback banner is rendered with roles assumed, and count down of when it expires
- [x] Verify switchback button says Switch Back and clicking goes back to the login screen

Account

[x] Verify that Account screen is accessible from the user menu for local users.
[x] Verify that changing a local password works (OTP, U2F)

Terminal

[x] Verify that top nav has a user menu (Main and Logout)
[x] Verify that switching between tabs works on alt+[1...9]

Node List Tab

[x] Verify that Cluster selector works (URL should change too)
[x] Verify that Quick launcher input works
[x] Verify that Quick launcher input handles input errors
[x] Verify that "Connect" button shows a list of available logins
[x] Verify that "Hostname", "Address" and "Labels" columns show the current values
[x] Verify that "Search" by hostname, address, labels works
[x] Verify that new tab is created when starting a session

Session Tab

[x] Verify that session and browser tabs both show the title with login and node name
[x] Verify that terminal resize works
- Install midnight commander on the node you ssh into: $ sudo apt-get install mc
- Run the program: $ mc
- Resize the terminal to see if panels resize with it
[x] Verify that session tab shows/updates number of participants when a new user joins the session
[x] Verify that tab automatically closes on "$ exit" command
[ ] Verify that SCP Upload works
[x] Verify that SCP Upload handles invalid paths and network errors
[ ] Verify that SCP Download works
[x] Verify that SCP Download handles invalid paths and network errors

Session Player

[x] Verify that it can replay a session
[x] Verify that when playing, scroller auto scrolls to bottom most content
[x] Verify when resizing player to a small screen, scroller appears and is working
[x] Verify that error message is displayed (enter an invalid SID in the URL)

Invite Form

[x] Verify that input validates
[x] Verify that invite works with 2FA disabled
[x] Verify that invite works with OTP enabled
[x] Verify that invite works with U2F enabled
[x] Verify that error message is shown if an invite is expired/invalid

Login Form

[x] Verify that input validates
[x] Verify that login works with 2FA disabled
[x] Verify that login works with OTP enabled
[x] Verify that login works with U2F enabled
[x] Verify that login works for Github/SAML/OIDC
[x] Verify that account is locked after several unsuccessful attempts
[x] Verify that redirect to original URL works after successful login

Multi-factor Authentication (mfa)

Create/modify teleport.yaml and set the following authentication settings under auth_service

authentication:
  type: local
  second_factor: optional
  require_session_mfa: yes
  u2f:
    app_id: https://example.com:443
    facets:
    - https://example.com:443
    - https://example.com
    - example.com:443
    - example.com

MFA create, login, password reset

[x] Verify when creating a user, and setting password, required 2nd factor is totp (TODO: temporary hack, ideally want to allow user to select)
[x] Verify at login page, there is a mfa dropdown menu (none, u2f, otp), and can login with otp
[x] Verify at reset password page, there is the same dropdown to select your mfa, and can reset with otp

MFA require auth

Through the CLI, tsh login and register a u2f key with tsh mfa add (not supported in UI yet).

Using the same user as above:

[x] Verify logging in with registered u2f key works
[x] Verify connecting to a ssh node prompts you to tap your registered u2f key

RBAC

Create a role, with no allow.rules defined:

kind: role
metadata:
  name: test
spec:
  allow:
    app_labels:
      '*': '*'
    logins:
    - root
    node_labels:
      '*': '*'
  options:
    max_session_ttl: 8h0m0s
version: v3

[x] Verify that a user has access only to: "Servers", "Applications", "Databases", "Kubernetes", "Active Sessions", "Access Requests" and "Manage Clusters"
[x] Verify there is no Add Server button in Server view
[x] Verify there is no Add Application button in Applications view
[x] Verify only Nodes and Apps are listed under options button in Manage Clusters

Note: User has read/create access_request access to their own requests, despite resource settings

Add the following under spec.allow.rules to enable read access to the audit log:

  - resources:
    - event
    verbs:
    - list
[x] Verify that the Audit Log and Session Recordings are accessible
[x] Verify that playing a recorded session is denied

Add the following to enable read access to recorded sessions

  - resources:
    - session
    verbs:
    - read

[x] Verify that a user can re-play a session (session.end)

Add the following to enable read access to the roles

  - resources:
    - role
    verbs:
    - list
    - read

[x] Verify that a user can see the roles
[x] Verify that a user cannot reset password and create/delete/update a role

Add the following to enable read access to the auth connectors

  - resources:
    - auth_connector
    verbs:
    - list
    - read

[x] Verify that a user can see the list of auth connectors.
[x] Verify that a user cannot create/delete/update the connectors

Add the following to enable read access to users

  - resources:
    - user
    verbs:
    - list
    - read

[x] Verify that a user can access the "Users" screen
[x] Verify that a user cannot create/delete/update a user

Add the following to enable read access to trusted clusters

  - resources:
    - trusted_cluster
    verbs:
    - list
    - read

[x] Verify that a user can access the "Trust" screen
[x] Verify that a user cannot create/delete/update a trusted cluster.

Performance/Soak Test @xacrimon @fspmarshall

Using tsh bench tool, perform the soak tests and benchmark tests on the following configurations:

Cluster with 10K nodes in normal (non-IOT) node mode with ETCD
Cluster with 10K nodes in normal (non-IOT) mode with DynamoDB
Cluster with 1K IOT nodes with ETCD
Cluster with 1K IOT nodes with DynamoDB
Cluster with 500 trusted clusters with ETCD
Cluster with 500 trusted clusters with DynamoDB

Soak Tests

Run a 4-hour soak test with a mix of interactive/non-interactive sessions:

tsh bench --duration=4h user@teleport-monster-6757d7b487-x226b ls
tsh bench -i --duration=4h user@teleport-monster-6757d7b487-x226b ps uax

Observe Prometheus metrics for goroutines, open files, RAM, CPU, and timers, and make sure there are no leaks

[ ] Verify that prometheus metrics are accurate.

Breaking load tests

Load the system with tsh bench to capacity and publish the maximum number of concurrent sessions reached with interactive and non-interactive tsh bench loads.

Application Access @r0mant @smallinsky

[x] Run an application within local cluster.
- [x] Verify the debug application debug_app: true works.
- [x] Verify an application can be configured with command line flags.
- [x] Verify an application can be configured from file configuration.
- [x] Verify that applications are available at auto-generated addresses name.rootProxyPublicAddr as well as publicAddr.
[x] Run an application within a trusted cluster.
- [x] Verify that applications are available at auto-generated addresses name.rootProxyPublicAddr.
[x] Verify Audit Records.
- [x] app.session.start and app.session.chunk events are created in the Audit Log.
- [x] app.session.chunk points to a 5 minute session archive with multiple app.session.request events inside.
- [x] tsh play <chunk-id> can fetch and print a session chunk archive.
[x] Verify JWT using verify-jwt.go.
[x] Verify RBAC.
[x] Verify CLI access with tsh app login.
[x] Test Applications screen in the web UI (tab is located on left side nav on dashboard):
- [x] Verify that all apps registered are shown
- [x] Verify that clicking on the app icon takes you to another tab
- [x] Verify using the bash command produced from Add Application dialogue works (refresh app screen to see it registered)

Database Access @r0mant @smallinsky

[x] Connect to a database within a local cluster.
- [x] Self-hosted Postgres.
- [x] Self-hosted MySQL.
- [x] AWS Aurora Postgres.
- [x] AWS Aurora MySQL.
- [x] AWS Redshift.
- [x] GCP Cloud SQL Postgres.
[x] Connect to a database within a remote cluster via a trusted cluster.
- [x] Self-hosted Postgres.
- [x] Self-hosted MySQL.
- [x] AWS Aurora Postgres.
- [x] AWS Aurora MySQL.
- [x] AWS Redshift.
- [x] GCP Cloud SQL Postgres.
[x] Verify audit events.
- [x] db.session.start is emitted when you connect.
- [x] db.session.end is emitted when you disconnect.
- [x] db.session.query is emitted when you execute a SQL query.
[x] Verify RBAC.
- [x] tsh db ls shows only databases matching role's db_labels.
- [x] Can only connect as users from db_users.
- [x] (Postgres only) Can only connect to databases from db_names.
- [x] db.session.start is emitted when connection attempt is denied.
[x] Test Databases screen in the web UI (tab is located on left side nav on dashboard):
- [x] Verify that all dbs registered are shown with correct name, description, type, and labels
- [x] Verify that clicking on a row's connect button renders a dialogue with manual instructions, with the Step 2 login value matching the row's name column
- [x] Verify searching for all columns in the search bar works
- [x] Verify you can sort by all columns except labels

quinqu commented 3 years ago

When I add an OTP device with tsh mfa add and try to enter the code, Teleport says the code must be 6 digits long, which my input surely is. It still won't be accepted. Terminal output:

Choose device type [TOTP, U2F]: TOTP
Enter device name: tempdevice
Enter an OTP code from a *registered* device: 628304

Open your TOTP app and create a new manual entry with these fields:
  URL: <omitted> 
  Account name: <omitted>
  Secret key: <omitted>
  Issuer: <omitted> 
  Algorithm: SHA1
  Number of digits: 6
  Period: 30s

Once created, enter an OTP code generated by the app: 624072
TOTP code must be exactly 6 digits long, try again
Once created, enter an OTP code generated by the app: 624072
TOTP code must be exactly 6 digits long, try again
Once created, enter an OTP code generated by the app: 910046
TOTP code must be exactly 6 digits long, try again
Once created, enter an OTP code generated by the app: 426970
TOTP code must be exactly 6 digits long, try again
Once created, enter an OTP code generated by the app:

awly commented 3 years ago

@quinqu could you please file a bug for this and assign to me? It's likely I introduced the problem in 6.2

Joerger commented 3 years ago

Updating a user with tctl create -f user.yaml breaks the audit log and session recordings tabs in the Web UI - #6935

tcsc commented 3 years ago

@webvictim - I've added a test matrix for the tsh tests here so we don't stomp on each other. Or on ourselves. Feel free to edit as necessary.

New	New (No Rec)	Upgraded	Upgraded (No Rec)	Command
PASS	PASS	PASS	PASS	tsh ssh <regular-node>
PASS	PASS	PASS	PASS	tsh ssh <node-remote-cluster>
PASS	PASS	PASS	PASS	tsh ssh -A <regular-node>
PASS	PASS	PASS	PASS	tsh ssh -A <node-remote-cluster>
PASS	PASS	PASS	PASS	tsh ssh <regular-node> ls
PASS	PASS	PASS	PASS	tsh ssh <node-remote-cluster> ls
PASS	PASS	PASS	PASS	tsh join <regular-node>
PASS	PASS	PASS	PASS	tsh join <node-remote-cluster>
PASS	*PASS	PASS	*PASS	tsh play <regular-node>
PASS	*PASS	PASS	*PASS	tsh play <node-remote-cluster>
PASS	PASS	PASS	PASS	tsh scp <regular-node>
PASS	PASS	PASS	PASS	tsh scp <node-remote-cluster>
PASS	PASS	PASS	PASS	tsh ssh -L <regular-node>
PASS	PASS	PASS	PASS	tsh ssh -L <node-remote-cluster>
PASS	PASS	PASS	PASS	tsh ls
PASS	PASS	PASS	PASS	tsh clusters

* = failed with ERROR: 0 not found, which I assume is the correct behaviour when recording is disabled

tcsc commented 3 years ago

Encountered #6938 while testing: Panic when using tctl with remote auth server

kimlisa commented 3 years ago

mfa related bug, where scp upload/download does not work in the web ui: https://github.com/gravitational/teleport/issues/6939

r0mant commented 3 years ago

@Joerger @xacrimon Seeing https://github.com/gravitational/teleport/issues/6935 as well which Brian reported above.

@xacrimon Looks like this file (dynamic.go) was a part of your RFD19 implementation, could this have caused it? Just need to add user.updated event to the switch probably.
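
For readers unfamiliar with this failure mode, here is a rough illustration of how a type switch that is missing one case can break a listing. It is a Python stand-in, not the code in dynamic.go; the class names, the registry, and the `user.create` key are invented for the example, and `user.updated` is simply the name used in the comment above.

```python
from dataclasses import dataclass


@dataclass
class UserCreate:
    fields: dict


@dataclass
class UserUpdate:
    fields: dict


# Hypothetical registry standing in for the switch statement mentioned above.
# Note that "user.updated" is deliberately missing.
EVENT_TYPES = {
    "user.create": UserCreate,
}


def decode_event(raw, registry=EVENT_TYPES):
    """Turn a raw event dict into a typed event, failing on unregistered types."""
    event_type = raw.get("event")
    try:
        constructor = registry[event_type]
    except KeyError:
        raise ValueError(f"unknown event type: {event_type!r}") from None
    return constructor(fields=raw)


if __name__ == "__main__":
    try:
        decode_event({"event": "user.updated", "user": "alice"})
    except ValueError as err:
        # One undecodable event is enough to fail a whole page of results.
        print(err)

    # Registering the missing type is the kind of one-line fix suggested above.
    fixed = dict(EVENT_TYPES)
    fixed["user.updated"] = UserUpdate
    print(decode_event({"event": "user.updated", "user": "alice"}, fixed))
```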

xacrimon commented 3 years ago

@r0mant Resolved in #6949 and #6950 backport to v6.

fspmarshall commented 3 years ago

Changes introduced in #6731 break compatibility with older 6.X instances due to reliance on new GRPC methods (e.g. attempting to view audit events from UI of a 6.2 proxy results in unknown method GetEvents for service proto.AuthService error when dealing with a 6.1 auth server).

Teleport should fall back to using the old event API if the new one is not available.

cc: @xacrimon @kimlisa

xacrimon commented 3 years ago

@fspmarshall So this is a bit of an issue. The old events API does not support pagination but the IAuditLog interface expects it. Should we just ignore the new parameters introduced in RFD 19 and pretend pagination doesn't exist on fallback?

kimlisa commented 3 years ago

ui switchback bug (i am fixing): https://github.com/gravitational/teleport/issues/6960
@xacrimon related to #6935, unknown event bug: https://github.com/gravitational/teleport/issues/6959

fspmarshall commented 3 years ago

> Should we just ignore the new parameters introduced in RFD 19 and pretend pagination doesn't exist on fallback?

@xacrimon Followed up in PR. Basically, I think we should pretend it doesn't exist when dealing with the first call (since that means we're getting the "first page", which is what the old API did), but we should return an error if startKey != "", since that means we're loading a subsequent page, which the old API can't do.
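
A minimal sketch of the fallback rule proposed above, written in Python for brevity rather than as the actual change in the PR. The client methods `get_events` and `get_events_legacy`, the exception class, and the fake client are all hypothetical stand-ins; only the `start_key` check mirrors the `startKey != ""` condition from the comment.

```python
class UnknownMethodError(Exception):
    """Stand-in for the 'unknown method' error an older auth server returns."""


class LegacyOnlyClient:
    """Fake client that behaves like a server without the new paginated method."""

    def __init__(self, events):
        self._events = list(events)

    def get_events(self, limit, start_key=""):
        raise UnknownMethodError("unknown method GetEvents")

    def get_events_legacy(self, limit):
        return self._events[:limit]


def search_events(client, limit, start_key=""):
    """Fetch one page of audit events, falling back to the unpaginated API.

    Returns (events, next_key); next_key is "" when there are no more pages.
    """
    try:
        return client.get_events(limit=limit, start_key=start_key)
    except UnknownMethodError:
        if start_key != "":
            # A non-empty start key asks for a later page, which the old API cannot serve.
            raise ValueError("pagination is not supported by this auth server")
        # First page: the old API returns the same thing, so ignore pagination.
        return client.get_events_legacy(limit=limit), ""


if __name__ == "__main__":
    client = LegacyOnlyClient(["session.start", "session.end", "user.login"])
    print(search_events(client, limit=2))
```

Returning an empty next key on fallback means a caller that loops until the key is empty stops after the first page instead of retrying a request the old server cannot answer.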

awly commented 3 years ago

@xacrimon @webvictim @fspmarshall @quinqu let me know if you're overloaded. Some other folks are done with their testing so I could re-distribute remaining tasks if needed.

quinqu commented 3 years ago

@awly i could use some help on the U2F second factor tests as i do not have a U2F device.

awly commented 3 years ago

@quinqu will do :+1:

awly commented 3 years ago

FYI everyone, if you find an issue while testing, please file a bug and put it into 6.2 milestone. That way I can track all the remaining work and questions.

xacrimon commented 3 years ago

I have previously assumed DynamoDB tests were running but they have not been. I still need to hook these up and run them before I can say everything is correct. I will make another comment but please do not cut before I confirm that everything is indeed working @awly. @russjones I've also merged the API compat PR. #6990 will need to be merged as well, I will ping for reviews when it is ready.

webvictim commented 3 years ago

Ran into some weird tsh logout behaviour, detailed in https://github.com/gravitational/teleport/issues/6992

Not sure if this is a blocker but I can't log out of all my clusters for some reason.

xacrimon commented 3 years ago

Okay. I have pinged reviews on #6990 and I sign off on everything working when it is merged. I’ve manually done some testing to make sure it works.

webvictim commented 3 years ago

Most Kubernetes tests are finished, just waiting on #6990 merge/backport (and rc.2 cut?) to verify the audit log entries:

awly commented 3 years ago

All issues are either resolved or not caused by 6.2. Marking the test plan as done.

russjones commented 3 years ago

From @fspmarshall

6.2 - etcd - IoT

tsh bench --duration=30m root@loadtest-665c98bfb5-72w58 ls
* Requests originated: 17920
* Requests failed: 258
* Last error: connection closed
Histogram
Percentile Response Duration
---------- -----------------
25         4867 ms
50         6943 ms
75         9583 ms
90         14951 ms
95         20959 ms
99         40799 ms
100        65439 ms

tsh bench --interactive --duration=30m root@loadtest-665c98bfb5-9wk2b ps aux
* Requests originated: 17905
* Requests failed: 253
* Last error: connection error: desc = "transport: authentication handshake failed: EOF"
Histogram
Percentile Response Duration
---------- -----------------
25         4923 ms
50         7079 ms
75         9727 ms
90         15015 ms
95         20783 ms
99         41951 ms
100        64927 ms

6.2 - etcd - non-IoT

tsh bench --duration=30m root@loadtest-665c98bfb5-qcf82 ls
* Requests originated: 17983
* Requests failed: 23
* Last error: connection error: desc = "transport: authentication handshake failed: EOF"
Histogram
Percentile Response Duration
---------- -----------------
25         4719 ms
50         6567 ms
75         8703 ms
90         11143 ms
95         13439 ms
99         21263 ms
100        49183 ms

tsh bench --interactive --duration=30m root@loadtest-665c98bfb5-zfsrb ps aux
* Requests originated: 17970
* Requests failed: 17
* Last error: connection error: desc = "transport: authentication handshake failed: EOF"
Histogram
Percentile Response Duration
---------- -----------------
25         4655 ms
50         6391 ms
75         8327 ms
90         10703 ms
95         13079 ms
99         21759 ms
100        59423 ms
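
To make the four runs easier to compare, the failure rate can be derived from the "Requests originated" and "Requests failed" counts above. A minimal sketch, where the benchRun type and failureRate helper are made up for illustration and are not part of tsh bench:

```go
package main

import "fmt"

// benchRun holds the two counters reported by each tsh bench run above.
// The type and its field names are hypothetical.
type benchRun struct {
	Name       string
	Originated int
	Failed     int
}

// failureRate returns failed requests as a percentage of originated requests.
func failureRate(r benchRun) float64 {
	if r.Originated == 0 {
		return 0
	}
	return float64(r.Failed) / float64(r.Originated) * 100
}

func main() {
	// Counts copied from the four runs listed above.
	runs := []benchRun{
		{Name: "etcd IoT, ls", Originated: 17920, Failed: 258},
		{Name: "etcd IoT, interactive ps aux", Originated: 17905, Failed: 253},
		{Name: "etcd non-IoT, ls", Originated: 17983, Failed: 23},
		{Name: "etcd non-IoT, interactive ps aux", Originated: 17970, Failed: 17},
	}
	for _, r := range runs {
		fmt.Printf("%-34s %.2f%% failed\n", r.Name, failureRate(r))
	}
}
```

By these counts roughly 1.4% of requests failed in the IoT runs versus about 0.1% in the non-IoT runs.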

gravitational / teleport

Teleport 6.2 Test Plan #6651
