codex-storage / cs-codex-dist-tests

Distributed System Tests for Nim-Codex

Labeling Pods for Distributed and Continuous Tests #39

Closed veaceslavdoina closed 1 year ago

veaceslavdoina commented 1 year ago

Intro

At the moment, we add only basic labels to the Pods created during Dist-Tests and Continuous-Tests runs:

k describe pods -n ct-82476037-1e04-4d46-a61d-ce18c7bc0cbb

Name:             deploy-0-6b64679485-5zwxg
Namespace:        ct-82476037-1e04-4d46-a61d-ce18c7bc0cbb
Priority:         0
Service Account:  default
Node:             pool-4vcpu-8gb-f5fgj/10.110.0.3
Start Time:       Fri, 21 Jul 2023 12:22:56 +0300
Labels:           codex-test-node=dist-test-0
                  pod-template-hash=6b64679485

With only these labels, it is hard to configure log shipping for the Pods that belong to Dist-Tests or Continuous-Tests. We should label Pods in a way that lets us tell them apart.

Proposal

Every Pod created during a test run should carry the following labels:

| Criteria | Categories | Label example |
|---|---|---|
| Test type | dist-tests / continuous-tests | tests-type=dist-tests |
| Component | codex (nim-codex) / codex-contracts-eth / geth / prometheus | app=codex |
| Run | 20230721-085043 / 20230721-061018 | runid=20230721-085043 |

That way we will be able to:

  1. Ship and parse logs separately for Dist-Tests and Continuous-Tests
  2. Have a separate index in Kibana for all Pods for Dist-Tests and Continuous-Tests
  3. Have a suitable way to filter logs in Kibana by run, test type and component
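
As an illustration of the proposal, here is a minimal Python sketch that builds the three labels and turns them into a label selector string. The function names and the validation against the categories listed above are assumptions made for the example; they are not part of the test framework.

```python
# Hypothetical helpers: names and validation rules are illustrative only.
ALLOWED_TEST_TYPES = {"dist-tests", "continuous-tests"}
ALLOWED_APPS = {"codex", "codex-contracts-eth", "geth", "prometheus"}


def build_pod_labels(tests_type, app, runid):
    """Return the proposed label set for one Pod as a dict."""
    if tests_type not in ALLOWED_TEST_TYPES:
        raise ValueError(f"unknown test type: {tests_type}")
    if app not in ALLOWED_APPS:
        raise ValueError(f"unknown component: {app}")
    return {"tests-type": tests_type, "app": app, "runid": runid}


def to_selector(labels):
    """Render labels as an equality-based selector: key=value,key=value."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


labels = build_pod_labels("dist-tests", "codex", "20230721-085043")
selector = to_selector(labels)
```

The resulting string has the form accepted by `kubectl get pods -l <selector>`, so the same labels that drive log shipping can also be used to list the Pods of a single run.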

Reference

  1. Labels and Selectors
  2. Recommended Labels
  3. Using labels effectively
veaceslavdoina commented 1 year ago

@benbierens, it looks like I missed some important labels, and we should also keep the fields consistent across the Kibana indices.

| _STATUS.log | Labels |
|---|---|
| - | :green_circle: tests-type=dist-tests |
| - | :green_circle: app=codex |
| :green_circle: "category": "Tests.PeerDiscoveryTests" | :orange_circle: category=Tests.PeerDiscoveryTests |
| :green_circle: "codexid": "7efa91" | :orange_circle: codexid=7efa91 |
| :green_circle: "fixturename": "PeerDiscoveryTests" | :orange_circle: fixturename=PeerDiscoveryTests |
| :green_circle: "runid": "20230808-120324" | :green_circle: runid=20230808-120324 |
| :green_circle: "status": "Passed" | - |
| :green_circle: "testduration": "1 mins, 34 secs" | - |
| :green_circle: "testid": "9840c50" | :orange_circle: testid=9840c50 |
| :green_circle: "testname": "VariableNodesInPods[20]" | :orange_circle: testname=VariableNodesInPods[20] |
| :orange_circle: "gethid": "%gethid%" | :orange_circle: gethid=%gethid% |
| :orange_circle: "prometheusid": "%prometheusid%" | :orange_circle: prometheusid=%prometheusid% |
| :orange_circle: "codexcontractsethid": "%codexcontractsethid%" | :orange_circle: codexcontractsethid=%codexcontractsethid% |

To update

  1. Add category
  2. Add codexid
  3. Add fixturename
  4. Add testid
  5. Add testname
  6. Add gethid
  7. Add prometheusid
  8. Add codexcontractsethid
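
One thing to keep in mind when adding these: Kubernetes limits label values to at most 63 characters, starting and ending with an alphanumeric character, with only dashes, underscores and dots allowed in between. A value such as VariableNodesInPods[20] would therefore be rejected as-is. Below is a minimal Python sketch of one way to check and clean values first. The helper names and the replace-with-dash policy are assumptions for illustration, not what the test framework actually does.

```python
import re

# Kubernetes label value syntax: empty, or alphanumeric at both ends with
# '-', '_', '.' and alphanumerics in between; at most 63 characters.
_LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")
_MAX_LEN = 63


def is_valid_label_value(value):
    """Return True if value can be used as a Kubernetes label value."""
    return len(value) <= _MAX_LEN and _LABEL_VALUE.fullmatch(value) is not None


def sanitize_label_value(value):
    """Hypothetical policy: replace disallowed characters with '-', then trim."""
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "-", value)
    cleaned = cleaned[:_MAX_LEN]
    # The value must start and end with an alphanumeric character.
    return cleaned.strip("-_.")


raw = "VariableNodesInPods[20]"
safe = sanitize_label_value(raw)
```

With this policy the test name becomes `VariableNodesInPods-20`, while values such as `Tests.PeerDiscoveryTests` pass through unchanged. Note that a sanitized label no longer matches the raw testname in _STATUS.log character for character.
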
benbierens commented 1 year ago

All right, so, yes there's no way to add test-duration and status pass/fail when the container is created. We don't know those things at that time. Additionally, some of these fields are applicable only in a dist-test situation, and not in a continuous-test situation. For example: testid. In dist-tests, a Codex container is created for a specific test, so that makes sense. In continuous-tests, the containers are already there and all the tests use the same containers. For the fields that don't apply, I will set values that indicate they are not applicable in those situations.

veaceslavdoina commented 1 year ago

> All right, so, yes there's no way to add test-duration and status pass/fail when the container is created. We don't know those things at that time.

That is clear, and we don't need those in the labels. The table has "-" for them; the table itself is only there to show the consistency between _STATUS.log and Labels.

> For example: testid.

This is the GitHub short SHA, and if we run Continuous-Tests from GitHub Actions we will pass that value to the runner Pod.

benbierens commented 1 year ago

Oh, one more important thing. The 'codexid' on the pod labels will never match the 'codexid' in the STATUS.log files. This is because we fill in the codexid in the status log file from the version information that the Codex node provides via an API call. We obviously can't make that call before we start the container, and we have to set the codexid label on the pod before then. The label will hold a different id, probably the Docker image SHA.

veaceslavdoina commented 1 year ago

Yeah, got the idea

But as you implemented it, the value will be either latest or a sha (a custom one that we set). Both ways work, and yes, some searches will not be so useful, but we will still be able to relate the latest tag to the Codex app version:

  1. Find the Pod logs by runid
  2. Find that runid in _STATUS.log and read the corresponding codexid
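
The two-step lookup can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes that _STATUS.log holds one JSON object per line with the "runid" and "codexid" keys shown in the table earlier in the thread, and it represents Pods as plain dicts with "name" and "labels" keys. Both function names are hypothetical.

```python
import json


def pods_for_run(pods, runid):
    """Step 1: select the Pods whose runid label matches the given run."""
    return [pod["name"] for pod in pods
            if pod.get("labels", {}).get("runid") == runid]


def codexids_for_run(status_lines, runid):
    """Step 2: collect the codexid values logged in _STATUS.log for that run."""
    ids = set()
    for line in status_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)  # assumed format: one JSON object per line
        if entry.get("runid") == runid and entry.get("codexid"):
            ids.add(entry["codexid"])
    return ids


# Illustrative data built from the values quoted in this thread.
pods = [
    {"name": "deploy-0-6b64679485-5zwxg",
     "labels": {"app": "codex", "runid": "20230808-120324", "codexid": "latest"}},
    {"name": "other-pod",
     "labels": {"app": "codex", "runid": "20230721-085043", "codexid": "latest"}},
]
status_lines = [
    '{"runid": "20230808-120324", "codexid": "7efa91", "status": "Passed"}',
    '{"runid": "20230721-085043", "codexid": "aaaaaa", "status": "Passed"}',
]

run_pods = pods_for_run(pods, "20230808-120324")
run_codexids = codexids_for_run(status_lines, "20230808-120324")
```

Joining on runid sidesteps the mismatch between the codexid label (image tag or sha) and the codexid reported by the node, because runid is the one field that is identical on both sides.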