openshift-pipelines / pipeline-service

SaaS for Tekton Pipelines
Apache License 2.0

operator installs/owns config-logging now, enable zap like we were #971

Closed gabemontero closed 6 months ago

gabemontero commented 6 months ago

After running dev_setup.sh with this branch, here is the config map in question:

$ oc get cm config-logging -o yaml -n openshift-pipelines 
apiVersion: v1
data:
  loglevel.controller: info
  loglevel.webhook: info
  zap-logger-config: |
    {
      "level": "info",
      "development": false,
      "sampling": {
        "initial": 100,
        "thereafter": 100
      },
      "outputPaths": ["stdout"],
      "errorOutputPaths": ["stderr"],
      "encoding": "json",
      "encoderConfig": {
        "timeKey": "ts",
        "levelKey": "level",
        "nameKey": "logger",
        "callerKey": "caller",
        "messageKey": "msg",
        "stacktraceKey": "stacktrace",
        "lineEnding": "",
        "levelEncoder": "",
        "timeEncoder": "iso8601",
        "durationEncoder": "",
        "callerEncoder": ""
      }
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"loglevel.controller":"info","loglevel.webhook":"info","zap-logger-config":"{\n  \"level\": \"info\",\n  \"development\": false,\n  \"sampling\": {\n    \"initial\": 100,\n    \"thereafter\": 100\n  },\n  \"outputPaths\": [\"stdout\"],\n  \"errorOutputPaths\": [\"stderr\"],\n  \"encoding\": \"json\",\n  \"encoderConfig\": {\n    \"timeKey\": \"ts\",\n    \"levelKey\": \"level\",\n    \"nameKey\": \"logger\",\n    \"callerKey\": \"caller\",\n    \"messageKey\": \"msg\",\n    \"stacktraceKey\": \"stacktrace\",\n    \"lineEnding\": \"\",\n    \"levelEncoder\": \"\",\n    \"timeEncoder\": \"iso8601\",\n    \"durationEncoder\": \"\",\n    \"callerEncoder\": \"\"\n  }\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{"operator.tekton.dev/last-applied-hash":"af733ccdb35360db5d08c73ff7ce5e443612ad11a91dffd02aa629c0e5aa0b21"},"labels":{"app.kubernetes.io/instance":"default","app.kubernetes.io/part-of":"tekton-chains","operator.tekton.dev/operand-name":"tektoncd-chains"},"name":"config-logging","namespace":"openshift-pipelines","ownerReferences":[{"apiVersion":"operator.tekton.dev/v1alpha1","blockOwnerDeletion":true,"controller":true,"kind":"TektonInstallerSet","name":"chain-jlm5p","uid":"42b4c5c6-b3ad-4f93-a555-9b12dd9db4a3"}]}}
    operator.tekton.dev/last-applied-hash: af733ccdb35360db5d08c73ff7ce5e443612ad11a91dffd02aa629c0e5aa0b21
  creationTimestamp: "2024-03-17T17:25:18Z"
  labels:
    app.kubernetes.io/instance: default
    app.kubernetes.io/part-of: tekton-chains
    operator.tekton.dev/operand-name: tektoncd-chains
  name: config-logging
  namespace: openshift-pipelines
  ownerReferences:
  - apiVersion: operator.tekton.dev/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: TektonInstallerSet
    name: chain-jlm5p
    uid: 42b4c5c6-b3ad-4f93-a555-9b12dd9db4a3
  resourceVersion: "50876"
  uid: 0e2470f7-eb03-4641-a712-e000904246ab

which matches upstream, and matches the zap ADR-6 changes that @ramessesii2 originally made in https://github.com/openshift-pipelines/pipeline-service/pull/634

This should fix the GitOps race condition we are seeing in stage, and now prod, where the operator is in a battle to reconcile this config map against what we were trying to set here in this repo.
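
A quick way to confirm the operator owns the config map is its ownerReferences; the jsonpath below is just an illustrative sketch:

$ oc get cm config-logging -n openshift-pipelines \
    -o jsonpath='{range .metadata.ownerReferences[*]}{.kind}/{.name}{"\n"}{end}'
TektonInstallerSet/chain-jlm5p

If a TektonInstallerSet shows up here, the operator will keep reconciling the config map and stomp on anything this repo applies on top of it.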


enarha commented 6 months ago

I moved the configuration to the TektonConfig and fixed the triggers logs as well: https://github.com/openshift-pipelines/pipeline-service/pull/972 . My PR was ready a few hours ago, but clusterbot gave me trouble, so it took a while to verify. BTW, without the configuration in the TektonConfig, the triggers configmap and logs had the wrong keys ("message" vs "msg"). For pipelines, the configmap was coming out right, but the logs had the wrong key until I deleted the pods and they were recreated.
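For reference, a sketch of how one might check the emitted key and force a reload; the deployment names here are the upstream defaults, so treat them as assumptions and adjust for your install:

# Check which key a controller actually emits ("msg" is what we want,
# "message" is the operator default):
$ oc logs deployment/tekton-triggers-controller -n openshift-pipelines --tail=1

# If the configmap is right but the pods still log the old keys,
# a restart forces them to pick up the new config:
$ oc rollout restart deployment/tekton-pipelines-controller -n openshift-pipelines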

enarha commented 6 months ago

I tried what you did again and got a different result, which IMO is the expected one:

-> oc describe cm config-logging
Name:         config-logging
Namespace:    openshift-pipelines
Labels:       app.kubernetes.io/component=resolvers
              app.kubernetes.io/instance=default
              app.kubernetes.io/part-of=tekton-pipelines
              operator.tekton.dev/operand-name=tektoncd-pipelines
Annotations:  operator.tekton.dev/last-applied-hash: 32c5bf90cabcd64ea3ce3d8c3f1bab72105b4b913d9537467858c604d7e716df

Data
====
loglevel.webhook:
----
info
zap-logger-config:
----
{
  "level": "info",
  "development": false,
  "sampling": {
    "initial": 100,
    "thereafter": 100
  },
  "outputPaths": ["stdout"],
  "errorOutputPaths": ["stderr"],
  "encoding": "json",
  "encoderConfig": {
    "timeKey": "timestamp",
    "levelKey": "severity",
    "nameKey": "logger",
    "callerKey": "caller",
    "messageKey": "message",
    "stacktraceKey": "stacktrace",
    "lineEnding": "",
    "levelEncoder": "",
    "timeEncoder": "iso8601",
    "durationEncoder": "",
    "callerEncoder": ""
  }
}

loglevel.controller:
----
info

BinaryData
====

Events:  <none>
gabemontero commented 6 months ago

same core upstream stuff, minus the fact you used describe vs. oc get yaml

whether we use your PR or mine comes down to the question I asked in your PR

enarha commented 6 months ago

My point is this PR does not work. The values in the configmap I posted are not the values we want. For example, the value of "messageKey" I got while testing this PR was "message", while the value we want is "msg". The same goes for "timeKey". Also, checking only the configmap is not enough: I have encountered issues where the configmap has the right values, but they are not used by the pods until a pod restart. Lastly, my PR also fixes the logs for the triggers pods, which was missed originally.
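
As a sketch, assuming jq is available, one way to pull just the disputed keys out of the live configmap:

$ oc get cm config-logging -n openshift-pipelines \
    -o jsonpath='{.data.zap-logger-config}' | jq '.encoderConfig | {timeKey, messageKey}'
{
  "timeKey": "timestamp",
  "messageKey": "message"
}
# what we want instead is "ts" / "msg"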

gabemontero commented 6 months ago

> My point is this PR does not work. The values in the configmap I posted are not the values we want. For example, the value of "messageKey" I got while testing this PR was "message", while the value we want is "msg". The same goes for "timeKey". Also, checking only the configmap is not enough: I have encountered issues where the configmap has the right values, but they are not used by the pods until a pod restart. Lastly, my PR also fixes the logs for the triggers pods, which was missed originally.

closing this PR