jenkinsci / opentelemetry-plugin

Monitor and observe Jenkins with OpenTelemetry.
https://plugins.jenkins.io/opentelemetry/
Apache License 2.0
98 stars 50 forks source link

Missing shell output in Kubernetes based environment #482

Closed timmjd closed 2 years ago

timmjd commented 2 years ago

Jenkins and plugins versions report

Environment ```text Jenkins: 2.346.1 OS: Linux - 5.4.196-108.356.amzn2.x86_64 --- Parameterized-Remote-Trigger:3.1.6.1 ace-editor:1.1 ansicolor:1.0.1 antisamy-markup-formatter:2.7 anything-goes-formatter:1.0 apache-httpcomponents-client-4-api:4.5.13-1.0 authentication-tokens:1.4 badge:1.9.1 basic-branch-build-strategies:1.3.2 blueocean:1.25.5 blueocean-autofavorite:1.2.5 blueocean-bitbucket-pipeline:1.25.5 blueocean-commons:1.25.5 blueocean-config:1.25.5 blueocean-core-js:1.25.5 blueocean-dashboard:1.25.5 blueocean-display-url:2.4.1 blueocean-events:1.25.5 blueocean-git-pipeline:1.25.5 blueocean-github-pipeline:1.25.5 blueocean-i18n:1.25.5 blueocean-jwt:1.25.5 blueocean-personalization:1.25.5 blueocean-pipeline-api-impl:1.25.5 blueocean-pipeline-editor:1.25.5 blueocean-pipeline-scm-api:1.25.5 blueocean-rest:1.25.5 blueocean-rest-impl:1.25.5 blueocean-web:1.25.5 bootstrap5-api:5.1.3-7 bouncycastle-api:2.26 branch-api:2.1046.v0ca_37783ecc5 caffeine-api:2.9.3-65.v6a_47d0f4d1fe checks-api:1.7.4 cloudbees-bitbucket-branch-source:773.v4b_9b_005b_562b_ cloudbees-disk-usage-simple:0.10 cloudbees-folder:6.729.v2b_9d1a_74d673 command-launcher:1.2 commons-lang3-api:3.12.0.0 commons-text-api:1.9-9.v39a_53e2e0343 configuration-as-code:1466.v2d4119502006 copyartifact:1.46.4 credentials:1139.veb_9579fca_33b_ credentials-binding:523.vd859a_4b_122e6 display-url-api:2.3.6 docker-commons:1.19 docker-workflow:1.29 dtkit-api:3.0.1 durable-task:496.va67c6f9eefa7 echarts-api:5.3.3-1 email-ext:2.89 extended-read-permission:3.2 favorite:2.4.1 font-awesome-api:6.1.1-1 git:4.11.3 git-client:3.11.0 github:1.34.4 github-api:1.303-400.v35c2d8258028 github-branch-source:1656.v77eddb_b_e95df groovy-postbuild:2.5 handlebars:3.0.8 handy-uri-templates-2-api:2.1.8-22.v77d5b_75e6953 hashicorp-vault-plugin:351.vdb_f83a_1c6a_9d htmlpublisher:1.30 http_request:1.15 jackson2-api:2.13.3-285.vc03c0256d517 javax-activation-api:1.2.0-3 javax-mail-api:1.6.2-6 jaxb:2.3.6-1 jdk-tool:1.0 jenkins-design-language:1.25.5 jjwt-api:0.11.5-77.v646c772fddb_0 job-dsl:1.80 jobConfigHistory:1155.v28a_46a_cc06a_5 jquery-detached:1.2.1 jquery3-api:3.6.0-4 jsch:0.1.55.2 junit:1119.1121.vc43d0fc45561 kubernetes:3651.v908e7db_10d06 kubernetes-cli:1.10.3 kubernetes-client-api:5.12.2-193.v26a_6078f65a_9 kubernetes-credentials:0.9.0 ldap:2.10 locale:180.v207501dff9b_a_ lockable-resources:2.15 mailer:414.vcc4c33714601 matrix-auth:2.6.11 matrix-project:772.v494f19991984 metrics:4.1.6.2 momentjs:1.1.1 monitoring:1.91.0 okhttp-api:4.9.3-105.vb96869f8ac3a opentelemetry:2.9.0-rc1 permissive-script-security:0.7 pipeline-build-step:2.18 pipeline-graph-analysis:195.v5812d95a_a_2f9 pipeline-groovy-lib:593.va_a_fc25d520e9 pipeline-input-step:449.v77f0e8b_845c4 pipeline-milestone-step:101.vd572fef9d926 pipeline-model-api:2.2097.v33db_b_de764b_e pipeline-model-definition:2.2097.v33db_b_de764b_e pipeline-model-extensions:2.2097.v33db_b_de764b_e pipeline-rest-api:2.24 pipeline-stage-step:293.v200037eefcd5 pipeline-stage-tags-metadata:2.2097.v33db_b_de764b_e pipeline-stage-view:2.24 pipeline-utility-steps:2.13.0 plain-credentials:1.8 plugin-util-api:2.17.0 popper2-api:2.11.5-2 prometheus:2.0.10 pubsub-light:1.16 robot:3.2.0 role-strategy:3.2.0 scm-api:608.vfa_f971c5a_a_e9 script-security:1175.v4b_d517d6db_f0 simple-theme-plugin:103.va_161d09c38c7 snakeyaml-api:1.30.2-76.vc104f7ce9870 sse-gateway:1.25 ssh-agent:295.v9ca_a_1c7cc3a_a_ ssh-credentials:291.v8211e4f8efb_c ssh-slaves:1.821.vd834f8a_c390e sshd:3.0.3 structs:318.va_f3ccb_729b_71 timestamper:1.18 token-macro:293.v283932a_0a_b_49 trilead-api:1.57.v6e90e07157e1 variant:1.4 windows-slaves:1.8.1 workflow-aggregator:590.v6a_d052e5a_a_b_5 workflow-api:1188.v0016b_4f29881 workflow-basic-steps:969.vc4ec3e4854b_f workflow-cps:2746.v0da_83a_332669 workflow-durable-task-step:1174.v73a_9a_17edce0 workflow-job:1203.v7b_7023424efe workflow-multibranch:716.vc692a_e52371b_ workflow-scm-step:400.v6b_89a_1317c9a_ workflow-step-api:625.vd896b_f445a_f8 workflow-support:833.va_1c71061486b_ xunit:3.1.0 ```

What Operating System are you using (both controller, and any agents involved in the problem)?

Linux, Kubernetes on AWS

Reproduction steps

Expected Results

The output of the ls -la should be visible:

+ ls -la
total 0
drwxr-xr-x 2 jenkins jenkins  6 Jul 12 18:47 .
drwxr-xr-x 4 jenkins jenkins 34 Jul 12 18:47 ..

Actual Results

Output of shell is not visible inside the log

Anything else?

Log combination with native instance & Kubernetes based.

Build-In Node

node() {
    sh('ls -la')
}

Kubernetes based

podTemplate {
    node(POD_LABEL) {
        sh('ls -la')
    }
}

The log seems to never reached Elastic - also with the Store pipeline logs In Elastic and visualize logs exclusively in Elastic option the sh output is not visible.

[Pipeline] Start of Pipeline
[Pipeline] podTemplate
[Pipeline] {
[Pipeline] node
Created Pod: cmo-prod.vni-pmt-ci vni-pmt-ci/test-49-j06nk-1t4ft-0dphk
Agent test-49-j06nk-1t4ft-0dphk is provisioned from template test_49-j06nk-1t4ft
Running on test-49-j06nk-1t4ft-0dphk in /home/jenkins/agent/ws/test
[Pipeline] {
[Pipeline] sh
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // podTemplate
[Pipeline] End of Pipeline
Finished: SUCCESS
cyrille-leclerc commented 2 years ago

Thanks @timmjd for the detailed report.

@jglick are you aware of such a limitation of JEP 210? A limitation where the shell console of the Jenkins Kubernetes Plugin would not be routed to the h.m.BuildListener provided by the o.j.p.w.log.LogStorage?

@kuisathaverat @v1v can we test on our environment?

kuisathaverat commented 2 years ago

In our test environment, everything is running on K8s, we did not see this issue. One note here is that Agents should have access to the OpenTelemetry Collector (or whatever service is getting the OpenTelemetry logs) because the Agent logs are sent from the Agent not from the Jenkins Controller, what I see in your logs is only the logs messages sent from the Jenkins Controller, so it looks like the Agent can not access the OpenTelemetry service.

cyrille-leclerc commented 2 years ago

Good catch @kuisathaverat , you rock! @timmjd can you verify the OTLP endpoint you configured in the Jenkins OpenTelemetry plugin configuration are also valid from the Kubernetes build agents?

timmjd commented 2 years ago

Damn, that was the missing thing.

Our job runs inside a separate namespace of the cluster - and I only used the local DNS http://default-elasticsearch-ccif-ot-collector:4317 instead of the global one http://default-elasticsearch-ccif-ot-collector.vni-pmt.svc.cluster.local:4317.

I finally exposed the OTLP endpoint via a nginx based ingress - should now work for OnPrem and in cluster.

During debugging, I also found the error inside the JNLP agent: SEVERE: Failed to export logs. The request could not be executed. Full error message: default-elasticsearch-ccif-ot-collector: Name or service not known

Not sure if possible, but sending this information to the server would have make it more obvious.

jglick commented 2 years ago

are you aware of such a limitation of JEP 210

304 IIUC? :eyes:

sending this information to the server would have make it more obvious

The kubernetes plugin does not stream agent log messages (like stderr from java -jar agent.jar) to the build log. Would probably be helpful to do so. It did stream K8s events pertaining to the pod to the build log, but this was removed in https://github.com/jenkinsci/kubernetes-plugin/pull/1192. CC @Vlatombe

cyrille-leclerc commented 2 years ago

are you aware of such a limitation of JEP 210

304 IIUC? 👀

Thanks @jglick . @kuisathaverat found the problem, it's not related to JEP-210, it as caused by the OpenTelemetry endpoint configuration that didn't work from within the Elastic Kubernetes agents so the problem is solved and JEP-210 works well with an OpenTelemetry logs backend and the Jenkins Kubernetes Plugin.