jenkinsci / opentelemetry-plugin

Monitor and observe Jenkins with OpenTelemetry.
https://plugins.jenkins.io/opentelemetry/
Apache License 2.0
96 stars 50 forks

With `otel.logs.mirror_to_disk=true` enabled link to Grafana isn't visible in Console Output #773

Open chewrocca opened 9 months ago

chewrocca commented 9 months ago

What feature do you want to see added?

If otel.logs.mirror_to_disk=true is enabled, the link to Grafana Loki logs doesn't appear under Console Output.
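For readers trying to reproduce this, the following is a minimal sketch of how the setting might be enabled through Jenkins Configuration as Code. It assumes the plugin's `configurationProperties` field accepts OTel SDK-style properties and that `otel.logs.exporter=otlp` is what turns on log export; neither detail is stated in this issue, so check both against the plugin documentation.

```yaml
# Sketch only: assumes this block sits under `unclassified:` in a full JCasC file.
openTelemetry:
  # Hypothetical collector host; replace with your own OTLP endpoint.
  endpoint: "http://<OTEL_COLLECTOR_HOST>:4317"
  configurationProperties: |-
    # Assumption: enables sending pipeline logs over OTLP.
    otel.logs.exporter=otlp
    # The setting discussed in this issue: also keep a copy of the logs on the Jenkins controller disk.
    otel.logs.mirror_to_disk=true
```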

Upstream changes

No response

Are you interested in contributing this feature?

No response

cyrille-leclerc commented 3 months ago

FYI

cyrille-leclerc commented 2 months ago

It would be complicated to insert the link "View logs in Grafana" as we would have to reimplement the visualization of logs stored in the Jenkins Home file system.

Please see our new capability to visualize in the Jenkins GUI the pipeline logs stored in Loki and tell us if this enhancement solves your need.

image

chewrocca commented 2 months ago

Will this work if we have Loki write and read at different URLs? I think we would need both read and write on the same lokiUrl?

cyrille-leclerc commented 2 months ago

It's not really "loki write"; it's the OTLP logs endpoint, typically an OTel Collector on the Jenkins Controller. With this, it makes perfect sense to have different hosts for

Does it make sense?
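As a rough illustration of the point above, the write path and the read path appear to be two separate settings. The sketch below reuses key names from the JCasC example posted later in this thread; the host placeholders are hypothetical.

```yaml
openTelemetry:
  # Write path: Jenkins sends logs over OTLP to this endpoint (typically an OTel Collector),
  # which then forwards them to Loki.
  endpoint: "http://<OTEL_COLLECTOR_HOST>:4317"
  observabilityBackends:
    - grafana:
        grafanaLogsBackend:
          grafanaLogsBackendWithJenkinsVisualization:
            # Read path: Jenkins queries Loki at this URL to display logs in the build console.
            lokiUrl: "https://<LOKI_READ_HOST>"
```

Under this reading, `lokiUrl` only needs to serve queries, so it can point at a different host than the one receiving writes.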

chewrocca commented 2 months ago

image_720

I must be doing something incorrectly. Clicking Console Output shows a link to View Logs in Grafana; however, it never shows the logs in the Jenkins console output.

Do you have an example of jcasc for this setup?

https://github.com/user-attachments/assets/a48c75ff-346f-4d21-9ae9-7a2c22cd4f8a

ArieLevs commented 1 month ago

@chewrocca I had exactly the same issue (along with other similar issues). After playing around with the configs, I got the plugin working properly, showing logs in the Jenkins UI in addition to Grafana.

> It would be complicated to insert the link "View logs in Grafana" as we would have to reimplement the visualization of logs stored in the Jenkins Home file system.
>
> Please see our new capability to visualize in the Jenkins GUI the pipeline logs stored in Loki and tell us if this enhancement solves your need.
>
> image

I used the same configs as mentioned above, but I had to add the Loki tenant ID, and under Loki Datasource Identifier I had to set the UID of the data source, not its name (this data source is also restricted to the tenant ID). Hope this helps.
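To make the UID-versus-name distinction concrete, here is a sketch of a Grafana data source provisioning file for a tenant-scoped Loki data source. The `name` and `uid` values are hypothetical; the `httpHeaderName1` / `httpHeaderValue1` pair is Grafana's mechanism for custom headers on a provisioned data source.

```yaml
apiVersion: 1
datasources:
  - name: "Loki (Jenkins)"          # display name; per the comment above, NOT what the plugin expects
    uid: "loki-jenkins"             # hypothetical UID; this is the value for Loki Datasource Identifier
    type: loki
    access: proxy
    url: "https://<LOKI_HOST>"
    jsonData:
      httpHeaderName1: "X-Scope-OrgID"          # Loki multi-tenancy header
    secureJsonData:
      httpHeaderValue1: "<JENKINS_TENANT_ID>"   # should match the Loki tenant ID set in Jenkins
```

With a data source like this, the Jenkins field would be set to `loki-jenkins` rather than `Loki (Jenkins)`.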

cyrille-leclerc commented 1 month ago

@ArieLevs super interesting, I have not tested with a Loki Tenant ID (I test with Grafana Cloud). Do you mind creating an issue if log retrieval is broken when using a Loki Tenant ID?

cyrille-leclerc commented 1 month ago

FYI, I'm adding capabilities to detect problems sending logs from Jenkins build agents to the OTLP logs endpoint, using a connectivity check from the OTel SDK running in the Jenkins build agent to its OTLP destination:

chewrocca commented 1 month ago

I tried setting the Loki Tenant ID but I'm still getting the same results. I have the Loki Datasource Identifier set to the UID. I'm using Loki OSS and not cloud.

cyrille-leclerc commented 1 month ago

@chewrocca if you still face the problem below, please create a separate issue; it's a different problem from "When otel.logs.mirror_to_disk=true is enabled, there is no link to the visualization of these logs in the observability backend". The former is a bug, the latter is an enhancement request.

> image_720
>
> I must be doing something incorrectly. Clicking Console Output shows a link to View Logs in Grafana; however, it never shows the logs in the Jenkins console output.
>
> Do you have an example of jcasc for this setup?
>
> jenkins-otel-plugin-clip.mp4

ArieLevs commented 1 month ago

I can't point directly at the problem, as it took me many attempts to get it working. For example (and again, I'm not 100% sure), when the tenant ID in the Jenkins configs was not identical to the tenant ID I had configured in the OTel Collector, it didn't work.

Here are my configs that work (I'm running a dedicated OTel Collector just for Jenkins):

exporters:
  debug: {}
  loki:
    auth:
      authenticator: basicauth/loki
    endpoint: https://<LOKI_HOST>/loki/api/v1/push
    headers:
      X-Scope-OrgID: <JENKINS_TENANT_ID>
  otlphttp:
    auth:
      authenticator: basicauth/tempo
    endpoint: https://<TEMPO_HOST>
  otlphttp/loki:
    auth:
      authenticator: basicauth/loki
    endpoint: https://<LOKI_HOST>/otlp
    headers:
      X-Scope-OrgID: <JENKINS_TENANT_ID>
extensions:
  basicauth/loki:
    client_auth:
      password: ${env:LOKI_PASSWORD}
      username: ${env:LOKI_USERNAME}
  basicauth/tempo:
    client_auth:
      password: ${env:TEMPO_PASSWORD}
      username: ${env:TEMPO_USERNAME}
  bearertokenauth:
    token: ${env:BEARER_TOKEN}
processors:
  batch: {}
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
receivers:
  otlp:
    protocols:
      grpc:
        auth:
          authenticator: bearertokenauth
        endpoint: ${env:MY_POD_IP}:4317
        include_metadata: true
      http:
        auth:
          authenticator: bearertokenauth
        endpoint: ${env:MY_POD_IP}:4318
        include_metadata: true
service:
  extensions:
  - health_check
  - memory_ballast
  - bearertokenauth
  - basicauth/loki
  - basicauth/tempo
  pipelines:
    logs:
      exporters:
      - debug
      - otlphttp/loki
      processors:
      - memory_limiter
      - batch
      receivers:
      - otlp
      - k8sobjects
    metrics:
      exporters:
      - debug
      processors:
      - memory_limiter
      - batch
      receivers:
      - otlp
    traces:
      exporters:
      - otlphttp
      processors:
      - memory_limiter
      - batch
      receivers:
      - otlp
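
Note for anyone copying the collector config above: it references `health_check`, `memory_ballast`, and `k8sobjects` without defining them in the excerpt, so they presumably come from elsewhere (for example, Helm chart defaults). A minimal sketch of what two of those definitions might look like follows; the values are assumptions, not the poster's actual settings. `memory_ballast` is left out because its settings depend on the Collector version in use.

```yaml
extensions:
  health_check:
    # Assumed bind address; 13133 is the extension's conventional port.
    endpoint: ${env:MY_POD_IP}:13133
receivers:
  k8sobjects:
    objects:
      # Assumed object selection: watch Kubernetes events and emit them as logs.
      - name: events
        mode: watch
```

If these components are not needed, removing them from the `service` section should work as well.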

And the Jenkins CasC OTel configs:

openTelemetry:
  authentication:
    bearerTokenAuthentication:
      tokenId: "otel-access-token"
  disabledResourceProviders: "io.opentelemetry.instrumentation.resources.ProcessResourceProvider"
  endpoint: "http://<JENKINS_OTEL_COLLECTOR_HOST>:4317"
  exportOtelConfigurationAsEnvironmentVariables: true
  ignoredSteps: "dir,echo,isUnix,pwd,properties"
  observabilityBackends:
    - grafana:
        grafanaBaseUrl: "https://<GRAFANA_HOST>"
        grafanaLogsBackend:
          grafanaLogsBackendWithJenkinsVisualization:
            grafanaLokiDatasourceIdentifier: "<LOKI_DATASOURCE_UID_FOR_LOGS_FROM_JENKINS>" # Note this is a datasource that is configured with the "<JENKINS_TENANT_ID>" header
            lokiCredentialsId: "loki"
            lokiOTelLogFormat: "LOKI_V3_OTEL_FORMAT"
            lokiTenantId: "<JENKINS_TENANT_ID>"
            lokiUrl: "https://<LOKI_HOST>"
        tempoDataSourceIdentifier: "<TEMPO_TRACES_UID>"
  serviceName: "jenkins"
  serviceNamespace: "jenkins"

Please note that while the above works fine (I get the Grafana icon with a link that redirects me to the Grafana UI with a specific trace filter), I had to revert to UI-only logs and disable this for now until https://github.com/jenkinsci/opentelemetry-plugin/issues/918 is solved.