grafana / agent

Vendor-neutral programmable observability pipelines.
https://grafana.com/docs/agent/
Apache License 2.0

Faro demo doesn't start anymore with latest grafana agent (0.35) #4632

codecapitano closed 1 year ago

codecapitano commented 1 year ago

What's wrong?

Since v0.35, the agent no longer starts for the Faro demo, which makes the demo page unavailable. A few customers have reported the same problem. The configuration was not changed, so the issue is probably related to the upgrade.

Error message is:

faro-web-sdk-agent-1     | ts=2023-07-27T13:44:08.848273542Z caller=main.go:74 level=error msg="error creating the agent server entrypoint" err="failed to construct app_agent_receiver integration \"@grafana/faro-demo\": push receiver factory not found for traces instance \"@grafana/faro-demo\""

Steps to reproduce

docker compose pull
docker compose up

System information

macOS Ventura 13.4.1

Software version

Grafana Agent 0.35

Configuration

server:
  log_level: 'debug'

metrics:
  wal_directory: '${AGENT_TEMP_PATH}/${AGENT_WAL_PATH}'
  global:
    scrape_interval: '60s'
    remote_write:
      - url: 'http://${CORTEX_HOST}:${CORTEX_PORT}/api/prom/push'
  configs:
    - name: '${DEMO_PACKAGE_NAME}'
      scrape_configs:
        - job_name: '${DEMO_SERVER_PACKAGE_NAME}'
          static_configs:
            - targets:
                - '${DEMO_HOST}:${DEMO_PORT}'

logs:
  configs:
    - name: '${DEMO_CLIENT_PACKAGE_NAME}'
      clients:
        - url: 'http://${LOKI_HOST}:${LOKI_PORT}/loki/api/v1/push'
      positions:
        filename: '/tmp/positions-client.yaml'
    - name: '${DEMO_SERVER_PACKAGE_NAME}'
      clients:
        - url: 'http://${LOKI_HOST}:${LOKI_PORT}/loki/api/v1/push'
      positions:
        filename: '/tmp/positions-server.yaml'
      scrape_configs:
        - job_name: '${DEMO_SERVER_PACKAGE_NAME}'
          static_configs:
            - targets:
                - 'localhost'
              labels:
                app: '${DEMO_SERVER_PACKAGE_NAME}'
                __path__: '${AGENT_LOGS_PATH}/${DEMO_SERVER_LOGS_NAME}'

traces:
  configs:
    - name: '${DEMO_PACKAGE_NAME}'
      remote_write:
        - endpoint: '${TEMPO_HOST}:${TEMPO_PORT_OTLP_RECEIVER}'
          insecure: true
      receivers:
        otlp:
          protocols:
            grpc:

integrations:
  app_agent_receiver:
    autoscrape:
      enable: true
      metrics_instance: '${DEMO_PACKAGE_NAME}'
    instance: '${DEMO_PACKAGE_NAME}'
    logs_instance: '${DEMO_CLIENT_PACKAGE_NAME}'
    logs_labels:
      app: '${DEMO_CLIENT_PACKAGE_NAME}'
      kind: ''
    logs_send_timeout: '5s'
    server:
      api_key: '${AGENT_KEY_APP_RECEIVER}'
      cors_allowed_origins:
        - '*'
      host: '0.0.0.0'
      max_allowed_payload_size: 5e+07
      port: ${AGENT_PORT_APP_RECEIVER}
      rate_limiting:
        burstiness: 100
        enabled: true
        rps: 100
    sourcemaps:
      download: true
    traces_instance: '${DEMO_PACKAGE_NAME}'

Logs

faro-web-sdk-agent-1     | ts=2023-07-27T13:44:08.848273542Z caller=main.go:74 level=error msg="error creating the agent server entrypoint" err="failed to construct app_agent_receiver integration \"@grafana/faro-demo\": push receiver factory not found for traces instance \"@grafana/faro-demo\""

rfratto commented 1 year ago

Thanks for reporting, we'll take a look at this as soon as we can.

cc @ptodev, this sounds related to the OpenTelemetry upgrade.

angel1254mc commented 1 year ago

Hi team! I was investigating this a day ago and wanted to put some of my findings in case they are of any help.

Here's a link to the file with the change I think is causing the error (pkg/traces/instance.go). It looks like the line inside BuildAndStartPipeline that populates the factories in the trace instance (line 171) was removed:

i.factories = factories // this gets removed

and the factories map (and a lot of other stuff) now lives under i.service:

i.service, err = service.New(ctx, service.Settings{
        ...
        Receivers:                otelreceivers.NewBuilder(otelConfig.Receivers, factories.Receivers),
        ...
        },

So later down the line, when GetFactory is called, i.factories.Receivers is an empty map that yields nil for pushreceiver, which results in the "push receiver factory not found for traces instance" error. I haven't had the chance to test this out myself, but thought it was worth mentioning 😄
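
To make the suspected failure mode concrete, here is a minimal, self-contained Go sketch. It is not the agent's code: `Factory`, `Factories`, `Instance`, `Build`, and the `storeFactories` flag are all hypothetical stand-ins, and the error text only imitates the one in the log. It assumes the theory above is right: the factories are used while building the pipeline but are never saved on the instance, so a later lookup finds nothing.

```go
package main

import "fmt"

// Factory is a hypothetical stand-in for a receiver factory.
type Factory struct {
	Type string
}

// Factories is a hypothetical registry keyed by component type.
type Factories struct {
	Receivers map[string]Factory
}

// Instance is a simplified, hypothetical traces instance.
type Instance struct {
	name      string
	factories Factories
}

// Build simulates pipeline construction. When storeFactories is false, the
// factories are available locally but never saved on the instance, which
// mirrors the suspected regression.
func (i *Instance) Build(f Factories, storeFactories bool) {
	if storeFactories {
		i.factories = f
	}
}

// GetFactory looks up a receiver factory by type. Indexing a nil map is
// legal in Go; the lookup simply reports ok == false.
func (i *Instance) GetFactory(typ string) (Factory, error) {
	f, ok := i.factories.Receivers[typ]
	if !ok {
		return Factory{}, fmt.Errorf("%s receiver factory not found for traces instance %q", typ, i.name)
	}
	return f, nil
}

func main() {
	available := Factories{Receivers: map[string]Factory{"push": {Type: "push"}}}

	// Factories never stored: the lookup fails.
	broken := &Instance{name: "demo"}
	broken.Build(available, false)
	_, err := broken.GetFactory("push")
	fmt.Println("without assignment:", err)

	// Factories stored on the instance: the lookup succeeds.
	fixed := &Instance{name: "demo"}
	fixed.Build(available, true)
	f, err := fixed.GetFactory("push")
	fmt.Println("with assignment:", f.Type, err)
}
```

If this matches what happens in the agent, restoring the assignment (or reading the factories from wherever they now live) should make the lookup succeed again, but that still needs to be confirmed against the real code.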