rudderlabs / rudderstack-helm

Open-source, warehouse-first Customer Data Pipeline and Segment-alternative. Collects and routes clickstream data and builds your customer data lake on your data warehouse.
MIT License

telegraf sidecar container won't start due to influxdb issues #47

Open shaunlandau1973 opened 2 years ago

shaunlandau1973 commented 2 years ago

The telegraf sidecar is generating errors causing it to fail continuously ... something about influxdb:

2022-03-08T16:59:30Z E! [outputs.influxdb] when writing to [http://localhost:8086/]: Post http://localhost:8086//write?db=telegraf: dial tcp 127.0.0.1:8086: connect: connection refused
2022-03-08T16:59:30Z E! [agent] Error writing to outputs.influxdb: could not write any address
2022-03-08T16:59:40Z E! [outputs.influxdb] when writing to [http://localhost:8086/]: Post http://localhost:8086//write?db=telegraf: dial tcp 127.0.0.1:8086: connect: connection refused
2022-03-08T16:59:40Z E! [agent] Error writing to outputs.influxdb: could not write any address
2022-03-08T16:59:50Z E! [outputs.influxdb] when writing to [http://localhost:8086/]: Post http://localhost:8086/write?db=telegraf: dial tcp 127.0.0.1:8086: connect: connection refused
2022-03-08T16:59:50Z E! [agent] Error writing to outputs.influxdb: could not write any address
2022-03-08T17:00:00Z E! [outputs.influxdb] when writing to [http://localhost:8086]: Post http://localhost:8086/write?db=telegraf: dial tcp 127.0.0.1:8086: connect: connection refused
2022-03-08T17:00:00Z E! [agent] Error writing to outputs.influxdb: could not write any address
2022-03-08T17:00:10Z E! [outputs.influxdb] when writing to [http://localhost:8086]: Post http://localhost:8086/write?db=telegraf: dial tcp 127.0.0.1:8086: connect: connection refused

I'm very surprised that I cannot find any other observers of this issue. My EKS cluster is nothing out of the ordinary. What is this missing influxdb service?

rwrz commented 2 years ago

As far as I understand, you need to configure Telegraf to export its data to an InfluxDB deployment. By default it points to localhost:8086, but it is up to you to configure it. Feel free to disable it if it is not required in your setup. You can see the relevant settings in values.yaml:

  config:
    mountPath: /etc/telegraf
    agent:
      interval: "10s"
    outputs:
      - influxdb:
          urls: []
          #            - "http://influxdb.monitoring.svc:8086"
          database: "telegraf"
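
A minimal sketch of what such an override might look like, assuming an InfluxDB service is actually reachable from the pod. The URL is only the placeholder from the commented-out line above, not a service the chart creates, and the fragment keeps the same nesting as the excerpt (any parent key in values.yaml is not shown here):

```yaml
# Hypothetical override: point the sidecar at an existing InfluxDB service
# instead of leaving urls empty.
  config:
    mountPath: /etc/telegraf
    agent:
      interval: "10s"
    outputs:
      - influxdb:
          urls:
            # Placeholder address taken from the commented example; replace with your own.
            - "http://influxdb.monitoring.svc:8086"
          database: "telegraf"
```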

smerrill commented 9 months ago

In our case, I changed telegraf to use a file output to /dev/null by editing the ConfigMap to change its output stanza to the following:

[[outputs.file]]
  files = ["/dev/null"]

If you just try to remove the sidecar, the main rudder-server container won't start, because its wait-for command waits for Telegraf to come up.