census-instrumentation / opencensus-service

OpenCensus service allows OpenCensus libraries to export to an exporter service rather than having to link vendor-specific exports.
Apache License 2.0

omnition/opencensus-agent:0.1.9 terminating #619

Closed: DazWilkin closed this issue 5 years ago

DazWilkin commented 5 years ago

What version are you using?

omnition/opencensus-agent:0.1.9

What did you do?

Version bumped the agent from 0.1.8 to 0.1.9

What did you expect to see?

Continued working behavior (as with 0.1.8):

opencensus-agent_1           | 2019/08/08 21:27:47 Setting Stackdriver default location failed: Get http://169.254.169.254/computeMetadata/v1/instance/zone: dial tcp 169.254.169.254:80: i/o timeout
opencensus-agent_1           | {"level":"info","ts":1565299667.1677186,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"stackdriver"}
opencensus-agent_1           | {"level":"info","ts":1565299667.1684854,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"prometheus"}
opencensus-agent_1           | 2019/08/08 21:27:48 Running OpenCensus Trace and Metrics receivers as a gRPC service at ":55678"
opencensus-agent_1           | 2019/08/08 21:27:48 Running zPages on port 9999

What did you see instead?

The agent repeatedly logs Received "terminated" signal from OS, terminating process and then starts again:

opencensus-agent_1           | {"level":"info","ts":1565120748.4915674,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"stackdriver"}
opencensus-agent_1           | {"level":"info","ts":1565120748.4921167,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"prometheus"}
opencensus-agent_1           | 2019/08/06 19:45:49 Running OpenCensus Trace and Metrics receivers as a gRPC service at ":55678"
opencensus-agent_1           | 2019/08/06 19:45:49 Running zPages on port 9999
opencensus-agent_1           | 2019/08/06 19:52:15 Received "terminated" signal from OS, terminating process
opencensus-agent_1           | {"level":"info","ts":1565204467.3861768,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"stackdriver"}
opencensus-agent_1           | {"level":"info","ts":1565204467.3866544,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"prometheus"}
opencensus-agent_1           | 2019/08/07 19:01:08 Running OpenCensus Trace and Metrics receivers as a gRPC service at ":55678"
opencensus-agent_1           | 2019/08/07 19:01:08 Running zPages on port 9999
opencensus-agent_1           | 2019/08/07 19:02:13 Received "terminated" signal from OS, terminating process
opencensus-agent_1           | {"level":"info","ts":1565204700.1680114,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"stackdriver"}
opencensus-agent_1           | {"level":"info","ts":1565204700.1695867,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"prometheus"}
opencensus-agent_1           | 2019/08/07 19:05:01 Running OpenCensus Trace and Metrics receivers as a gRPC service at ":55678"
opencensus-agent_1           | 2019/08/07 19:05:01 Running zPages on port 9999
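To compare how long the agent ran before each termination, the plain-timestamp lines in the log above can be paired up. The following is a minimal sketch, not part of the agent or its tooling: the `uptimes` helper and the `SAMPLE` excerpt are hypothetical names introduced for illustration, and the parsing assumes the Docker Compose log prefix format shown above.

```python
import re
from datetime import datetime

# Matches the plain-text agent lines, e.g.
# "opencensus-agent_1  | 2019/08/06 19:45:49 Running zPages on port 9999"
LINE = re.compile(r"\| (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")

# Excerpt copied from the log above (hypothetical variable name).
SAMPLE = """\
opencensus-agent_1 | 2019/08/06 19:45:49 Running OpenCensus Trace and Metrics receivers as a gRPC service at ":55678"
opencensus-agent_1 | 2019/08/06 19:52:15 Received "terminated" signal from OS, terminating process
opencensus-agent_1 | {"level":"info","ts":1565204467.3861768,"caller":"config/config.go:497","msg":"Metrics Exporter enabled","exporter":"stackdriver"}
opencensus-agent_1 | 2019/08/07 19:01:08 Running OpenCensus Trace and Metrics receivers as a gRPC service at ":55678"
opencensus-agent_1 | 2019/08/07 19:02:13 Received "terminated" signal from OS, terminating process
"""

def uptimes(log_text):
    """Return seconds between each receiver start and the next termination."""
    started = None
    result = []
    for line in log_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue  # JSON-formatted lines have no plain timestamp prefix
        ts = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        msg = match.group(2)
        if msg.startswith("Running OpenCensus"):
            started = ts
        elif msg.startswith('Received "terminated" signal') and started is not None:
            result.append(int((ts - started).total_seconds()))
            started = None
    return result

print(uptimes(SAMPLE))  # [386, 65]
```

For the two terminations logged above, this gives run times of 386 and 65 seconds, which may help when checking them against any periodic mechanism suspected of stopping the container.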

Additional context

I'm running the containers using Docker Compose and am confident that the only change was the version bump.

ocagent.yaml:

receivers:
  opencensus:
    address: ":55678"

exporters:
  stackdriver:
    project: ${GCP_PROJECT}
    enable_metrics: true
    enable_trace: true
  prometheus:
    address: ":9090"

zpages:
    port: 9999

pjanotti commented 5 years ago

Hi @DazWilkin, can you provide the docker-compose file that you are using, or some other steps to repro it? I gave it a few tries and wasn't able to repro it.

DazWilkin commented 5 years ago

Here's the docker-compose entry:

  opencensus-agent:
    image: omnition/opencensus-agent:0.1.9
    command:
    - --config=/configs/ocagent.yaml
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /secrets/opencensus.json
    volumes:
    - "${PWD}/image-transparency-190806-2a307f0c3096.json:/secrets/opencensus.json"
    - "${PWD}/ocagent.yaml:/configs/ocagent.yaml"
    expose:
    - "9090"  # ocagent.yaml defined Prometheus Exporter
    - "9999"  # ocagent.yaml defined zPages
    - "55678" # ocagent.yaml defined OpenCensus Receiver
    ports:
    - 9190:9090 # Debugging: Prometheus Metrics Exporter
    - 9999:9999 # Debugging: zPages
    healthcheck:
      test:
      - CMD
      - curl
      - --fail
      - "http://opencensus-agent:9090/metrics"
      interval: 30s
      timeout: 30s
      retries: 3

I think I may know what I did wrong.

I think I had the Prometheus endpoint set to :9100 rather than :9090, so the healthcheck was failing and the container was being restarted. I'll try to repro that next week and will update. Until then, please spend no more time investigating. Apologies!
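
The suspected mismatch can be checked mechanically by comparing the port in the exporter's listen address with the port in the healthcheck URL. This is a minimal sketch under stated assumptions: `listen_port`, `healthcheck_port` and `ports_match` are hypothetical helpers written for illustration, and the values are passed in as strings rather than read from ocagent.yaml or the Compose file.

```python
from urllib.parse import urlsplit

def listen_port(address):
    """Port from a listen address such as ":9090" or "0.0.0.0:9090"."""
    return int(address.rsplit(":", 1)[1])

def healthcheck_port(url):
    """Port from a healthcheck URL; falls back to the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return 443 if parts.scheme == "https" else 80

def ports_match(exporter_address, healthcheck_url):
    """True when the healthcheck targets the port the exporter listens on."""
    return listen_port(exporter_address) == healthcheck_port(healthcheck_url)

# Configuration as posted: exporter and healthcheck agree.
print(ports_match(":9090", "http://opencensus-agent:9090/metrics"))  # True
# Suspected misconfiguration: exporter on :9100, healthcheck still on 9090.
print(ports_match(":9100", "http://opencensus-agent:9090/metrics"))  # False
```

A check like this only shows whether the healthcheck could ever succeed; it says nothing about what happens to the container once the healthcheck fails.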

DazWilkin commented 5 years ago

I'm unable to repro the issue.

I don't think it was due to a misconfigured healthcheck, as I tried that.