open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Unable to start the otel-collector-contrib and using 0.61 version #14725

Closed by amitvermaa3101 2 years ago

amitvermaa3101 commented 2 years ago

Describe the issue you're reporting

I am unable to start otel-collector-contrib, version 0.61.

I am getting this error:

Timestamp: 2022-10-05 10:43:43.902781738 +0000 UTC
Value: 1
{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2022-10-05T10:44:43.949Z error scraperhelper/scrapercontroller.go:197 Error scraping metrics {"kind": "receiver", "name": "postgresql", "pipeline": "metrics", "error": "sql: Scan error on column index 2, name \"checkpoint_duration_write\": converting driver.Value type float64 (\"1.2687577e+07\") to a int64: invalid syntax", "scraper": "postgresql"}
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).scrapeMetricsAndReport
    go.opentelemetry.io/collector@v0.61.0/receiver/scraperhelper/scrapercontroller.go:197
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).startScraping.func1
    go.opentelemetry.io/collector@v0.61.0/receiver/scraperhelper/scrapercontroller.go:172
2022-10-05T10:44:43.983Z info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 32}
2022-10-05T10:44:43.983Z info ResourceMetrics #0

Below is the config file ''' extensions: health_check: pprof: endpoint: 0.0.0.0:1777 zpages: endpoint: 0.0.0.0:55679

receivers: postgresql: endpoint: 0.0.0.0:5432 username: XXXXX password: XXXXXX databases: XXXXXX collection_interval: 60s tls: insecure: true

hostmetrics:

scrapers:

cpu:

disk:

filesystem:

load:

memory:

network:

process:

processes:

paging:

processors: batch:

sqlquery:

driver: postgres

datasource: "host=localhost port=5432 user=XXXXX password=XXXXX sslmode=disable"

queries:

- sql : "SELECT COUNT(ID) FROM COMPANY"

metrics:

- metric_name: postgres.sqlquery.customercount

value_column: "count"

exporters: prometheus: endpoint: 0.0.0.0:9100 send_timestamps: true logging: loglevel: debug

service: extensions: [pprof, zpages, health_check] pipelines: metrics: receivers: [postgresql] processors: [batch] exporters: [prometheus, logging] '''

github-actions[bot] commented 2 years ago

Pinging code owners: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

evan-bradley commented 2 years ago

Thanks for reporting this. It's a little hard to read the current formatting of your configuration file. Could you reformat your original post, or leave a comment on this issue with your configuration inside a code block? It would also help if you could edit the issue title to a short summary describing your problem. Thanks in advance.

amitvermaa3101 commented 2 years ago

Below is the config file:

extensions:
  health_check:
  pprof:
    endpoint: 0.0.0.0:1777
  zpages:
    endpoint: 0.0.0.0:55679

receivers:
  postgresql:
    endpoint: 0.0.0.0:5432
    username: XXXXXX
    password: XXXXX
    databases: XXXXXX
    collection_interval: 60s
    tls:
      insecure: true

#  hostmetrics:
#    scrapers:
#      cpu:
#      disk:
#      filesystem:
#      load:
#      memory:
#      network:
#      process:
#      processes:
#      paging:

processors:
  batch:
#  sqlquery:
#    driver: postgres
#    datasource: "host=localhost port=5432 user=XXXXX password=XXXXX sslmode=disable"
#    queries:
#      - sql : "SELECT COUNT(ID) FROM COMPANY"
#        metrics:
#          - metric_name: postgres.sqlquery.customercount
#            value_column: "count"

exporters:
  prometheus:
    endpoint: 0.0.0.0:9100
    send_timestamps: true
  logging:
    loglevel: debug

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    metrics:
      receivers: [postgresql]
      processors: [batch]
      exporters: [prometheus, logging]

djaglowski commented 2 years ago

"checkpoint_duration_write": converting driver.Value type float64 ("1.2687577e+07") to a int64

This appears to be a data type mismatch.

What's interesting here is that the value is in scientific notation, which is typically used to represent a decimal number. However, it may just be that PostgreSQL represents a large int this way.

I think we have 3 options here:

  1. Modify mdatagen to be able to handle such cases. This would be tricky because the value is presumably an int most of the time, but would sometimes be a string that needs to be converted to a float and then to an int.
  2. Modify the query to format the value. It appears this is possible with `to_number()`. This would likely need to apply to all numeric values retrieved, which would result in complex queries.
  3. Handle the case within the scraper code. Some form of this is probably the best solution.
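
As a rough illustration of option 3, the conversion could be done in Go along the following lines. This is only a sketch, not the receiver's actual code: `parseInt64Lenient` is a hypothetical helper name, and it assumes the scraper can read the column as a string. It tries a plain integer parse first and falls back to parsing a float (which accepts scientific notation such as `1.2687577e+07`) before rounding to an int64.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// parseInt64Lenient is a hypothetical helper (not part of the collector).
// It accepts plain integers ("42") as well as float-formatted values
// ("1.2687577e+07") and returns the nearest int64.
func parseInt64Lenient(raw string) (int64, error) {
	// Fast path: the value is already a plain base-10 integer.
	if v, err := strconv.ParseInt(raw, 10, 64); err == nil {
		return v, nil
	}
	// Fallback: parse as a float, which understands scientific notation.
	f, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, fmt.Errorf("parse %q as number: %w", raw, err)
	}
	// Reject values that cannot be represented as an int64.
	if math.IsNaN(f) || math.IsInf(f, 0) || f < math.MinInt64 || f >= math.MaxInt64 {
		return 0, fmt.Errorf("value %q is out of int64 range", raw)
	}
	return int64(math.Round(f)), nil
}

func main() {
	// The value taken from the error message in this issue.
	v, err := parseInt64Lenient("1.2687577e+07")
	fmt.Println(v, err)
}
```

Rounding silently drops any fractional part, so if the column is really a floating-point quantity, reporting it as a double metric may be the better fit than converting it to an int.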