ChronixDB / chronix.ingester

Ingest from various data sources into Chronix
Apache License 2.0

corrupt input error #9

Closed by zhashuyu 7 years ago

zhashuyu commented 7 years ago

I want to redirect Prometheus data to Chronix, but I get the errors below. Please help.

chronix ingester error

[boomer@boomer chronix.ingester]$ ./chronix.ingester -chronix-url=http://localhost:8983/solr/chronix -max-chunk-age=30m
INFO[0000] Recovering from checkpoint...                 source=ingester.go:134
INFO[0000] Recovered 0 series with 0 chunks from checkpoint.  source=ingester.go:139
INFO[0000] Listening on :8080                            source=main.go:57
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0253] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0254] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0254] error reading request body: snappy: corrupt input  source=handler.go:20
ERRO[0254] error reading request body: snappy: corrupt input  source=handler.go:20
......

prometheus error

[root@master24 004-prometheus]# kubectl-kube-system logs -f prometheus-core-3816572233-93dw4
time="2017-09-08T15:32:55+08:00" level=info msg="Starting prometheus (version=1.7.1, branch=master, revision=3afb3fffa3a29c3de865e1172fb740442e9d0133)" source="main.go:88" 
time="2017-09-08T15:32:55+08:00" level=info msg="Build context (go=go1.8.3, user=root@0aa1b7fc430d, date=20170612-11:44:05)" source="main.go:89" 
time="2017-09-08T15:32:55+08:00" level=info msg="Host details (Linux 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 prometheus-core-3816572233-93dw4 (none))" source="main.go:90" 
time="2017-09-08T15:32:55+08:00" level=info msg="Loading configuration file /etc/prometheus/prometheus.yaml" source="main.go:252" 
time="2017-09-08T15:32:55+08:00" level=info msg="Loading series map and head chunks..." source="storage.go:428" 
time="2017-09-08T15:32:55+08:00" level=info msg="22658 series loaded." source="storage.go:439" 
time="2017-09-08T15:32:55+08:00" level=info msg="Starting target manager..." source="targetmanager.go:63" 
time="2017-09-08T15:32:55+08:00" level=info msg="Listening on :9090" source="web.go:259" 
time="2017-09-08T15:32:55+08:00" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104" 
time="2017-09-08T15:32:55+08:00" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104" 
time="2017-09-08T15:32:55+08:00" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104" 
time="2017-09-08T15:32:55+08:00" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104" 
time="2017-09-08T15:32:56+08:00" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: snappy: corrupt input" source="queue_manager.go:500" 
time="2017-09-08T15:32:56+08:00" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: snappy: corrupt input" source="queue_manager.go:500" 
time="2017-09-08T15:32:56+08:00" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: snappy: corrupt input" source="queue_manager.go:500" 
time="2017-09-08T15:32:56+08:00" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: snappy: corrupt input" source="queue_manager.go:500" 
time="2017-09-08T15:32:56+08:00" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: snappy: corrupt input" source="queue_manager.go:500" 
......

Environment: Kubernetes 1.6.8 (Prometheus runs inside Kubernetes), Prometheus 1.7.1, Chronix 0.5

prometheus config

[root@node23 data]# ps -elf | grep prometheus
4 S root     23168 23149 13  80   0 - 230031 futex_ 15:32 ?       00:01:40 /bin/prometheus -storage.local.retention=720h -config.file=/etc/prometheus/prometheus.yaml -alertmanager.url=http://alertmanager:9093/ -web.external-url=http://xxxxxx -storage.local.target-heap-size=1610612736
[root@master24 004-prometheus]# cat prometheus-core-cm.yaml
    global:
      scrape_interval: 10s
      scrape_timeout: 10s
      evaluation_interval: 10s
    remote_write:
      - url: http://xxxxxx:8080/ingest
    rule_files:
      - "/etc/prometheus-rules/*.rules"
    scrape_configs:

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L37
      - job_name: 'kubernetes-nodes'
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - source_labels: [__address__]
            regex: '(.*):10250'
            replacement: '${1}:10255'
            target_label: __address__

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L79
      - job_name: 'kubernetes-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: (.+)(?::\d+);(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L119
      - job_name: 'kubernetes-services'
        metrics_path: /probe
        params:
          module: [http_2xx]
        kubernetes_sd_configs:
          - role: service
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: kubernetes_name

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L156
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: (.+):(?:\d+);(\d+)
            replacement: ${1}:${2}
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
          - source_labels: [__meta_kubernetes_pod_container_port_number]
            action: keep
            regex: 9\d{3}

chronix

[boomer@boomer chronix-solr-6.4.2]$ ./bin/solr start
Archiving 1 old GC log files to /home/boomer/Downloads/chronix-solr-6.4.2/server/logs/archived
Rotating solr logs, keeping a max of 9 generations
Waiting up to 180 seconds to see Solr running on port 8983 [-]  
Started Solr server on port 8983 (pid=16239). Happy searching!

FlorianLautenschlager commented 7 years ago

Hi, thanks for the detailed issue report. I will dig into it and let you know.

juliusv commented 7 years ago

Hi! Since Prometheus 1.7, details around the snappy compression in the remote write protocol have changed: https://github.com/prometheus/prometheus/pull/2696

We should update the ingester to the new snappy format. If we need to preserve compatibility with older Prometheus versions, we could add a transitional flag that selects between the new and the old format.
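
One way to support such a transition is to tell the two encodings apart before decoding. The following is only a minimal sketch, and it rests on an assumption this thread does not confirm: that the older remote write bodies used the Snappy framing (stream) format and the newer ones use the raw block format. The names `isFramedSnappy` and `describeBody` are hypothetical and not part of chronix.ingester. The check relies on the stream identifier chunk that opens every framing-format stream, so it is a heuristic, not a validation of the payload.

```go
package main

import (
	"bytes"
	"fmt"
)

// snappyStreamMagic is the stream identifier chunk that opens every stream in
// the Snappy framing format: chunk type 0xff, a 3-byte little-endian length
// of 6, then the ASCII bytes "sNaPpY".
var snappyStreamMagic = []byte("\xff\x06\x00\x00sNaPpY")

// isFramedSnappy reports whether body starts with the framing-format stream
// identifier. This is a heuristic: a body without the prefix may be
// block-format Snappy, or it may not be Snappy at all.
func isFramedSnappy(body []byte) bool {
	return bytes.HasPrefix(body, snappyStreamMagic)
}

// describeBody is a hypothetical helper showing where a handler could branch
// to a stream decoder or a block decoder. It does not decode anything.
func describeBody(body []byte) string {
	if isFramedSnappy(body) {
		return "framed"
	}
	return "block-or-unknown"
}

func main() {
	// A body that begins with the stream identifier, followed by dummy bytes.
	framed := append([]byte("\xff\x06\x00\x00sNaPpY"), 0x01, 0x02)
	fmt.Println(describeBody(framed))

	// Arbitrary bytes without the stream identifier.
	other := []byte{0x05, 0x10, 'h', 'e', 'l', 'l', 'o'}
	fmt.Println(describeBody(other))
}
```

In a real handler, the two branches would hand the body to the matching decoder of whichever Snappy library the ingester uses. An explicit flag, as suggested above, avoids relying on the heuristic at all.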