apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[Bug] Data source connected, but no labels received. Verify that Loki and Promtail is configured properly #6083

Open 0120208 opened 11 months ago

0120208 commented 11 months ago

Describe the bug

I started Loki, the OTel collector, and Grafana using the "kbcli addon enable" command, and they ran normally. However, the Loki data source on the Grafana page currently reports: "Data source connected, but no labels received. Verify that Loki and Promtail are configured properly."

To Reproduce

Steps to reproduce the behavior:

  1. kbcli addon enable loki
  2. kbcli addon enable apecloud-otel-collector
  3. kbcli addon enable grafana
  4. kubectl edit svc -n kb-system kb-addon-grafana  # change the Service type to NodePort
  5. Open the Grafana page in a browser through the exposed port; the page displays an error.
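
Step 4 can also be done without an interactive editor. The following is only a minimal sketch, assuming the Service name and namespace used in the steps above; the function names nodeport_patch and expose_grafana are hypothetical helpers, not part of kbcli or KubeBlocks.

```shell
#!/bin/sh
# Sketch: switch the Grafana Service to NodePort without opening an editor.
# Assumes the Service kb-addon-grafana in namespace kb-system, as in the steps above.

# Print the merge-patch body that sets the Service type to NodePort.
nodeport_patch() {
  printf '{"spec":{"type":"NodePort"}}'
}

# Apply the patch, then print the node port assigned to the first Service port.
expose_grafana() {
  kubectl patch svc kb-addon-grafana -n kb-system --type merge -p "$(nodeport_patch)" &&
    kubectl get svc kb-addon-grafana -n kb-system -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage (needs kubectl access to the cluster):
#   expose_grafana
```

Grafana should then be reachable at a node IP on the printed port.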

Expected behavior

I expected to see the log information collected by the OTel collector.

Screenshots image image image

sophon-zt commented 11 months ago

Please try:

kbcli addon disable apecloud-otel-collector
kbcli addon enable apecloud-otel-collector --set log.enabled=true
kbcli dashboard open kubeblocks-logs

0120208 commented 11 months ago

I have tried "--set log.enabled=true" and executed the command "kbcli dashboard open kubeblocks-logs", as shown in the picture. The command hung and did not open the page automatically as expected. Afterwards, I entered IP:3000 of the machine where Loki is deployed in the browser, but there was no content. I don't know what the problem is.

image

sophon-zt commented 11 months ago

That is not important. The command is only a convenience that kbcli provides for viewing the dashboard. Can you view the logs some other way?

Note that the apecloud-otel-collector addon disables log collection by default.

0120208 commented 11 months ago

OTEL's logs: otel.log

0120208 commented 11 months ago

I am not sure how Agamotto configures the URL and port of Loki. Is the problem with the port here?

sophon-zt commented 11 months ago

Specify the Loki endpoint when enabling the addon:

kbcli addon enable apecloud-otel-collector --set log.enabled=true --set log.loki.endpoint=xxxxxxxxxxxxx

By default, it uses the Loki addon endpoint: http://loki-gateway/loki/api/v1/push

0120208 commented 11 months ago

Firstly, thank you for your reply. I have tried according to your instructions and there seems to be no change. I am currently unsure which step caused the problem. Let me provide a more detailed description of my actions:

  1. kbcli addon enable loki
  2. kbcli addon enable apecloud-otel-collector --set log.enabled=true --set log.loki.endpoint=http://loki-gateway/loki/api/v1/push
  3. kbcli addon enable grafana
  4. kubectl edit svc -n kb-system kb-addon-grafana [edit ClusterIP to NodePort]
  5. kubectl get secret --namespace kb-system kb-addon-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

image image image image image image

sophon-zt commented 11 months ago

KubeBlocks has a built-in Loki data source, which can be used directly.

image image

sophon-zt commented 11 months ago

Or use the kbcli dashboard subcommand: image

0120208 commented 11 months ago

Understood, but when I look for the container logs, no data is shown. image

I tried to access "http://127.0.0.1:13100/", but the local command kept hanging and the address could not be reached. image

0120208 commented 11 months ago

Are there any other solutions?

sophon-zt commented 11 months ago

Please follow the steps below to troubleshoot:

  1. Check if there is any abnormality in the otel addon

    $ k logs kb-addon-apecloud-otel-collector-6nmf9 -n kb-system --tail=10
    2023-12-19T03:19:05.331Z    info    filestorage/client.go:244   finished compaction {"kind": "extension", "name": "file_storage/agamotto", "directory": "/var/log/agamotto/receiver_filelog_dbs", "elapsed": 0.000311584}
    2023-12-19T03:19:05.331Z    warn    fileconsumer/file.go:63 no files match the configured include patterns  {"kind": "receiver", "name": "filelog/dbs", "data_type": "logs", "component": "fileconsumer", "include": ["/var/log/kubeblocks/**/**/*.log"], "exclude": []}
    2023-12-19T03:19:05.331Z    info    receivercreator@v0.77.0/observerhandler.go:101  starting receiver   {"kind": "receiver", "name": "receiver_creator", "data_type": "metrics", "name": "apecloudnode", "endpoint": "172.19.0.3", "endpoint_id": "apecloud_k8s_observer/k3d-demo-server-0/85a80ae4-1237-4a28-bb5a-181046cb6d8e"}
    2023-12-19T03:19:05.331Z    info    receivercreator@v0.77.0/observerhandler.go:101  starting receiver   {"kind": "receiver", "name": "receiver_creator", "data_type": "metrics", "name": "apecloudkubeletstats", "endpoint": "172.19.0.3", "endpoint_id": "apecloud_k8s_observer/k3d-demo-server-0/85a80ae4-1237-4a28-bb5a-181046cb6d8e"}
  2. Check whether the otel addon has enabled logs collection and whether the loki configuration is correct.

    $ k get cm kb-addon-apecloud-otel-collector -n kb-system -o jsonpath='{.data.config\.yaml}'
    ...
    # log receiver
    receivers:
      filelog/pods:
        include: [/var/log/pods/**/**/*.log]
        include_file_name: true
        include_file_path: true
        start_at: beginning
        storage: file_storage/agamotto
        resource:
        ...
    exporters:
      ...
      loki/pods:
        endpoint: http://loki-gateway/loki/api/v1/push
        sending_queue:
          enabled: true
  3. Check whether the Loki endpoint is valid

    curl -v -H "Content-Type: application/json" -XPOST -s "http://loki-gateway/loki/api/v1/push" --data-raw '{"streams": [{ "stream": { "foo": "test" }, "values": [ [ "xxxxxxxxxx", "fizzbuzz" ] ] }]}'
  4. Check whether the logs are pushed to Loki:

    $ k logs loki-gateway-6d65bdc66b-r6q4m -n kb-system --tail=10
    10.42.0.1 - - [19/Dec/2023:03:27:02 +0000]  204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "Go-http-client/1.1" "-"
    10.42.0.1 - - [19/Dec/2023:03:27:03 +0000]  204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "Go-http-client/1.1" "-"
    10.42.0.1 - - [19/Dec/2023:03:27:03 +0000]  204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "Go-http-client/1.1" "-"
    10.42.0.1 - - [19/Dec/2023:03:27:03 +0000]  204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "Go-http-client/1.1" "-"
    10.42.0.1 - - [19/Dec/2023:03:27:03 +0000]  204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "Go-http-client/1.1" "-"
    10.42.0.1 - - [19/Dec/2023:03:27:03 +0000]  204 "POST /loki/api/v1/push HTTP/1.1" 0 "-" "Go-http-client/1.1" "-"
  5. Check whether Loki is abnormal

    $ k logs kb-addon-loki-0 -n kb-system --tail=10
    level=info ts=2023-12-19T03:25:18.618733801Z caller=expiration.go:78 msg="overall smallest retention period 1702697118.618, default smallest retention period 1702697118.618"
    level=info ts=2023-12-19T03:25:18.618825676Z caller=marker.go:202 msg="no marks file found"
    ts=2023-12-19T03:25:18.618845343Z caller=spanlogger.go:85 level=info msg="building index list cache"
    ts=2023-12-19T03:25:18.618950301Z caller=spanlogger.go:85 level=info msg="index list cache built" duration=95.792µs
    level=info ts=2023-12-19T03:26:12.21864609Z caller=table_manager.go:166 msg="handing over indexes to shipper"
    level=info ts=2023-12-19T03:26:12.219828131Z caller=table_manager.go:134 msg="uploading tables"
    level=info ts=2023-12-19T03:26:18.617326718Z caller=marker.go:202 msg="no marks file found"
    level=info ts=2023-12-19T03:27:12.21461584Z caller=table_manager.go:166 msg="handing over indexes to shipper"
    level=info ts=2023-12-19T03:27:12.214635256Z caller=table_manager.go:134 msg="uploading tables"
    level=info ts=2023-12-19T03:27:18.620332551Z caller=marker.go:202 msg="no marks file found"
  6. Check via metrics whether the Loki server has received data

    # curl -s  '10.43.22.222:3100/metrics' |grep -i pushes_bytes
    # HELP loki_memberlist_client_state_pushes_bytes_total Total size of pushed state
    # TYPE loki_memberlist_client_state_pushes_bytes_total counter
    loki_memberlist_client_state_pushes_bytes_total 2656
  7. If you reach this step without finding a problem, the issue is in the Grafana configuration. Please refer to the official Grafana documentation.
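
As a hedged supplement to steps 3 and 6: Loki's push API expects each entry's timestamp as a string of Unix epoch nanoseconds, so the "xxxxxxxxxx" placeholder in step 3 has to be replaced with a real value before the request can succeed. In addition, the memberlist metric shown in step 6 appears to track cluster state exchange rather than log ingestion, so a distributor counter such as loki_distributor_lines_received_total may be a more direct signal (this assumes the deployed Loki version exposes that metric). The function names loki_push_payload and sum_metric below are hypothetical helpers for illustration only.

```shell
#!/bin/sh
# Sketch: build a valid Loki push payload and sum an ingestion counter.

# Print a push payload with one stream labelled foo=<$1> and one log line <$2>.
# $3 is an optional timestamp in nanoseconds; it defaults to the current time
# (whole seconds padded with nine zeros, which avoids the GNU-only "date +%N").
loki_push_payload() {
  lp_ts="${3:-$(date +%s)000000000}"
  printf '{"streams":[{"stream":{"foo":"%s"},"values":[["%s","%s"]]}]}' "$1" "$lp_ts" "$2"
}

# Read Prometheus text-format metrics on stdin and print the sum of all
# samples whose metric name starts with $1 (comment lines are skipped).
sum_metric() {
  awk -v m="$1" '$1 !~ /^#/ && index($1, m) == 1 { s += $NF } END { printf "%d\n", s }'
}

# Usage (run from somewhere that can reach the cluster services):
#   curl -v -H "Content-Type: application/json" -XPOST \
#     "http://loki-gateway/loki/api/v1/push" \
#     --data-raw "$(loki_push_payload test fizzbuzz)"
#   curl -s '10.43.22.222:3100/metrics' | sum_metric loki_distributor_lines_received_total
```

If the sum stays at 0 while the gateway log in step 4 shows 204 responses for POST requests, the pushes may be reaching the gateway without any log lines being accepted, which would narrow the search to the collector's exporter settings.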

0120208 commented 11 months ago

I followed your instructions up to the third step and ran into a problem: the host name loki-gateway could not be resolved. I then exposed the port externally, accessed the address using IP:port, and obtained the following information.

image

logs:

image
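
On the name-resolution failure: loki-gateway is a Kubernetes Service name, which is normally resolvable only through cluster DNS from inside the cluster, and the short form only from within the same namespace. Running the curl check from a machine outside the cluster would therefore be expected to fail in this way. The sketch below runs the request from a temporary pod instead. It assumes the Service lives in kb-system (where the loki-gateway pod in the troubleshooting steps runs) and that the cluster uses the default cluster.local domain; the curlimages/curl image, the pod name loki-probe, and the function names are illustrative choices, not something KubeBlocks provides.

```shell
#!/bin/sh
# Sketch: query Loki through the gateway from inside the cluster.

# Print the fully qualified DNS name of a Service.
# $1 = Service name, $2 = namespace, $3 = cluster domain (default: cluster.local)
svc_fqdn() {
  printf '%s.%s.svc.%s' "$1" "$2" "${3:-cluster.local}"
}

# Start a throwaway pod in kb-system, request the label list through the
# gateway, and print only the HTTP status code. The pod is removed afterwards.
probe_loki_from_cluster() {
  kubectl run loki-probe -n kb-system --rm -i --restart=Never \
    --image=curlimages/curl --command -- \
    curl -sS -o /dev/null -w '%{http_code}\n' \
    "http://$(svc_fqdn loki-gateway kb-system)/loki/api/v1/labels"
}

# Usage (needs kubectl access to the cluster):
#   probe_loki_from_cluster
```

A 200 status here would suggest that the gateway is reachable from inside the cluster and that the earlier failure was only a matter of where the command was run.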

github-actions[bot] commented 10 months ago

This issue has been marked as stale because it has been open for 30 days with no activity