alkdese opened this issue 1 year ago
Just to add, I'm following the guide at https://grafana.com/docs/loki/latest/installation/simple-scalable-helm, deploying to a k3s cluster, and getting the same error:
logger=context traceID=00000000000000000000000000000000 userId=1 orgId=1 uname=admin t=2022-07-21T10:33:48.310911275Z level=error msg="Failed to call resource" error="no org id\n" traceID=00000000000000000000000000000000
logger=context traceID=00000000000000000000000000000000 userId=1 orgId=1 uname=admin t=2022-07-21T10:33:48.311108718Z level=error msg="Request Completed" method=GET path=/api/datasources/1/resources/labels status=500 remote_addr=127.0.0.1 time_ms=12 duration=12.696109ms size=83 referer=http://localhost:3000/datasources/edit/PV_DCbRVk traceID=00000000000000000000000000000000
When I run Promtail locally, I get this error in /var/log/messages:
Aug 3 13:51:51 ip-0-0-0-00 promtail-linux-amd64[426777]: level=error ts=2022-08-03T13:51:51.690301993Z caller=client.go:380 component=client host=localhost:3100 msg="final error sending batch" status=401 error="server returned HTTP status 401 Unauthorized (401): no org id"
My solution: I changed the configuration file /app/loki/loki.config.yaml to set auth_enabled: false. Hope this helps; I personally think this problem occurs in multi-tenant setups.
But now I can only use a single tenant.
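For reference, a minimal single-tenant Loki config along those lines might start like this (a sketch, not the poster's exact file; paths are illustrative):

```yaml
# loki.config.yaml (sketch) -- disabling auth makes Loki single-tenant
# and stops it from requiring the X-Scope-OrgID header on every request.
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /tmp/loki        # illustrative path
  replication_factor: 1
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
```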
Hi wubai7749, many thanks for your response. I will try it out.
I had this error too, but in my scenario I had Promtail, Grafana, and Loki running in virtual machines, and a firewall was blocking Grafana from reaching Loki on port 3100.
I had this error as well. I deployed Loki, Promtail, and Grafana on Windows 10 and hit the same problem. These are my config files (actually YAML, renamed for upload): promtail-local-config.txt, loki-local-config.txt
Hi, did you solve it? I'm having the same issue. How did you solve it? Thank you.
Hi everyone, just to add that I have also had the same issue for weeks now.
Also having the same issue.
I had the same issue. First I was getting ...msg="Failed to call resource" error="no org id\n"...
By setting

loki:
  auth_enabled: false

Promtail stopped complaining, but I still wasn't able to connect from Grafana via http://loki-read:3100 (or http://loki-read-headless:3100). I then saw a pretty picture at https://grafana.com/docs/loki/latest/getting-started/ where MinIO plays a part. It turns out the reason I was getting "Unable to fetch labels from Loki" in Grafana was that Loki had nowhere to write the labels in the first place; enabling MinIO solved my last issue:

minio:
  enabled: true

(When quoting these YAML snippets here, I am referring to the Loki Helm chart's values.yaml.)
I'm guessing I'm just exposing my stupidity here, but maybe this helps someone...
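Putting those two pieces together, the relevant fragment of the chart's values.yaml would look roughly like this (a sketch against the grafana/loki Helm chart; all other values left at defaults):

```yaml
loki:
  auth_enabled: false   # stop requiring a tenant / X-Scope-OrgID header

minio:
  enabled: true         # give Loki somewhere to write chunks and the index
```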
Hi, I had a similar issue. I was setting the datasource in my values.yaml in the Helm config. That worked with my all-in-one Loki because the URL was http://loki:3100. But when I switched to the production Helm chart for Loki, I needed to change the endpoint to http://loki-loki-distributed-query-frontend:3100. I tried to change my existing Loki datasource from the UI instead of changing values.yaml, and that was the problem: you can't override an endpoint defined in Helm from the Grafana UI. So if you want to test, either create a new datasource from the UI with the new URL, or change your values.yaml to the correct value. Not sure if it's a bug or intended Grafana behaviour. Hope it helps.
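In other words, the provisioned datasource has to carry the new endpoint itself; a sketch of the Grafana provisioning fragment (datasource name illustrative):

```yaml
apiVersion: 1
datasources:
  - name: loki-distributed          # a new datasource, rather than a UI edit
    type: loki
    access: proxy
    url: http://loki-loki-distributed-query-frontend:3100
```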
For me the problem came from updating the Loki Helm chart from 2.16.0 to 3.4.2, which silently enabled auth, so communication to http://loki:3100 was blocked as well. Coming from single-binary mode, I had to set up the Loki chart like this:
loki:
  commonConfig:
    replication_factor: 1
  auth_enabled: false
  storage:
    type: 'filesystem'
test:
  enabled: false
monitoring:
  selfMonitoring:
    enabled: false
    grafanaAgent:
      installOperator: false
  lokiCanary:
    enabled: false
For me this happens when I enable the query-scheduler. It seems to work initially, then after a while the labels error reappears, and only when I disable the scheduler again by setting it to false does it work. This happens even if I change the datasource URL to the scheduler as described.
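For reference, in the loki-distributed chart the toggle in question looks like this (key path assumed from that chart; verify against your own chart's values.yaml):

```yaml
queryScheduler:
  enabled: false   # leaving the scheduler off avoided the recurring labels error
```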
I ran into this issue when Loki is running in multi-tenant mode (the same no org id error as reported above). I solved it by adding an HTTP header in the datasource definition:

apiVersion: 1
datasources:
  - name: loki
    type: loki
    access: proxy
    url: http://loki-gateway
    jsonData:
      maxLines: 1000
      httpHeaderName1: "X-Scope-OrgID"
    secureJsonData:
      httpHeaderValue1: "YOUR_TENANT_NAME"

Obviously this ties the datasource to a particular tenant, which might be a problem if you need some kind of multi-tenant dashboard.
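If you do need a dashboard spanning tenants, recent Loki versions support cross-tenant query federation: enable it on the querier side and pass pipe-separated tenant IDs in the header. A sketch of the datasource fragment (tenant names hypothetical; check your Loki version's docs for the exact querier flag):

```yaml
jsonData:
  httpHeaderName1: "X-Scope-OrgID"
secureJsonData:
  httpHeaderValue1: "tenant-a|tenant-b"   # requires multi-tenant queries enabled in Loki
```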
I have the same error using the latest Docker containers of Grafana + Loki + Promtail on a Raspberry Pi 4 (8 GB).
Same error here. Unfortunately, the solutions provided are not working.
If you are running Loki and Promtail in Docker, Promtail tries to reach localhost:3100, but that address points at Promtail itself, not Loki. Use the Loki container name instead (name_of_docker_container:3100), and put both services on the same network in the docker-compose file.
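A minimal docker-compose sketch of that setup (image tags and network name illustrative):

```yaml
# Promtail reaches Loki by service name over a shared network,
# instead of localhost:3100 (which would be Promtail's own container).
version: "3"
services:
  loki:
    image: grafana/loki:2.8.2
    ports:
      - "3100:3100"
    networks: [logging]
  promtail:
    image: grafana/promtail:2.8.2
    # Inside Promtail's own config file, the client then points at the
    # service name, not localhost:
    #   clients:
    #     - url: http://loki:3100/loki/api/v1/push
    networks: [logging]
networks:
  logging: {}
```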
I also ran into this bug when following along with this tutorial, which uses the grafana/loki-stack Helm chart. When configuring the datasource in the Grafana dashboard, the expected URL of http://loki-stack/loki/api/v1/push is unable to fetch labels from Loki, with a "Failed to call resource" error.
This is because when Promtail configures its URL, it renders as http://loki-gateway/loki/api/v1/push. However, when the grafana/loki-stack Helm chart is installed, no loki-gateway service is created; meaning no labels, no logs.
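Alternatively, one could point Promtail at the service the chart does create, rather than adding a new service; a sketch of the Helm values override (the exact value path differs between promtail chart versions, so verify against yours):

```yaml
promtail:
  config:
    clients:
      - url: http://loki-stack:3100/loki/api/v1/push   # existing service from loki-stack
```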
➜ kubernetes-infra git:(main) ✗ kubectl get -n loki secrets
NAME TYPE DATA AGE
loki-stack Opaque 1 3d17h
loki-stack-grafana Opaque 3 3d17h
loki-stack-promtail Opaque 1 3d17h
sh.helm.release.v1.loki-stack.v1 helm.sh/release.v1 1 3d17h
A workaround for this bug is to create a loki-gateway service, similar to the contents of service/loki-stack:
➜ kubernetes-infra git:(main) ✗ kubectl get -n loki service/loki-stack -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: loki-stack
    meta.helm.sh/release-namespace: loki
  creationTimestamp: "2023-03-10T20:39:07Z"
  labels:
    app: loki
    app.kubernetes.io/managed-by: Helm
    chart: loki-2.16.0
    heritage: Helm
    release: loki-stack
  name: loki-stack
  namespace: loki
  resourceVersion: "9386"
  uid: 68c3acfd-d393-4289-b2c1-f631817a231a
spec:
  clusterIP: 10.98.206.146
  clusterIPs:
    - 10.98.206.146
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: http-metrics
      port: 3100
      protocol: TCP
      targetPort: http-metrics
  selector:
    app: loki
    release: loki-stack
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
Changing the service name to loki-gateway (and dropping the cluster-generated fields):
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: loki-stack
    meta.helm.sh/release-namespace: loki
  labels:
    app: loki
    app.kubernetes.io/managed-by: Helm
    chart: loki-2.16.0
    heritage: Helm
    release: loki-stack
  name: loki-gateway
  namespace: loki
spec:
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: http-metrics
      port: 80
      protocol: TCP
      targetPort: http-metrics
  selector:
    app: loki
    release: loki-stack
  sessionAffinity: None
  type: ClusterIP
and applying your manifest.
➜ kubernetes-infra git:(main) ✗ kubectl apply -f loki-fakeway.yaml
service/loki-gateway created
➜ kubernetes-infra git:(main) ✗ kubectl get -n loki service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
loki-gateway ClusterIP 10.103.1.160 <none> 80/TCP 4m47s
loki-stack ClusterIP 10.98.206.146 <none> 3100/TCP 3d18h
loki-stack-grafana ClusterIP 10.98.204.74 <none> 80/TCP 3d18h
loki-stack-headless ClusterIP None <none> 3100/TCP 3d18h
loki-stack-memberlist ClusterIP None <none> 7946/TCP 3d18h
Same issue with Docker Swarm. I set:
common:
  instance_interface_names:
    - "lo"
ring:
  instance_interface_names:
    - "lo"
And now it's working fine.
More info here: https://community.grafana.com/t/loki-error-on-port-9095-error-contacting-scheduler/67263
My silly mistake was setting the Loki port in Grafana to 9096 instead of 3100. So I had the same symptoms: no errors, but "Unable to fetch labels from Loki (Failed to call resource)".
I think this should be closed since everything here is user error.
Same issue, loki 2.7.4
Actually, I just ran into the same issue even using the current Docker-based evaluation build.
I can have additional logfiles coming via Promtail into Loki with different tenant_ids, and I can also create different datasources within the main org using different values for X-Scope-OrgID. Switching between the different tenant datasources in Grafana based on X-Scope-OrgID works well.
BUT
My problems start when I create an additional org in Grafana, switch to it, and try to add a new Loki datasource with an existing tenant/X-Scope-OrgID that works for the main org.
That is where I get the Unable to fetch labels from Loki (Failed to call resource), please check the server logs for more details error.
Any hints about the reasons for this behaviour are very much welcome!
auth_enabled: false

server:
  http_listen_port: 3100

common:
  instance_addr: 127.0.0.1
  path_prefix: /data/loki/tmp/loki
  storage:
    filesystem:
      chunks_directory: /data/loki/tmp/loki/chunks
      rules_directory: /data/loki/tmp/loki/rules
  replication_factor: 1
  instance_interface_names:

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index
        period: 24h

storage_config:
  boltdb:
    directory: /data/loki/data/index     # custom boltdb directory
  filesystem:
    directory: /data/loki/data/chunks    # custom filesystem directory

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

ruler:
  alertmanager_url: http://127.0.0.1:9093

This works.
I had a similar problem with Grafana Cloud Loki. Mine was funny when I discovered the mistake I had made: I had put the wrong user ID / username in the basic auth.
Putting this down so someone searching will cross-check what they are doing and not suffer how I suffered, lol.
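For the record, a provisioned Grafana Cloud Loki datasource with basic auth looks roughly like this (URL, user ID, and token are placeholders, not real values):

```yaml
apiVersion: 1
datasources:
  - name: grafanacloud-logs
    type: loki
    access: proxy
    url: https://YOUR_LOKI_ENDPOINT.grafana.net
    basicAuth: true
    basicAuthUser: "123456"           # the numeric Grafana Cloud user/instance ID
    secureJsonData:
      basicAuthPassword: "YOUR_API_TOKEN"
```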
Same here with Loki v2.8.2:

helm upgrade -i loki grafana/loki -n loki --create-namespace \
  --set loki.auth_enabled=false

Grafana v9.5.5:

helm install loki-grafana grafana/grafana -n loki

Then I add the datasource in Grafana.
root@node1:~# kubectl -n loki get pods
NAME READY STATUS RESTARTS AGE
grafana-59896c5bd9-p5fbx 1/1 Running 0 14m
loki-backend-0 1/1 Running 0 24m
loki-backend-1 1/1 Running 0 24m
loki-backend-2 1/1 Running 0 24m
loki-canary-hj6w4 1/1 Running 0 24m
loki-canary-m5kmc 1/1 Running 0 24m
loki-canary-sp2vc 1/1 Running 0 24m
loki-gateway-7656594df4-jj8s2 1/1 Running 0 24m
loki-grafana-agent-operator-d7c684bf9-bqllv 1/1 Running 0 24m
loki-logs-f7t6d 2/2 Running 0 24m
loki-logs-h9c2x 2/2 Running 0 24m
loki-logs-n2s2j 2/2 Running 0 24m
loki-read-676f55dfb5-5ms5t 1/1 Running 0 24m
loki-read-676f55dfb5-7wjxt 1/1 Running 0 24m
loki-read-676f55dfb5-krwgf 1/1 Running 0 24m
loki-write-0 1/1 Running 0 24m
loki-write-1 1/1 Running 0 24m
loki-write-2 1/1 Running 0 24m
promtail-44mcz 1/1 Running 0 22m
promtail-d2mvd 1/1 Running 0 22m
promtail-wtr44 1/1 Running 0 22m
root@node1:~# kubectl -n loki get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana NodePort 10.96.3.219 <none> 80:30303/TCP 14m
loki-backend ClusterIP 10.96.0.108 <none> 3100/TCP,9095/TCP 24m
loki-backend-headless ClusterIP None <none> 3100/TCP,9095/TCP 24m
loki-canary ClusterIP 10.96.3.51 <none> 3500/TCP 24m
loki-gateway ClusterIP 10.96.3.111 <none> 80/TCP 24m
loki-memberlist ClusterIP None <none> 7946/TCP 24m
loki-read ClusterIP 10.96.1.191 <none> 3100/TCP,9095/TCP 24m
loki-read-headless ClusterIP None <none> 3100/TCP,9095/TCP 24m
loki-write ClusterIP 10.96.2.125 <none> 3100/TCP,9095/TCP 24m
loki-write-headless ClusterIP None <none> 3100/TCP,9095/TCP 24m
query-scheduler-discovery ClusterIP None <none> 3100/TCP,9095/TCP 24m
grafana logs
logger=migrator t=2023-07-21T15:54:45.867114757Z level=info msg="Executing migration" id="managed folder permissions alert actions repeated migration"
logger=migrator t=2023-07-21T15:54:45.868053631Z level=info msg="Executing migration" id="admin only folder/dashboard permission"
logger=migrator t=2023-07-21T15:54:45.869086831Z level=info msg="Executing migration" id="add action column to seed_assignment"
logger=migrator t=2023-07-21T15:54:45.871024811Z level=info msg="Executing migration" id="add scope column to seed_assignment"
logger=migrator t=2023-07-21T15:54:45.872894372Z level=info msg="Executing migration" id="remove unique index builtin_role_role_name before nullable update"
logger=migrator t=2023-07-21T15:54:45.874157795Z level=info msg="Executing migration" id="update seed_assignment role_name column to nullable"
logger=migrator t=2023-07-21T15:54:45.890743102Z level=info msg="Executing migration" id="add unique index builtin_role_name back"
logger=migrator t=2023-07-21T15:54:45.891919524Z level=info msg="Executing migration" id="add unique index builtin_role_action_scope"
logger=migrator t=2023-07-21T15:54:45.893327986Z level=info msg="Executing migration" id="add primary key to seed_assigment"
logger=migrator t=2023-07-21T15:54:45.89974253Z level=info msg="Executing migration" id="managed folder permissions alert actions repeated fixed migration"
logger=migrator t=2023-07-21T15:54:45.900684508Z level=info msg="Executing migration" id="migrate external alertmanagers to datsourcse"
logger=migrator t=2023-07-21T15:54:45.901714624Z level=info msg="Executing migration" id="create folder table"
logger=migrator t=2023-07-21T15:54:45.902784711Z level=info msg="Executing migration" id="Add index for parent_uid"
logger=migrator t=2023-07-21T15:54:45.904031771Z level=info msg="Executing migration" id="Add unique index for folder.uid and folder.org_id"
logger=migrator t=2023-07-21T15:54:45.905321192Z level=info msg="Executing migration" id="Update folder title length"
logger=migrator t=2023-07-21T15:54:45.906459355Z level=info msg="Executing migration" id="Add unique index for folder.title and folder.parent_uid"
logger=migrator t=2023-07-21T15:54:45.90821478Z level=info msg="migrations completed" performed=488 skipped=0 duration=727.126577ms
logger=sqlstore t=2023-07-21T15:54:45.921152241Z level=info msg="Created default admin" user=admin
logger=sqlstore t=2023-07-21T15:54:45.921327167Z level=info msg="Created default organization"
logger=secrets t=2023-07-21T15:54:45.922829266Z level=info msg="Envelope encryption state" enabled=true currentprovider=secretKey.v1
logger=local.finder t=2023-07-21T15:54:46.021716348Z level=warn msg="Skipping finding plugins as directory does not exist" path=/usr/share/grafana/plugins-bundled
logger=query_data t=2023-07-21T15:54:46.025529064Z level=info msg="Query Service initialization"
logger=live.push_http t=2023-07-21T15:54:46.035787305Z level=info msg="Live Push Gateway initialization"
logger=infra.usagestats.collector t=2023-07-21T15:54:47.711352397Z level=info msg="registering usage stat providers" usageStatsProvidersLen=2
logger=provisioning.alerting t=2023-07-21T15:54:47.711720587Z level=info msg="starting to provision alerting"
logger=provisioning.alerting t=2023-07-21T15:54:47.71174906Z level=info msg="finished to provision alerting"
logger=modules t=2023-07-21T15:54:47.711895706Z level=warn msg="No modules registered..."
logger=ngalert.state.manager t=2023-07-21T15:54:47.713533148Z level=info msg="Warming state cache for startup"
logger=http.server t=2023-07-21T15:54:47.714580227Z level=info msg="HTTP Server Listen" address=[::]:3000 protocol=http subUrl= socket=
logger=grafanaStorageLogger t=2023-07-21T15:54:47.714652127Z level=info msg="storage starting"
logger=ngalert.state.manager t=2023-07-21T15:54:47.724671377Z level=info msg="State cache has been initialized" states=0 duration=11.136564ms
logger=ngalert.multiorg.alertmanager t=2023-07-21T15:54:47.724771637Z level=info msg="starting MultiOrg Alertmanager"
logger=ticker t=2023-07-21T15:54:47.724813839Z level=info msg=starting first_tick=2023-07-21T15:54:50Z
logger=plugins.update.checker t=2023-07-21T15:54:48.626710034Z level=info msg="Update check succeeded" duration=912.013961ms
logger=context t=2023-07-21T15:55:16.228832772Z level=warn msg="failed to look up session from cookie" error="user token not found"
logger=context userId=0 orgId=0 uname= t=2023-07-21T15:55:16.229122814Z level=info msg="Request Completed" method=GET path=/ status=302 remote_addr=100.64.0.0 time_ms=0 duration=647.001µs size=29 referer= handler=/
logger=context t=2023-07-21T15:55:16.250520846Z level=warn msg="failed to look up session from cookie" error="user token not found"
logger=grafana.update.checker t=2023-07-21T15:55:17.715191991Z level=error msg="Update check failed" error="failed to get latest.json repo from github.com: Get \"https://raw.githubusercontent.com/grafana/grafana/main/latest.json\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" duration=30.000698592s
logger=context t=2023-07-21T15:55:20.6232838Z level=warn msg="failed to look up session from cookie" error="user token not found"
logger=http.server t=2023-07-21T15:55:20.638996344Z level=info msg="Successful Login" User=admin@localhost
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:55:21.035447302Z level=info msg="Request Completed" method=GET path=/api/live/ws status=-1 remote_addr=100.64.0.0 time_ms=1 duration=1.49527ms size=0 referer= handler=/api/live/ws
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:56:11.339910774Z level=error msg="Failed to call resource" error="Get \"http://loki-gateway/loki/api/v1/labels?start=1689954340694000000&end=1689954940694000000\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" traceID=
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:56:11.340013254Z level=error msg="Request Completed" method=GET path=/api/datasources/uid/e4437361-751e-4e51-b70a-9323f65af4a6/resources/labels status=500 remote_addr=100.64.0.0 time_ms=30001 duration=30.00187712s size=51 referer=http://192.168.72.30:30303/connections/your-connections/datasources/edit/e4437361-751e-4e51-b70a-9323f65af4a6 handler=/api/datasources/uid/:uid/resources/*
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:58:07.054165807Z level=error msg="Failed to call resource" error="Get \"http://loki-gateway/loki/api/v1/labels?start=1689954456404000000&end=1689955056404000000\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)" traceID=
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:58:07.054411233Z level=error msg="Request Completed" method=GET path=/api/datasources/uid/e4437361-751e-4e51-b70a-9323f65af4a6/resources/labels status=500 remote_addr=100.64.0.0 time_ms=30001 duration=30.001804298s size=51 referer=http://192.168.72.30:30303/connections/your-connections/datasources/edit/e4437361-751e-4e51-b70a-9323f65af4a6 handler=/api/datasources/uid/:uid/resources/*
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:59:04.641427885Z level=error msg="Failed to call resource" error="Get \"http://loki-gateway/loki/api/v1/labels?start=1689954513992000000&end=1689955113992000000\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)" traceID=
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T15:59:04.641541356Z level=error msg="Request Completed" method=GET path=/api/datasources/uid/e4437361-751e-4e51-b70a-9323f65af4a6/resources/labels status=500 remote_addr=100.64.0.0 time_ms=30001 duration=30.001837116s size=51 referer=http://192.168.72.30:30303/connections/your-connections/datasources/edit/e4437361-751e-4e51-b70a-9323f65af4a6 handler=/api/datasources/uid/:uid/resources/*
logger=cleanup t=2023-07-21T16:04:47.715188145Z level=info msg="Completed cleanup jobs" duration=1.59494ms
logger=plugins.update.checker t=2023-07-21T16:04:51.803641838Z level=info msg="Update check succeeded" duration=3.176766629s
logger=grafana.update.checker t=2023-07-21T16:05:18.073957922Z level=info msg="Update check succeeded" duration=358.262022ms
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T16:05:51.651429086Z level=error msg="Failed to call resource" error="Get \"http://loki-gateway.loki.svc.cluster.local/loki/api/v1/labels?start=1689954920997000000&end=1689955520997000000\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)" traceID=
logger=context userId=1 orgId=1 uname=admin t=2023-07-21T16:05:51.65158969Z level=error msg="Request Completed" method=GET path=/api/datasources/uid/e4437361-751e-4e51-b70a-9323f65af4a6/resources/labels status=500 remote_addr=100.64.0.0 time_ms=30001 duration=30.001583902s size=51 referer=http://192.168.72.30:30303/connections/your-connections/datasources/edit/e4437361-751e-4e51-b70a-9323f65af4a6 handler=/api/datasources/uid/:uid/resources/*
logger=sqlstore.transactions t=2023-07-21T16:09:46.050206076Z level=info msg="Database locked, sleeping then retrying" error="database is locked" retry=0 code="database is locked"
Maybe try this: I fixed the problem by uninstalling Loki with Helm and installing Loki chart version 2.16.0:

helm uninstall loki -n monitoring
helm upgrade --install loki grafana/loki --values loki-values.yaml --version 2.16.0
The "Maybe try" suggestion above did not work for me.
I ran into the same problem. First I was getting ...msg="Failed to call resource" error="no org id\n"... Setting loki: auth_enabled: false stopped Promtail complaining, but I still couldn't connect from Grafana via http://loki-read:3100 (or http://loki-read-headless:3100). Then I saw the diagram at https://grafana.com/docs/loki/latest/getting-started/ where MinIO plays a part; it turned out the reason I was getting "Unable to fetch labels from Loki" in Grafana was that Loki had nowhere to write in the first place, and setting minio: enabled: true solved my last issue. (When quoting these YAML snippets, I am referring to the Loki Helm chart's values.yaml.)
This really worked for me, with --set minio.enabled=true:
helm install loki grafana/loki -n loki --create-namespace \
--set loki.auth_enabled=false \
--set minio.enabled=true \
--set write.replicas=1 \
--set read.replicas=1 \
--set backend.replicas=1 \
--set loki.commonConfig.replication_factor=1
root@ubuntu:~# kubectl -n loki get pods
NAME READY STATUS RESTARTS AGE
loki-backend-0 1/1 Running 0 46m
loki-canary-pdlpn 1/1 Running 0 46m
loki-gateway-5f5c6fdc9f-7qnmb 1/1 Running 0 46m
loki-grafana-agent-operator-d7c684bf9-zsnsk 1/1 Running 0 46m
loki-logs-2rwg2 2/2 Running 0 46m
loki-minio-0 1/1 Running 0 46m
loki-read-79d56b879d-qc57d 1/1 Running 0 46m
loki-write-0 1/1 Running 0 46m
promtail-sp4vj 1/1 Running 0 6m26s
root@ubuntu:~# kubectl -n grafana get pods
NAME READY STATUS RESTARTS AGE
grafana-7bc6dcdcbd-xzv8p 1/1 Running 0 42m
root@ubuntu:~#
Apart from all the other necessary fixes (auth_enabled: false and other new defaults), I also had to set the following when running without TLS, after updating my Loki Helm chart from 2.16.0 to 4.10.0:
gateway:  # expose Loki via the gateway
  ingress:
    enabled: true
    hosts:
      - host: "loki.something.custom"
        paths:
          - path: /
            pathType: ImplementationSpecific
    tls: []  # <-- HERE
Because the defaults of the new Helm charts define gateway.loki.example.com with secret loki-gateway-tls. For some reason this "wrong" setup worked just fine on my test cluster, but started failing only on nonprod, probably having to do with the permissiveness of my ingress or proxy servers.
Describe the bug
Installed Loki using Helm (steps below) and tried to add a Loki datasource in Grafana using the URL http://loki-gateway.loki.svc.cluster.local, and got an error in the UI: Unable to fetch labels from Loki (Failed to call resource), please check the server logs for more details
To Reproduce
Steps to reproduce the behavior:

loki:

helm template loki ./loki-simple-scalable-1.7.4.tgz --include-crds --namespace loki \
  --set monitoring.selfMonitoring.enabled=false \
  --set loki.image.registry=our-org-acr-registry.azurecr.io \
  --set loki.image.repository=docker.io/grafana/loki \
  --set gateway.image.registry=our-org-acr-registry.azurecr.io \
  --set gateway.image.repository=docker.io/nginxinc/nginx-unprivileged \
  | sed -E -b "s/(image: )\"{0,1}(docker.io)(.*)/\1\"our-org-acr-registry.azurecr.io\/\2\3/g" \
  | sed -E -b "s/(image: )(busybox)/\1our-org-acr-registry.azurecr.io\/docker\/\2/g" \
  > loki-simple-scalable-our-org-no-selfmonitoring.yaml

grafana:

helm template loki-grafana grafana-6.32.2.tgz --include-crds \
  --set image.repository=our-org-acr-registry.azurecr.io/docker.io/grafana/grafana \
  --set image.tag=9.0.2 \
  --set testFramework.enabled=false \
  > grafana2.yaml

k apply -f loki-simple-scalable-our-org-no-selfmonitoring.yaml
k apply -f grafana2.yaml
logger=settings t=2022-07-20T05:25:29.679996556Z level=info msg="Starting Grafana" version=9.0.2 commit=0641b5d0cd branch=HEAD compiled=2022-06-28T11:24:40Z
logger=context traceID=00000000000000000000000000000000 userId=1 orgId=1 uname=admin t=2022-07-20T23:31:04.80762513Z level=error msg="Failed to call resource" error="no org id\n" traceID=00000000000000000000000000000000
logger=context traceID=00000000000000000000000000000000 userId=1 orgId=1 uname=admin t=2022-07-20T23:31:04.807691434Z level=error msg="Request Completed" method=GET path=/api/datasources/1/resources/labels status=500 remote_addr=127.0.0.1 time_ms=35 duration=36.00041ms size=83 referer=http://localhost:3000/datasources/edit/LVKYQTg4k traceID=00000000000000000000000000000000

$ k logs loki-read-0 | tail
level=error ts=2022-07-21T00:21:58.060006943Z caller=reporter.go:203 msg="failed to delete corrupted cluster seed file, deleting it" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
level=info ts=2022-07-21T00:22:04.613626607Z caller=table_manager.go:213 msg="syncing tables"
level=info ts=2022-07-21T00:22:04.613687911Z caller=table_manager.go:252 msg="query readiness setup completed" duration=4.101µs distinct_users_len=0
level=info ts=2022-07-21T00:22:04.613720713Z caller=table_manager.go:229 msg="cleaning tables cache"
level=error ts=2022-07-21T00:22:04.620100108Z caller=ruler.go:497 msg="unable to list rules" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
level=error ts=2022-07-21T00:23:04.620057Z caller=ruler.go:497 msg="unable to list rules" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
level=error ts=2022-07-21T00:24:04.620403286Z caller=ruler.go:497 msg="unable to list rules" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
level=error ts=2022-07-21T00:25:04.619929426Z caller=ruler.go:497 msg="unable to list rules" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
level=error ts=2022-07-21T00:25:39.349913343Z caller=reporter.go:203 msg="failed to delete corrupted cluster seed file, deleting it" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
level=error ts=2022-07-21T00:26:04.620099239Z caller=ruler.go:497 msg="unable to list rules" err="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"