darrikmazey opened 2 years ago
Hi,
In a container, it's not possible to get the device name to specify it in the config file... is it possible to use a wildcard, or to use all available interfaces?
That's a complicated problem: based on some internal discussion we had, there are scenarios where users would be badly affected by using an unwanted interface, and scenarios where users would be badly affected by not using a wanted interface, so picking a default that works for everyone is not an easy task.
That said, we are focusing instead on improving the configuration experience. We are adding a way to configure the net interface used by Loki in a single place, inside the common configuration section.
For context, the main problem with the current behavior is that when you configure common.ring.interface_names, you might fall into the trap of thinking Loki will use the defined interface_names everywhere. It will actually only use them for ring communication, not for other components (e.g. the frontend, which isn't a ring). The new configuration solves this: what is defined there is used by all Loki components.
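For illustration, a sketch of what that common-level setting looks like in the Loki config file. The key name (instance_interface_names at the common level) is taken from the Loki configuration reference and may differ between versions, and ens5 is just an example interface, so check the docs for your release:

common:
  # read the instance address from this interface for every component,
  # not just ring members
  instance_interface_names:
    - ens5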
Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.
We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.
Stalebots are also emotionless and cruel and can close issues which are still very relevant.
If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.
We regularly sort for closed issues which have a stale label, sorted by thumbs-up.
We may also:
- Mark issues as revivable if we think it's a valid issue but isn't something we are likely to prioritize in the future (the issue will still remain closed).
- Add a keepalive label to silence the stalebot if the issue is very common/popular/important.
We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, our sincere apologies if you find yourself at the mercy of the stalebot.
Still present, and still searching for a workaround in fully containerized deployments.
I believe it is solved; you only have to use the common section directly instead of the common/ring section:
common:
-  ring:
-    interface_names:
-      - ens5
+  interface_names:
+    - ens5
Yes, this works in a bare-metal setup where it's easy to get the network device name. But in a fully containerized environment like Kubernetes or Swarm, with multiple networks, how would you do it?
I'll check in a few days whether the 2.5 release helps when deploying in containers (Swarm and Kubernetes).
Sooo... any solution for EKS on IPv6 for this issue? I've tried setting ::1 manually,
loki:
  commonConfig:
    ring:
      instance_addr: ::1
but then I'm getting
level=info ts=2024-06-04T11:19:46.231869827Z caller=main.go:120 msg="Starting Loki" version="(version=3.0.0, branch=HEAD, revision=b4f7181c7a)"
level=info ts=2024-06-04T11:19:46.232898195Z caller=server.go:354 msg="server listening on addresses" http=[::]:3100 grpc=[::]:9095
level=info ts=2024-06-04T11:19:46.233751682Z caller=modules.go:730 component=bloomstore msg="no metas cache configured"
level=info ts=2024-06-04T11:19:46.233866322Z caller=blockscache.go:420 component=bloomstore msg="run ttl evict job"
level=info ts=2024-06-04T11:19:46.233914608Z caller=blockscache.go:380 component=bloomstore msg="run lru evict job"
level=info ts=2024-06-04T11:19:46.233925012Z caller=blockscache.go:365 component=bloomstore msg="run metrics collect job"
level=info ts=2024-06-04T11:19:46.243431409Z caller=table_manager.go:273 index-store=tsdb-2024-04-01 msg="query readiness setup completed" duration=3.134µs distinct_users_len=0 distinct_users=
level=info ts=2024-06-04T11:19:46.243486727Z caller=shipper.go:160 index-store=tsdb-2024-04-01 msg="starting index shipper in RO mode"
level=info ts=2024-06-04T11:19:46.244504387Z caller=mapper.go:47 msg="cleaning up mapped rules directory" path=/var/loki/rules-temp
level=info ts=2024-06-04T11:19:46.24931573Z caller=module_service.go:82 msg=starting module=server
level=info ts=2024-06-04T11:19:46.249441996Z caller=module_service.go:82 msg=starting module=analytics
level=info ts=2024-06-04T11:19:46.249450742Z caller=module_service.go:82 msg=starting module=runtime-config
level=info ts=2024-06-04T11:19:46.249718437Z caller=module_service.go:82 msg=starting module=bloom-store
level=info ts=2024-06-04T11:19:46.249785472Z caller=module_service.go:82 msg=starting module=memberlist-kv
level=info ts=2024-06-04T11:19:46.249803202Z caller=module_service.go:82 msg=starting module=index-gateway-ring
level=info ts=2024-06-04T11:19:46.249934334Z caller=module_service.go:82 msg=starting module=compactor
level=info ts=2024-06-04T11:19:46.250116788Z caller=module_service.go:82 msg=starting module=query-scheduler-ring
level=info ts=2024-06-04T11:19:46.250202472Z caller=module_service.go:82 msg=starting module=ring
level=error ts=2024-06-04T11:19:46.250916073Z caller=loki.go:519 msg="module failed" module=ring error="starting module ring: invalid service state: Failed, expected: Running, failure: unable to initialise ring state: Get \"http://localhost:8500/v1/kv/collectors/ring?stale=\": dial tcp [::1]:8500: connect: connection refused"
I remember someone having success on AWS by adding 127.0.0.1 to the instance_addr. Have you tried that?
Regarding the error you're facing now:
level=error ts=2024-06-04T11:19:46.250916073Z caller=loki.go:519 msg="module failed" module=ring error="starting module ring: invalid service state: Failed, expected: Running, failure: unable to initialise ring state: Get \"http://localhost:8500/v1/kv/collectors/ring?stale=\": dial tcp [::1]:8500: connect: connection refused"
It is using port 8500, which seems wrong; memberlist runs on a different port by default. I suggest making sure your components are serving memberlist on the same port you're configuring on the client side.
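For reference, a rough sketch of pointing the ring at memberlist instead, assuming memberlist's usual default gossip port 7946 and an illustrative loki-memberlist service name for peer discovery (both are assumptions to adapt to your deployment):

memberlist:
  bind_port: 7946
  join_members:
    - loki-memberlist:7946   # illustrative name; must resolve to your Loki pods

common:
  ring:
    kvstore:
      store: memberlist      # avoids falling back to a Consul client on port 8500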
@DylanGuedes thanks for the feedback. I eventually realized that Loki can't get the IP from the interface in k8s, because (a) en0 and eth0 are not present in k8s pods, but also (b) even if I provide interfaces that do seem to exist in the pod, it still fails to acquire an IP address. So instance_addr set to ::0 (IPv6 in my case, because the cluster is IPv6) works.
Now, port 8500 seems to be either etcd or consul, so it looks like Loki was trying to use those services for ring configuration. I had to manually set kvstore.store to memberlist.
Eventually, I realized that for my use case I can just run SingleBinary mode with 2 replicas. Here is the config that works for me:
deploymentMode: SingleBinary
loki:
  auth_enabled: false
  commonConfig:
    ring:
      instance_addr: "::0"
      kvstore:
        store: inmemory
    replication_factor: 1
    path_prefix: /var/loki
  server:
    http_listen_port: 3100
    grpc_listen_port: 9095
  schemaConfig:
    configs:
      - from: 2024-04-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  ingester:
    chunk_encoding: snappy
  tracing:
    enabled: true
  # querier:
  #   # Default is 4, if you have enough memory and CPU you can increase, reduce if OOMing
  #   max_concurrent: 4
  storage:
    type: 's3'
    bucketNames:
      chunks: loki-logs-.....
      ruler: loki-logs-.....
      admin: loki-logs-.....
    s3:
      region: us-east-1
gateway:
  enabled: true
  replicas: 1
  resources:
    limits:
      memory: 96Mi
singleBinary:
  replicas: 2
  autoscaling:
    enabled: true
  persistence:
    enabled: true
    size: 4Gi
    storageClass: io2
  limits:
    memory: 256Mi
read:
  replicas: 0
backend:
  replicas: 0
write:
  replicas: 0
chunksCache:
  enabled: true
resultsCache:
  enabled: true
lokiCanary:
  enabled: false
  resources:
    limits:
      memory: 32Mi
test:
  enabled: false
So, to sum up:
- Set loki.commonConfig.ring.instance_addr to a local IP (127.0.0.1, or ::1 for IPv6)
- Set loki.commonConfig.ring.kvstore.store=memberlist

For the memberlist kvstore.store, there is also the Loki member_list config: https://github.com/grafana/loki/blob/main/production/helm/loki/values.yaml#L138-L152
But, as far as I understand, if you don't have many logs (I saw 100GB/day mentioned somewhere – unverified), you can use SingleBinary mode to make things easier.
Also, chunksCache and resultsCache are simple memcached services; however, they request a lot of RAM by default, so that's worth pointing out.
If you're using SingleBinary, you also need to set replication_factor to 1; otherwise Loki will complain that there are not enough replicas.
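A minimal sketch of those values in Helm terms (the address is illustrative, and the nesting follows the config shared earlier in this thread):

loki:
  commonConfig:
    ring:
      instance_addr: "127.0.0.1"   # or "::1" on an IPv6-only cluster
      kvstore:
        store: memberlist
    replication_factor: 1          # needed for SingleBinary, as noted above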
Describe the bug
A minimal config for SSD fails if the network interface is not either eth0 or en0, causing services to bind to lo instead.

To Reproduce
Steps to reproduce the behavior:

Expected behavior
Expected services to bind to a private IP address.

Environment:

Screenshots, Promtail config, or terminal output
On startup, logs showed:

/config showed:

This was rectified by adding the following to configs:
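The config addition itself wasn't captured above; as a hedged sketch only, the kind of change described would look something like this, assuming a non-default interface name such as ens5 (neither the key placement nor the interface name is the reporter's actual config):

common:
  instance_interface_names:
    - ens5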