Open flokli opened 4 months ago
It seems it isn't even possible to enable IPv6 for all components. Setting frontend.instance-interface-names=mycelium simply results in an unrecoverable startup error.
@periklis @shwetaap I definitely don't understand why IPv6 is still broken on the mainstream versions.
I have been using IPv6 every day in a full-stack IPv6 environment (no IPv4 possible) since April 2023. Unfortunately I can't update Loki anymore; I'm stuck using the custom Docker image quay.io/shwetaap/loki:dev. I can share my config if that could help.
auth_enabled: true
common:
  compactor_address: http://loki-loki-distributed-compactor:3100
distributor:
  ring:
    instance_addr: loki-loki-distributed-distributor
    kvstore:
      store: memberlist
frontend:
  compress_responses: true
  log_queries_longer_than: 20s
  tail_proxy_url: http://loki-loki-distributed-querier:3100
frontend_worker:
  frontend_address: loki-loki-distributed-query-frontend-headless:9095
index_gateway:
  mode: simple
ingester:
  lifecycler:
    enable_inet6: true
    ring:
      kvstore:
        store: memberlist
      replication_factor: 2
  wal:
    dir: /var/loki/wal
    enabled: true
    replay_memory_ceiling: 1g
memberlist:
  bind_addr:
  - '::'
  join_members:
  - loki-loki-distributed-memberlist
query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        ttl: 24h
query_scheduler:
  use_scheduler_ring: true
ruler:
  enable_alertmanager_discovery: false
  enable_api: true
  enable_sharding: false
  evaluation_interval: 1m
  poll_interval: 1m
  ring:
    kvstore:
      store: memberlist
  rule_path: /tmp/loki/scratch
  storage:
    local:
      directory: /etc/loki/rules
    type: local
server:
  grpc_listen_address: '[::0]'
  grpc_listen_port: 9095
  grpc_server_max_recv_msg_size: 10485760
  http_listen_address: '[::0]'
  http_listen_port: 3100
  http_server_read_timeout: 300s
  http_server_write_timeout: 300s
  log_level: info
I removed some irrelevant parts of the config, so don't use it exactly as is, but it illustrates the "tricks" needed to make it work with the custom image.
The gRPC binding is explicit (grpc_listen_address: '[::0]'), and so is memberlist.bind_addr: ['::']; otherwise the components can't connect to each other.
Outside of those two settings there is literally nothing specific.
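The two explicit bindings above use different spellings of the same address. As a minimal Python sketch (standard library only; strip_brackets and host_port are hypothetical helper names, not part of Loki), the following checks that '[::0]' and '::' both denote the IPv6 unspecified address, and shows the bracketed form that is conventionally used when an IPv6 literal is joined with a port. Whether Loki itself joins the listen address and port this way is an assumption.

```python
import ipaddress


def strip_brackets(host: str) -> str:
    """Remove the square brackets used around IPv6 literals, if present."""
    if host.startswith("[") and host.endswith("]"):
        return host[1:-1]
    return host


def host_port(host: str, port: int) -> str:
    """Join host and port, bracketing IPv6 literals so the colons stay unambiguous."""
    bare = strip_brackets(host)
    try:
        is_v6 = ipaddress.ip_address(bare).version == 6
    except ValueError:
        is_v6 = False  # a hostname rather than an IP literal
    return f"[{bare}]:{port}" if is_v6 else f"{bare}:{port}"


# Values taken from the config above.
grpc_addr = ipaddress.ip_address(strip_brackets("[::0]"))
gossip_addr = ipaddress.ip_address("::")

# Both spellings parse to the same "any" address.
same_any_address = grpc_addr == gossip_addr and grpc_addr.is_unspecified
listen = host_port("[::0]", 9095)
```

Hostnames such as loki-loki-distributed-querier pass through host_port without brackets, so the same helper would cover both the listen addresses and the service URLs in this config.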
@flokli AFAIU on each machine you run Loki on, you have a single IPv6 interface, right? But you want your all-in-one Loki to use IPv4 (I assume the lo device here?). However, Loki is picking up the IPv6 address instead? Can you post your Loki config as well as a listing of the interfaces on your machine? Btw, this is the FinalAdvertiseAddr function responsible for picking up addresses from your interfaces; maybe you can quickly review whether your machine setup hits an edge case that is missing here.
@gillg I don't quite understand, we have been running Loki in production with IPv6-only as well as IPv4/IPv6 dual stack kubernetes/openshift clusters for over a year now using GA versions (from 2.8 till 3.1.0 IIRC). The three particular config options for us are documented in this test (FYI the HASH_RING_INSTANCE_ADDR env var is always picked up from .status.podIP for simplicity reasons).
Highlighting the particular config settings in this test:
common:
  ring:
    instance_addr: ${HASH_RING_INSTANCE_ADDR}
ingester:
  lifecycler:
    enable_inet6: true
memberlist:
  advertise_addr: ${HASH_RING_INSTANCE_ADDR}
The last setting is an issue that we have when merging the common ring config, but we deemed this is ok because not everybody used memberlist before Loki 3.x (e.g. we have users running the ring over Consul or etcd). With Loki 3.x this might be a small improvement to squeeze the configuration down to two knobs.
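As a rough illustration of what the ${HASH_RING_INSTANCE_ADDR} placeholders resolve to, the Python sketch below substitutes a pod IP into the three settings. The address 2001:db8::10 is a documentation-range placeholder, the YAML nesting is assumed, and string.Template only stands in for Loki's own environment expansion; it is not how Loki implements it.

```python
import ipaddress
from string import Template

# The three settings highlighted above, with the nesting assumed.
CONFIG_TEMPLATE = """\
common:
  ring:
    instance_addr: ${HASH_RING_INSTANCE_ADDR}
ingester:
  lifecycler:
    enable_inet6: true
memberlist:
  advertise_addr: ${HASH_RING_INSTANCE_ADDR}
"""


def render(env: dict) -> str:
    """Expand ${VAR} placeholders; substitute() raises KeyError if a variable is missing."""
    return Template(CONFIG_TEMPLATE).substitute(env)


pod_ip = "2001:db8::10"  # hypothetical value of .status.podIP
ipaddress.IPv6Address(pod_ip)  # raises ValueError if this is not a valid IPv6 literal
rendered = render({"HASH_RING_INSTANCE_ADDR": pod_ip})
print(rendered)
```

Failing loudly on a missing variable is a deliberate choice in this sketch: an empty instance_addr would otherwise fall back to interface discovery, which is the code path this thread is about.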
> @flokli AFAIU on each machine you run Loki on, you have a single IPv6 interface right? But you want your all-in-one Loki to use IPv4 (i assume lo device here?). However Loki is picking up the IPv6 address instead? Can you post your Loki Config as well as a listing of the interfaces on your machine? Btw this is the FinalAdvertiseAddr function responsible for picking up addresses from your interfaces, maybe you can quickly review if your machine setup runs here in any missing edge case.
The machines have multiple network interfaces, with various combinations of v4, v6, or both on them. They're deployed in different locations, with various changing IPs, and that's why I'd like cluster gossip communication to happen via an (encrypted and authenticated) overlay network. This network provides a network interface with only IPv6 addresses on it (in this case, one address in the ULA range and an IPv6 link-local address). All IPs are stable and derived from key material, which makes authenticating the various nodes really only a matter of whitelisting IPs ;-)
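To make the address types mentioned here concrete, this is a small Python sketch that classifies addresses as ULA or link-local and checks a peer against an allowlist. The addresses fd00::1 and fd00::2, the ALLOWED_NODES set, and the function names are all hypothetical; the overlay network's real addresses and its authentication mechanism are not shown in this thread.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")  # unique local addresses (RFC 4193)


def classify(addr: str) -> str:
    """Return a coarse label for an address as seen on an interface."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return "ipv4"
    if ip.is_link_local:
        return "link-local"
    if ip in ULA:
        return "ula"
    return "other-ipv6"


# Hypothetical stable overlay addresses of the cluster nodes.
ALLOWED_NODES = {
    ipaddress.ip_address("fd00::1"),
    ipaddress.ip_address("fd00::2"),
}


def is_allowed(addr: str) -> bool:
    """Compare parsed addresses, so different spellings of one IP still match."""
    return ipaddress.ip_address(addr) in ALLOWED_NODES
```

Comparing parsed address objects rather than strings matters for IPv6, because the same address can be written in several equivalent forms.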
So I wanted to have Loki listen on that overlay network interface. Usually, the instance_interface_names config options are used for this, but without also setting enable_ipv6 for each of these, the discovery code is unable to find an IP to pick. The global default also didn't seem to have an effect. I think we should flip the default for enable_ipv6 on all these.
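The failure mode described here can be sketched in a few lines of Python. This is not Loki's actual discovery code (which is written in Go); first_usable_address is a hypothetical function, and skipping link-local addresses is an assumption made for the illustration. It only shows why an IPv6-only interface yields no address when the IPv6 flag is off.

```python
import ipaddress
from typing import Iterable, Optional


def first_usable_address(addrs: Iterable[str], enable_ipv6: bool) -> Optional[str]:
    """Return the first address permitted by the flag, or None if nothing qualifies."""
    for raw in addrs:
        ip = ipaddress.ip_address(raw)
        if ip.is_link_local:
            continue  # assumption: link-local addresses are not advertised
        if ip.version == 6 and not enable_ipv6:
            continue  # IPv6 candidates are ignored unless explicitly enabled
        return str(ip)
    return None


# Hypothetical addresses on an IPv6-only overlay interface: one ULA, one link-local.
overlay = ["fd00::1", "fe80::1"]

without_flag = first_usable_address(overlay, enable_ipv6=False)
with_flag = first_usable_address(overlay, enable_ipv6=True)
```

With the flag off, every candidate on the overlay interface is filtered out and the caller has nothing to advertise; with it on, the ULA address is picked.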
As written in https://github.com/grafana/loki/issues/13416#issuecomment-2209182343, for the frontend it's not even possible to enable IPv6 currently.
My config currently looks like this (not enabling IPv6 for the frontend, explicitly adding instance_enable_ipv6 in many different places):
{
  "common": {
    "instance_interface_names": [
      "mycelium"
    ],
    "path_prefix": "/var/lib/loki",
    "replication_factor": 1,
    "ring": {
      "instance_enable_ipv6": true,
      "instance_interface_names": [
        "mycelium"
      ],
      "kvstore": {
        "store": "memberlist"
      }
    }
  },
  "compactor": {
    "compactor_ring": {
      "instance_enable_ipv6": true,
      "instance_interface_names": [
        "mycelium"
      ]
    }
  },
  "distributor": {
    "ring": {
      "instance_enable_ipv6": true,
      "instance_interface_names": [
        "mycelium"
      ]
    }
  },
  "frontend": {
    "instance_interface_names": [
      "end0"
    ]
  },
  "index_gateway": {
    "ring": {
      "instance_enable_ipv6": true
    }
  },
  "ingester": {
    "lifecycler": {
      "enable_inet6": true
    }
  },
  "memberlist": {
    "join_members": [
      "node1.<redacted>", # only AAAA record for these
      "node2.<redacted>",
      "node3.<redacted>"
    ]
  },
  "query_scheduler": {
    "scheduler_ring": {
      "instance_enable_ipv6": true
    }
  },
  "ruler": {
    "ring": {
      "instance_enable_ipv6": true
    }
  },
  "schema_config": {
    "configs": [
      {
        "from": "2020-07-01",
        "index": {
          "period": "24h",
          "prefix": "index_"
        },
        "object_store": "s3",
        "schema": "v13",
        "store": "tsdb"
      }
    ]
  },
  "server": {
    "http_listen_port": 3100
  },
  "storage_config": {
    "aws": {
      "access_key_id": "${AWS_ACCESS_KEY_ID}",
      "bucketnames": "logs",
      "endpoint": "https://s3.<redacted>",
      "region": "garage",
      "s3": "s3://logs",
      "secret_access_key": "${AWS_SECRET_ACCESS_KEY}"
    },
    "tsdb_shipper": {
      "active_index_directory": "/var/lib/loki/index",
      "cache_location": "/var/lib/loki/index_cache"
    }
  }
}
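Since the same flag is repeated once per component in the config above, one way to keep it manageable is to generate the overrides instead of writing them by hand. The Python sketch below does that for the seven (section, key, flag) combinations that appear in that config; ipv6_overrides and deep_merge are hypothetical helpers, and the frontend is left out on purpose because, as noted, enabling IPv6 there is not currently possible.

```python
import json

# (section, nested key, flag name), taken from the config above.
IPV6_FLAGS = [
    ("common", "ring", "instance_enable_ipv6"),
    ("compactor", "compactor_ring", "instance_enable_ipv6"),
    ("distributor", "ring", "instance_enable_ipv6"),
    ("index_gateway", "ring", "instance_enable_ipv6"),
    ("ingester", "lifecycler", "enable_inet6"),
    ("query_scheduler", "scheduler_ring", "instance_enable_ipv6"),
    ("ruler", "ring", "instance_enable_ipv6"),
]


def ipv6_overrides() -> dict:
    """Build the nested dict that switches IPv6 on for every listed component."""
    config: dict = {}
    for section, key, flag in IPV6_FLAGS:
        config.setdefault(section, {}).setdefault(key, {})[flag] = True
    return config


def deep_merge(base: dict, extra: dict) -> dict:
    """Return a new dict with extra merged into base, recursing into nested dicts."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# A trimmed-down base config using values from the example above.
base = {
    "common": {"ring": {"kvstore": {"store": "memberlist"}}},
    "server": {"http_listen_port": 3100},
}
merged = deep_merge(base, ipv6_overrides())
print(json.dumps(merged, indent=2))
```

The output is plain JSON, matching the format of the config above, and the base dict is left unmodified so the same overrides can be applied to several deployments.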
Is your feature request related to a problem? Please describe.
I got bitten by Loki disabling IPv6 everywhere by default.
I wanted to do an all-in-one deployment on 3 individual machines, which share an IPv6-only network interface.
I configured it to pick IP addresses from that interface (via common/ring/instance_interface_names), and got greeted by an error message.
After some searching, I discovered that Loki requires you to enable IPv6 explicitly for each and every component, and that it is disabled by default.
https://github.com/grafana/loki/pull/10650 provides a snippet that enables IPv6 in every individual component. It's quite a bit of work, and in 2024 I wouldn't expect something to come with IPv6 disabled by default, especially with the rise of IPv6-only deployments (be it Kubernetes or outside).
Describe the solution you'd like
Flip the default for all _enable_ipv6 / _enable_inet6 options, and require people to explicitly set things to false if they want IPv6 disabled.
Describe alternatives you've considered
More prominently documenting that at least 27 additional lines of config are required to make Loki ring discovery work in an IPv6-only environment.
Additional context https://github.com/grafana/loki/pull/10650
cc @matthewpi @periklis @leahoswald @rfratto