grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

[FeatureRequest] S3 Endpoint and DNS RoundRobin load balancing #9239

Open timansky opened 1 year ago

timansky commented 1 year ago

Hi. I've noticed that Loki does not load-balance requests to S3: it uses only the first IP address returned by DNS resolution. An S3 endpoint can be published under multiple IP addresses for round-robin (RR) load balancing.

Loki: 2.8.0

liguozhong commented 1 year ago

I have deployed MinIO and I know this problem. I am trying to solve it by introducing a MinIO proxy. It would be helpful if you described your environment in more detail.

frittentheke commented 1 year ago

At least Promtail does something like this (it uses the multiple IPs behind a hostname) to reach Loki redundantly, but this is just a side effect of using https://pkg.go.dev/net/http#RoundTripper; see https://github.com/grafana/loki/issues/3301#issuecomment-961751139.

It would be nice if all HTTP clients in the Loki/Promtail stack could balance over multiple endpoints (either multiple A or AAAA responses, or a static list given via config). If you like, check out my issue about adding such functionality: https://github.com/grafana/loki/issues/3301.

timansky commented 1 year ago

@liguozhong Our problem is simple: we use multiple A records to scale the S3 frontend horizontally. (There are many reasons for this: network throughput, connection count, etc.) Even if we deployed a single proxy, that proxy would be a possible point of failure.

PS: There is also no option for DNS lookup, as there is for memcached and the scheduler. The reason is the same: the IP can change. This becomes more critical if the proxy is run as a service.

kind: Endpoints
apiVersion: v1
metadata:
  name: s3-proxy
  namespace: bar
subsets:
  - addresses:
      - ip: xx.xx.xx.01
      - ip: xx.xx.xx.02
      - ip: xx.xx.xx.03
    ports:
      - name: http
        port: 8080
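
The point about addresses changing can be illustrated with a small resolver that looks the hostname up again once a cached answer is older than a configurable interval. This is only a sketch under stated assumptions: `refreshingResolver`, `cacheEntry`, and the `ttl` field are hypothetical names, not Loki code or configuration, and the 30-second interval is arbitrary.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// cacheEntry holds the last successful answer for one hostname.
type cacheEntry struct {
	ips     []string
	fetched time.Time
}

// refreshingResolver (hypothetical, not part of Loki) caches DNS answers
// and resolves again once an answer is older than ttl.
type refreshingResolver struct {
	lookup func(host string) ([]string, error) // e.g. net.LookupHost
	ttl    time.Duration

	mu      sync.Mutex
	entries map[string]cacheEntry
}

// Resolve returns the cached addresses while they are fresh and performs
// a new lookup otherwise. If the new lookup fails, the stale answer is
// returned so that a short DNS outage does not break the client.
func (r *refreshingResolver) Resolve(host string) ([]string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()

	e, ok := r.entries[host]
	if ok && time.Since(e.fetched) < r.ttl {
		return e.ips, nil
	}
	ips, err := r.lookup(host)
	if err != nil {
		if ok {
			return e.ips, nil // keep serving the last known addresses
		}
		return nil, err
	}
	if r.entries == nil {
		r.entries = make(map[string]cacheEntry)
	}
	r.entries[host] = cacheEntry{ips: ips, fetched: time.Now()}
	return ips, nil
}

func main() {
	r := &refreshingResolver{lookup: net.LookupHost, ttl: 30 * time.Second}
	ips, err := r.Resolve("localhost")
	fmt.Println(ips, err)
}
```

Injecting the lookup function keeps the sketch testable without a real DNS server. A client would call `Resolve` before opening each new connection, so a changed Endpoints list is picked up within one interval.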

liguozhong commented 1 year ago

If more Loki users also need a MinIO-proxy-style feature for multiple S3 endpoints, I think I can put together a PR to implement it.

frittentheke commented 1 year ago

> If more Loki users also need a MinIO-proxy-style feature for multiple S3 endpoints, I think I can put together a PR to implement it.

If you look at https://github.com/grafana/loki/blob/46b7d92ecf026263dcf7ab5032ae81249f1ff85c/pkg/storage/chunk/client/aws/s3_storage_client.go#L275, you can see that the HTTP round-tripper is also used for S3.

So whatever solution you implement to add round-robin requests (see the discussion at https://github.com/golang/go/issues/34511) would also apply to Promtail, which could then distribute uploads across multiple Loki instances in the same way (https://github.com/grafana/loki/issues/3301).
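
As a rough illustration of what such a shared solution could look like, here is a minimal sketch of a custom `DialContext` for `http.Transport` that starts each new connection at the next resolved IP and falls back to the remaining ones on failure. The type `roundRobinDialer` and its method `order` are hypothetical names for this example, not existing Loki or Promtail code. It also assumes that spreading new connections is enough: requests that reuse a kept-alive connection stay on the same IP.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"sync/atomic"
	"time"
)

// roundRobinDialer (hypothetical) rotates new connections over all
// addresses that a hostname resolves to.
type roundRobinDialer struct {
	lookup func(ctx context.Context, host string) ([]string, error)
	dialer *net.Dialer
	next   atomic.Uint64 // number of orderings handed out so far
}

// order returns a copy of ips rotated so that successive calls start at
// successive addresses. The input slice is not modified.
func (d *roundRobinDialer) order(ips []string) []string {
	if len(ips) == 0 {
		return nil
	}
	start := int((d.next.Add(1) - 1) % uint64(len(ips)))
	out := make([]string, 0, len(ips))
	out = append(out, ips[start:]...)
	return append(out, ips[:start]...)
}

// DialContext resolves the host, then tries the addresses in rotated
// order until one of them accepts the connection.
func (d *roundRobinDialer) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	ips, err := d.lookup(ctx, host)
	if err != nil {
		return nil, err
	}
	if len(ips) == 0 {
		return nil, fmt.Errorf("no addresses found for %q", host)
	}
	var lastErr error
	for _, ip := range d.order(ips) {
		conn, err := d.dialer.DialContext(ctx, network, net.JoinHostPort(ip, port))
		if err == nil {
			return conn, nil
		}
		lastErr = err // remember the failure and try the next address
	}
	return nil, lastErr
}

func main() {
	d := &roundRobinDialer{
		lookup: net.DefaultResolver.LookupHost,
		dialer: &net.Dialer{Timeout: 5 * time.Second},
	}
	// The transport still performs TLS against the hostname from the URL;
	// only the choice of IP for the TCP connection changes.
	client := &http.Client{Transport: &http.Transport{DialContext: d.DialContext}}
	_ = client

	ips := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	fmt.Println(d.order(ips))
	fmt.Println(d.order(ips))
}
```

Because the rotation happens at dial time, how evenly load is spread depends on how often connections are opened. Limiting idle connections or their lifetime on the transport would make the distribution more even, at the cost of more handshakes.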

liguozhong commented 1 year ago

Have you tried this project? https://github.com/minio/sidekick

timansky commented 1 year ago

It is the same solution as the one mentioned in the first reply, and it has the same problems: it is a single point of failure, and this variant only works with a Kubernetes installation.

MarcinPrzadlo commented 1 year ago

I am using the promtail-2.8.3 RPM (build date 2023-07-21) to ship logs to a DNS round-robin hostname that returns two A records. When one host is down, Promtail does not reconnect to the second IP.

Is this RoundTripper behavior implemented yet, or is there a timeline for it?

I would rather balance with DNS than with a load-balancer device, so as not to overload the load balancer with the heavy log volume.

frittentheke commented 1 year ago

> I am using the promtail-2.8.3 RPM (build date 2023-07-21) to ship logs to a DNS round-robin hostname that returns two A records. When one host is down, Promtail does not reconnect to the second IP.
>
> Is this RoundTripper behavior implemented yet, or is there a timeline for it?

@MarcinPrzadlo You mean the ability to use multiple Loki targets (IPs, hostnames) and load-balance requests across them? That's what I suggested in https://github.com/grafana/loki/issues/3301, but it's not there yet. You already found that Promtail can indeed use a single hostname with multiple IPs (multiple A or AAAA responses), which then serve as fail-over targets. But they are not all used, and they have to be behind a single hostname.

> I would rather balance with DNS than with a load-balancer device, so as not to overload the load balancer with the heavy log volume.

Either DNS (as in multiple IPs for one hostname) or the ability to configure multiple targets in Promtail. The latter would be much more flexible, as one would not have to create and manage a DNS record with multiple IPs but could simply use multiple distinct targets (IPs, hostnames). The missing feature is, as I suggested in https://github.com/grafana/loki/issues/3301, load balancing / distribution of requests among those targets.

timansky commented 10 months ago

keepalive