sourcehawk opened this issue 1 year ago
This is my current configuration for S3. You can (and should) use the same bucket whether you are using boltdb-shipper or TSDB.
```
storage: {
  bucketNames: {
    chunks: bucket.bucketName,
    ruler: bucket.bucketName,
    admin: bucket.bucketName,
  },
  type: "s3",
  s3: {
    region: "us-east-1",
  },
},
```
`insecure: false` just forces HTTPS, as far as I know.
To make sure your objects are encrypted, enable encryption on the bucket itself.
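For example, if the bucket is on AWS, default SSE-S3 encryption can be enabled on the bucket with the AWS CLI; a minimal sketch (the bucket name is a placeholder):

```shell
# Enable default SSE-S3 (AES256) encryption on the bucket; Loki then needs
# no client-side encryption settings, objects are encrypted at rest.
aws s3api put-bucket-encryption \
  --bucket "<loki-bucketname>" \
  --server-side-encryption-configuration \
  '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
```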
Why is it that when I ask about the values.yaml file, no one actually replies with an example values.yaml?
Anyhow, after two days of going bald trying to figure this out, here is how others can use the actual Helm chart the way it was set up to be used. I found the value for `storage.s3.s3` by pure luck.
```yaml
loki:
  # Set to false if you don't intend to set the 'X-Scope-OrgID' header for the Loki datasource
  auth_enabled: true
  # You don't need this if you are not using the auth header
  querier:
    # I am using one tenant ID per cluster.
    # This config is for cluster-level logging.
    # Another centralized logging cluster will use all tenant IDs in the header, with multi-tenant queries enabled.
    # See https://grafana.com/docs/loki/latest/operations/multi-tenancy/
    multi_tenant_queries_enabled: false
  # S3 storage configuration
  storage:
    type: s3
    bucketNames:
      chunks: "<loki-bucketname>"
      ruler: "<loki-bucketname>"
      admin: "<loki-bucketname>"
    s3:
      # Not really sure why this is required at all, but it only works if this is provided
      s3: "s3://<loki-bucketname>"
      # Endpoints: https://docs.aws.amazon.com/general/latest/gr/s3.html
      endpoint: "s3.eu-west-1.amazonaws.com"
      # AWS region of the bucket
      region: "eu-west-1"
      # Secret access key for a user with bucket permissions
      secretAccessKey: ""
      # Access key ID for a user with bucket permissions
      accessKeyId: ""
      # Set to false; there are multiple reports of problems when it is set to true
      s3ForcePathStyle: false
      # We want to use the HTTPS-only endpoint, so set to false
      insecure: false
      # Our bucket is SSE-S3 encrypted
      sse_encryption: true
      # Timeouts etc.
      http_config: {}
```
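With `auth_enabled: true`, every request to the gateway must carry the tenant header. A minimal sketch of how to build such a request against Loki's labels API, the same header the Grafana datasource sends (the gateway URL and tenant ID are placeholders for your own values):

```python
import urllib.request

def loki_labels_request(base_url: str, tenant_id: str) -> urllib.request.Request:
    # Loki's labels API; with auth enabled, requests without the
    # X-Scope-OrgID header are rejected by the gateway.
    req = urllib.request.Request(f"{base_url}/loki/api/v1/labels")
    req.add_header("X-Scope-OrgID", tenant_id)
    return req

req = loki_labels_request("http://loki-gateway", "<THE_TENANT_ID_FOR_CLUSTER>")
# urllib.request.urlopen(req) would perform the query from inside the cluster.
print(req.full_url)  # http://loki-gateway/loki/api/v1/labels
```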
After getting the configuration to actually work, the Loki datasource is available via http://loki-gateway. I added it to Grafana with a ConfigMap that looks like the one below. Note that this requires the Grafana sidecar for datasources to be enabled, with the label `grafana_datasource: "1"`.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasource-loki
  labels:
    grafana_datasource: "1"
data:
  datasource.yaml: |-
    apiVersion: 1
    datasources:
      - name: loki
        type: loki
        url: "http://loki-gateway"
        # You only need this if you have auth enabled
        jsonData:
          httpHeaderName1: 'X-Scope-OrgID'
        secureJsonData:
          httpHeaderValue1: '<THE_TENANT_ID_FOR_CLUSTER>'
```
As for grafana/promtail, this values.yaml configuration did the trick:
```yaml
config:
  clients:
    - url: http://loki-gateway/loki/api/v1/push
      # You only need this if you have auth enabled
      tenant_id: "<THE_TENANT_ID_FOR_CLUSTER>"
```
I've come to realize that, for some reason, I am only able to get it to work using the non-HTTPS-only endpoint, i.e. s3.eu-west-1.amazonaws.com, instead of s3-accesspoint.eu-west-1.amazonaws.com.
Agreed that the S3 documentation is lacking, and there appears to be a conflict between the `s3` and `endpoint` S3 properties. Following the advice of https://github.com/grafana/loki/issues/7279#issuecomment-1291488556 was a big part of fixing my problem.
I agree that the documentation can be confusing. My understanding is that the chunks S3 bucket stores the indexes and data.
Currently, I don't know what ruler or admin is used for, or why we need to set them; I haven't seen anything written into those buckets yet on my deployment.
Also, when I deploy the grafana/loki Helm chart, it sets the default number of backend, read, and write pod replicas to 3, and it creates additional PVCs for the backend and write pods. Why do I need those additional EBS volumes?
+1
@sourcehawk's example was very useful, thank you! Here is what I ended up with as a series of Ansible tasks deploying the Helm charts (the values.yaml content can be found under `values:` for each task; it should be fairly obvious). Notes:
- I am using three tenants: `system`, `aa`, and `bb`. The `aa` and `bb` tenants are parsed from the pod's namespace (i.e. `aa-foo-project`, `bb-bar-project`) and set as a `tenant` label, and that `tenant` value is also sent as-is from promtail to loki (via `X-Scope-OrgID`).

I hope this helps some people!
```yaml
---
- name: Add grafana chart repo
  kubernetes.core.helm_repository:
    name: grafana
    repo_url: "https://grafana.github.io/helm-charts"

- name: Deploy loki
  vars:
    resources:
      limits:
        cpu: 500m
        memory: 256Mi
      requests:
        cpu: 100m
        memory: 64Mi
  kubernetes.core.helm:
    name: loki
    release_namespace: grafana-system
    create_namespace: yes
    chart_ref: grafana/loki
    # https://github.com/grafana/loki/blob/main/production/helm/loki/values.yaml
    # https://github.com/grafana/loki/issues/8524
    values:
      loki:
        storage:
          bucketNames:
            chunks: "{{ b2_loki_bucket }}"
            ruler: "{{ b2_loki_bucket }}"
            admin: "{{ b2_loki_bucket }}"
          type: s3
          s3:
            s3: "s3://{{ b2_loki_bucket }}"
            endpoint: s3.eu-central-003.backblazeb2.com
            region: eu-central-003
            accessKeyId: "{{ b2_loki_key_id }}"
            secretAccessKey: "{{ b2_loki_secret }}"
            s3ForcePathStyle: true
            insecure: false
        # https://github.com/grafana/loki/issues/4613#issuecomment-1855200860
        limits_config:
          split_queries_by_interval: "1h"
        query_scheduler:
          max_outstanding_requests_per_tenant: 2048
      gateway:
        nginxConfig:
          resolver: coredns.kube-system.svc.cluster.local
        resources: "{{ resources }}"
      monitoring:
        lokiCanary:
          resources:
            limits:
              cpu: 10m
              memory: 64Mi
            requests: {}
        selfMonitoring:
          grafanaAgent:
            resources: "{{ resources }}"
      backend:
        persistence:
          size: "{{ 10 * disk_multiplier|float }}Gi"
        resources: "{{ resources }}"
      read:
        persistence:
          size: "{{ 10 * disk_multiplier|float }}Gi"
        resources: "{{ resources }}"
      singleBinary:
        persistence:
          size: "{{ 10 * disk_multiplier|float }}Gi"
        resources: "{{ resources }}"
      write:
        persistence:
          size: "{{ 10 * disk_multiplier|float }}Gi"
        resources: "{{ resources }}"

- name: Deploy promtail (collects logs and sends them to loki)
  kubernetes.core.helm:
    name: promtail
    release_namespace: grafana-system
    create_namespace: yes
    chart_ref: grafana/promtail
    # https://github.com/grafana/helm-charts/blob/main/charts/promtail/values.yaml
    # https://github.com/grafana/loki/issues/8524
    values:
      config:
        clients:
          - url: http://loki-gateway/loki/api/v1/push
            tenant_id: system
        snippets:
          # https://grafana.com/blog/2022/03/21/how-relabeling-in-prometheus-works/#the-base-relabel_config-block
          extraRelabelConfigs:
            - regex: "(?P<tenant>bb|aa)-.*"
              source_labels:
                - "namespace"
              action: replace
              target_label: tenant
          pipelineStages:
            - cri: {}
            - match:
                selector: '{tenant=~".+"}'
                stages:
                  - tenant:
                      label: "tenant"
                  - output:
                      source: message
      resources:
        limits:
          cpu: 500m
          memory: 256Mi
        requests:
          cpu: 50m
          memory: 128Mi

# TODO: ADD CHART VERSIONS TO ALL OF THESE
- name: Deploy grafana
  kubernetes.core.helm:
    name: grafana
    release_namespace: grafana-system
    create_namespace: yes
    chart_ref: grafana/grafana
    # https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml
    # https://github.com/grafana/loki/issues/8524
    values:
      adminUser: "{{ grafana_admin_username }}"
      adminPassword: "{{ grafana_admin_password }}"
      resources:
        limits:
          cpu: 1
          memory: 1Gi
        requests:
          cpu: 100m
          memory: 128Mi
      persistence:
        size: "{{ 10 * disk_multiplier|float }}Gi"
      datasources:
        datasources.yaml:
          apiVersion: 1
          datasources:
            - name: "Logs (System)"
              type: loki
              url: "http://loki-gateway.grafana-system.svc.cluster.local"
              jsonData:
                httpHeaderName1: 'X-Scope-OrgID'
              secureJsonData:
                httpHeaderValue1: 'system'
            - name: "Logs (AA)"
              type: loki
              url: "http://loki-gateway.grafana-system.svc.cluster.local"
              jsonData:
                httpHeaderName1: 'X-Scope-OrgID'
              secureJsonData:
                httpHeaderValue1: 'aa'
            - name: "Logs (BB)"
              type: loki
              url: "http://loki-gateway.grafana-system.svc.cluster.local"
              jsonData:
                httpHeaderName1: 'X-Scope-OrgID'
              secureJsonData:
                httpHeaderValue1: 'bb'
```
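The namespace-to-tenant mapping done by the `extraRelabelConfigs` rule can be sanity-checked on its own. A minimal sketch in Python using the same regex (the namespaces are hypothetical examples):

```python
import re

# Same pattern as the extraRelabelConfigs regex; Prometheus-style
# relabeling anchors the regex, so fullmatch() mirrors that behaviour.
pattern = re.compile(r"(?P<tenant>bb|aa)-.*")

def tenant_for(namespace: str):
    m = pattern.fullmatch(namespace)
    return m.group("tenant") if m else None

print(tenant_for("aa-foo-project"))  # aa
print(tenant_for("bb-bar-project"))  # bb
print(tenant_for("kube-system"))     # None -> no tenant label is set
```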
```yaml
loki:
  storage:
    bucketNames:
      chunks: "{{ b2_loki_bucket }}"
      ruler: "{{ b2_loki_bucket }}"
      admin: "{{ b2_loki_bucket }}"
    type: s3
    s3:
      s3: "s3://{{ b2_loki_bucket }}"
      endpoint: s3.eu-central-003.backblazeb2.com
      region: eu-central-003
      accessKeyId: "{{ b2_loki_key_id }}"
      secretAccessKey: "{{ b2_loki_secret }}"
      s3ForcePathStyle: true
      insecure: false
```
This example was very helpful. Our initial setup was missing the `s3: "s3://{{ b2_loki_bucket }}"` entry. Writes to the bucket worked fine, but we were unable to query logs older than 2-3 hours. The endpoint had been configured as `endpoint: s3.amazonaws.com/xxxx-loki`, which put everything under an extra path instead of the root of the S3 bucket. Also watch out for snake vs camel casing in the Helm chart; it is quite confusing.
The format that works now is (loki v2.9.6, chart v5.47.2):
```yaml
loki:
  storage:
    bucketNames:
      chunks: "{{ loki_bucket }}"
      ruler: "{{ loki_bucket }}"
      admin: "{{ loki_bucket }}"
    type: s3
    s3:
      s3: "s3://xxxxx-loki"
      endpoint: s3.us-east-1.amazonaws.com
      region: us-east-1
      accessKeyId: "{{ loki_key_id }}"
      secretAccessKey: "{{ loki_secret }}"
      s3ForcePathStyle: true
      insecure: false
```
@pbsladek to configure the extra path, did you set it for the `s3` value, or the `endpoint`? Or both?
Update: To set up a bucket subpath (path prefix), I had to append it to the `s3.endpoint` field. In my case the `s3.s3` value did nothing, whether it was enabled or not. So my config looked like:
```yaml
loki:
  storage:
    bucketNames:
      chunks: "{{ loki_bucket }}"
      ruler: "{{ loki_bucket }}"
      admin: "{{ loki_bucket }}"
    type: s3
    s3:
      # s3: "s3://xxxxx-loki"
      endpoint: s3.us-east-1.amazonaws.com/path/prefix
      region: us-east-1
      accessKeyId: "{{ loki_key_id }}"
      secretAccessKey: "{{ loki_secret }}"
      s3ForcePathStyle: true
      insecure: false
```
However, when I did so, I got an error about an invalid `s3.region`. It was telling me that it needs to be `us-east-1`, even though it already WAS set to that region.
I also tried debugging the S3 endpoint / bucket / path combination with s3-proxy. s3-proxy lets you set up webhooks that are triggered on requests, and the webhook event contains info about which S3 path was requested. I set it up to send the webhook event to an http-echo service so the S3 path data was printed to the logs.
However, it seems s3-proxy doesn't handle multipart uploads, or there was some error along those lines.
So in my case I didn't manage to set the bucket subpath, and instead decided to just use another bucket.
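The extra-path behaviour reported above (an endpoint like `s3.amazonaws.com/xxxx-loki` landing objects under a prefix instead of the bucket root) comes down to how the endpoint value splits into host and path. A small illustrative sketch, not Loki's actual parsing code:

```python
from urllib.parse import urlparse

# Anything after the host in the endpoint value becomes a path prefix,
# so objects end up under /prefix/... instead of the bucket root.
def split_endpoint(endpoint: str):
    parsed = urlparse(f"//{endpoint}")  # chart value carries no scheme
    return parsed.hostname, parsed.path.lstrip("/")

print(split_endpoint("s3.us-east-1.amazonaws.com"))              # host only, no prefix
print(split_endpoint("s3.us-east-1.amazonaws.com/path/prefix"))  # host plus 'path/prefix'
```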
Hello folks,
Welcome to the party 😄
I am using the latest Helm chart with `deploymentMode: SimpleScalable`; they have added some information.
To me, this reads as: two buckets (chunks and ruler) are required, and the admin bucket is only needed if you are using Enterprise (GEL).
```yaml
storage:
  # Loki requires a bucket for chunks and the ruler. GEL requires a third bucket for the admin API.
  # Please provide these values if you are using object storage.
  # bucketNames:
  #   chunks: FIXME
  #   ruler: FIXME
  #   admin: FIXME
```
I think the second section is semi-optional: you have to configure it if, for example, the region cannot be resolved automatically or your S3 endpoint is custom; otherwise, don't touch it.
In my case, I am using a VPC S3 endpoint with network routing, so I don't touch the endpoint and have configured only the region.
However, I still have some doubts about whether this is correct, and about which S3 bucket I am supposed to provide in `s3.s3: ""`, since we are already using 3 buckets...
```yaml
s3:
  s3: null
  endpoint: null
  region: null
  secretAccessKey: null
  accessKeyId: null
  signatureVersion: null
  s3ForcePathStyle: false
  insecure: false
```
I am a bit lost regarding storage configuration for the latest grafana/loki Helm chart. The default values may make it look like a simple task, but that is not what I am experiencing. Just trying to configure an S3 bucket using the default values has been a guessing game...

- What is `storage.s3.s3`, and what should its value be?
- What does `insecure: false` actually control?
- Loki creates the `loki_cluster_seed.json` file, but the writer is erroring with InvalidAccessPoint; what configuration is missing? (`err="failed to flush chunks: store put chunk: InvalidAccessPoint: The specified accesspoint name or account is not valid"`)

What should be done?