Open kbespalov opened 2 years ago
Hey @kbespalov can you share the configuration you're using for Vector?
@spencergilbert
Chart Version: 0.13.1
Here are the chart values, nothing exceptional: sinks/sources/transforms, almost default.
affinity: {}
args:
  - --config-dir
  - /etc/vector/
autoscaling:
  customMetric: {}
  enabled: false
  maxReplicas: 10
  minReplicas: 1
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: null
command: []
commonLabels: {}
containerPorts: []
customConfig:
  sinks:
    cloudwatch:
      # ... omitted
      type: aws_cloudwatch_logs
  sources:
    billing_golang_logs:
      # ... omitted
      type: kubernetes_logs
    billing_python_logs:
      # ... omitted
      type: kubernetes_logs
  transforms:
    formatted_golang_logs:
      # ... omitted
    formatted_python_logs:
      # ... omitted
      type: remap
    merged_python_logs:
      # ... omitted
      type: reduce
dataDir: ""
dnsConfig: {}
dnsPolicy: ClusterFirst
env: []
existingConfigMaps: []
extraVolumeMounts: []
extraVolumes: []
fullnameOverride: ""
haproxy:
  affinity: {}
  autoscaling:
    customMetric: {}
    enabled: false
    maxReplicas: 10
    minReplicas: 1
    targetCPUUtilizationPercentage: 80
    targetMemoryUtilizationPercentage: null
  containerPorts: []
  customConfig: ""
  enabled: false
  existingConfigMap: ""
  extraVolumeMounts: []
  extraVolumes: []
  image:
    pullPolicy: IfNotPresent
    pullSecrets: []
    repository: haproxytech/haproxy-alpine
    tag: 2.4.17
  initContainers: []
  livenessProbe:
    tcpSocket:
      port: 1024
  nodeSelector: {}
  podAnnotations: {}
  podLabels: {}
  podPriorityClassName: ""
  podSecurityContext: {}
  readinessProbe:
    tcpSocket:
      port: 1024
  replicas: 1
  resources: {}
  rollWorkload: true
  securityContext: {}
  service:
    annotations: {}
    ports: []
    topologyKeys: []
    type: ClusterIP
  serviceAccount:
    annotations: {}
    automountToken: true
    create: true
    name: null
  strategy: {}
  terminationGracePeriodSeconds: 60
  tolerations: []
image:
  pullPolicy: IfNotPresent
  pullSecrets: []
  repository: timberio/vector
  sha: ""
  tag: ""
ingress:
  annotations: {}
  className: ""
  enabled: false
  hosts: []
  tls: []
initContainers: []
livenessProbe: {}
nameOverride: ""
nodeSelector: {}
persistence:
  accessModes:
    - ReadWriteOnce
  enabled: false
  existingClaim: ""
  finalizers:
    - kubernetes.io/pvc-protection
  hostPath:
    path: /var/lib/vector
  selectors: {}
  size: 10Gi
podAnnotations: {}
podDisruptionBudget:
  enabled: false
  maxUnavailable: null
  minAvailable: 1
podLabels: {}
podManagementPolicy: OrderedReady
podMonitor:
  additionalLabels: {}
  enabled: false
  honorLabels: false
  honorTimestamps: true
  jobLabel: app.kubernetes.io/name
  metricRelabelings: []
  path: /metrics
  port: prom-exporter
  relabelings: []
podPriorityClassName: ""
podSecurityContext: {}
psp:
  create: false
  enabled: true
rbac:
  create: true
readinessProbe: {}
replicas: 1
resources: {}
role: Agent
rollWorkload: true
secrets:
  generic: {}
securityContext: {}
service:
  annotations: {}
  enabled: false
  ports: []
  topologyKeys: []
  type: ClusterIP
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: .....
  automountToken: true
  create: true
  name: vector-logging-agent
terminationGracePeriodSeconds: 60
tolerations: []
updateStrategy: {}
Maybe something is wrong with the pod security policy?
Interesting - I thought I had included some logic around this - but in your customConfig you can set a different data_dir. Vector defaults to /var/lib/vector (which, as you pointed out, is RO in the mounts). The default configs set this to /vector-data-dir to avoid the RO filesystem.
I think we can improve this by being more specific in what we mount from /var/lib, but updating the data_dir key in your config should unblock you for now.
Adding this parameter explicitly to the settings solved my problem. Thank you!
# values yaml
customConfig:
  data_dir: "/vector-data-dir"
I'm going to keep this open, as I think we can improve our defaults to be more specific and cause fewer issues 😄
Will just uncommenting data_dir: "/vector-data-dir" in customConfig fix this issue, or do you want to plan for further improvement?
I'd like to tighten up the mount config so we don't mount the entirety of /var/lib to access kubernetes logs, which would resolve the default data dir issue. Today choosing a path that isn't under /var/lib avoids the issue.
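To make the idea of narrower mounts concrete, here is a rough sketch of what a pod spec fragment could look like. It is not the chart's actual template: the volume names and the exact set of host paths are assumptions chosen for illustration. The log directories are mounted read-only, while the data directory (using the persistence.hostPath.path value of /var/lib/vector from the values above) gets its own writable mount, so the default data_dir would no longer land on a read-only filesystem.

```yaml
# Illustrative pod spec fragment, NOT the chart's real template.
# Volume names and the choice of host paths are assumptions.
containers:
  - name: vector
    volumeMounts:
      # Read-only access to the pod log files.
      - name: var-log
        mountPath: /var/log
        readOnly: true
      # Read-only access to Docker's container log files, which the
      # symlinks under /var/log/pods can point into on Docker-based nodes.
      - name: docker-containers
        mountPath: /var/lib/docker/containers
        readOnly: true
      # Writable directory for checkpoints, mounted on its own instead of
      # inheriting a read-only mount of all of /var/lib.
      - name: data
        mountPath: /var/lib/vector
volumes:
  - name: var-log
    hostPath:
      path: /var/log
  - name: docker-containers
    hostPath:
      path: /var/lib/docker/containers
  - name: data
    hostPath:
      path: /var/lib/vector
```

Whether these particular paths are sufficient depends on the container runtime in use, so treat the list as a starting point rather than a tested configuration.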
This is still an issue :(
Vector agent cannot start due to Read-only file system error
So, vector-agent is trying to create directories to store checkpoint JSON files, but there is no way to do that because the /var/lib volume is mounted as RO: