Open nashant opened 1 year ago
Which version of the image is running? If you are not already using the v1.0.0
image, please try upgrading to it, as it generally has better error reporting.
Yup, already using v1.0.0
Any thoughts? Can I increase logging somehow?
You might be able to add a second entry to the piraeus-op-node-monitoring configmap:
data:
  log.toml: |
    [[log]]
    level = "debug"
Experiencing the same and even with the trace level, there is no more info:
We noticed that sometimes reactor still discards some log messages, especially the log message when creating the Prometheus socket. I assume in both cases this is related to reactor for some reason not being able to bind to [::]:9942. As for why, I cannot tell. Perhaps some strange network configuration, such as disabling IPv6 at the kernel level?
Correct, we have IPv6 disabled in the kernel: GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/root rhgb quiet ipv6.disable=1"
I am also using ipFamilyPolicy: SingleStack when creating LinstorCluster:
- target:
    kind: Service
    name: linstor-controller
  patch: |-
    apiVersion: v1
    kind: Service
    metadata:
      name: linstor-controller
    spec:
      ipFamilyPolicy: SingleStack
Then you probably need to patch the reactor config to use 0.0.0.0:9942 instead of the anylocal [::] address. Binding to [::] normally works fine even on IPv4-only systems, but directly disabling the IPv6 subsystem tends to break it.
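To check whether a node is affected, a small probe along these lines may help. This is only a sketch, not part of drbd-reactor or the operator: `can_bind` and `pick_listen_host` are hypothetical helper names, and the probe binds to an ephemeral port (port 0) rather than 9942 so it does not collide with a running reactor. It assumes that on a kernel booted with ipv6.disable=1, creating or binding an IPv6 socket fails with an OSError.

```python
import socket


def can_bind(family: int, host: str, port: int = 0) -> bool:
    """Return True if a TCP socket of the given family can be bound to host:port.

    Port 0 asks the kernel for any free port, so the probe does not
    conflict with a service already listening on 9942.
    """
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True
    except OSError:
        # Covers both "address family not supported" at socket creation
        # and bind failures.
        return False


def pick_listen_host() -> str:
    """Suggest the anylocal host: [::] if IPv6 is usable, else 0.0.0.0."""
    if can_bind(socket.AF_INET6, "::"):
        return "[::]"
    return "0.0.0.0"


if __name__ == "__main__":
    print("[::] bindable:", can_bind(socket.AF_INET6, "::"))
    print("0.0.0.0 bindable:", can_bind(socket.AF_INET, "0.0.0.0"))
    print("suggested address:", pick_listen_host() + ":9942")
```

If the first line reports False while the second reports True, the node most likely needs the 0.0.0.0:9942 address in the reactor config.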
Can you please point me to where I can change this globally via the CRD? When I edit the nodename-reactor-config config map directly and delete the pod for that nodename to restart drbd-reactor, the address in the config map gets reset to [::], though the log.toml part with trace stays untouched (I also added that manually).
No drbd-reactor container crash since I have set the Prometheus address to 0.0.0.0:9942:
apiVersion: piraeus.io/v1
kind: LinstorSatelliteConfiguration
metadata:
  name: drbd-reactor-trace
spec:
  patches:
    - target:
        kind: ConfigMap
        name: reactor-config
      patch: |
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: reactor-config
          labels:
            app.kubernetes.io/component: linstor-satellite
        data:
          prometheus.toml: |
            [[prometheus]]
            enums = true
            address = "0.0.0.0:9942"
            [[log]]
            level = "trace"
          log.toml: |
            [[log]]
            level = "trace"
One of my satellite pods is crashlooping. It's because of the drbd-reactor pod, which is giving only the following logs:
Any idea?