Open alexandreLamarre opened 1 year ago
Probably related to https://github.com/rancher/opni/issues/1183
Can we please collect logs from the manager pod?
There are no relevant error logs from the manager pod.
All resources are created, even the secret containing the internal opni user, but the setup is erroneous somehow, because that user fails to authorize.
The only other thing I can think of is to try enabling persistent storage. This may be related to that as I tend to always use it.
Yeah, I tried installing it with persistent storage as well, and the issue persisted.
Attaching the bootstrap logs (opni-bootstrap-0_opensearch.log). Looks like there's relevant info there.
Found some TLS handshake errors in the cert-manager webhook:
I0404 19:13:28.384262 1 logs.go:59] http: TLS handshake error from 10.0.17.245:58904: read tcp 10.0.17.241:10250->10.0.17.245:58904: read: connection reset by peer
I0404 19:13:28.434664 1 logs.go:59] http: TLS handshake error from 10.0.17.245:58916: EOF
I0404 19:13:28.444426 1 logs.go:59] http: TLS handshake error from 10.0.17.245:58932: read tcp 10.0.17.241:10250->10.0.17.245:58932: read: connection reset by peer
I0404 19:13:28.462671 1 logs.go:59] http: TLS handshake error from 10.0.17.245:58946: read tcp 10.0.17.241:10250->10.0.17.245:58946: read: connection reset by peer
These originate from one of the dashboards pods.
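For anyone triaging similar webhook logs, here is a minimal sketch of tallying the remote peers behind the handshake errors, so the address can be compared against pod IPs (for example from `kubectl get pods -A -o wide`). The function name `handshake_peers` and the `sample` string are illustrative only; the regex assumes the `TLS handshake error from IP:PORT:` shape seen in the lines above.

```python
import re
from collections import Counter

# Matches the remote peer in lines such as:
#   http: TLS handshake error from 10.0.17.245:58916: EOF
# Assumption: IPv4 peers only, as in the logs quoted in this issue.
HANDSHAKE_RE = re.compile(
    r"TLS handshake error from (\d{1,3}(?:\.\d{1,3}){3}):(\d+):"
)


def handshake_peers(log_text: str) -> Counter:
    """Count TLS handshake errors per remote IP address."""
    return Counter(m.group(1) for m in HANDSHAKE_RE.finditer(log_text))


# Two of the lines quoted above, used as sample input.
sample = (
    "I0404 19:13:28.434664 1 logs.go:59] http: TLS handshake error "
    "from 10.0.17.245:58916: EOF\n"
    "I0404 19:13:28.444426 1 logs.go:59] http: TLS handshake error "
    "from 10.0.17.245:58932: read tcp 10.0.17.241:10250->10.0.17.245:58932: "
    "read: connection reset by peer\n"
)

if __name__ == "__main__":
    for ip, count in handshake_peers(sample).most_common():
        print(ip, count)
```

Only the first address after `from` is counted, so the `read tcp ...->...` suffix on some lines does not inflate the tally.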
The cert-manager pod is also complaining about its resource management:
I0404 23:06:01.731192 1 controller.go:162] cert-manager/certificates-readiness "msg"="re-queuing item due to optimistic locking on resource" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"opensearch-opni-internalopni\": the object has been modified; please apply your changes to the latest version and try again" "key"="opni/opensearch-opni-internalopni"
I0404 23:06:01.731537 1 conditions.go:192] Found status change for Certificate "opensearch-opni-internalopni" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2023-04-04 23:06:01.731528865 +0000 UTC m=+14652.919641752
I0404 23:06:01.741083 1 controller.go:162] cert-manager/certificates-readiness "msg"="re-queuing item due to optimistic locking on resource" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"opensearch-opni-admin\": the object has been modified; please apply your changes to the latest version and try again" "key"="opni/opensearch-opni-admin"
I0404 23:06:01.741344 1 conditions.go:192] Found status change for Certificate "opensearch-opni-admin" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2023-04-04 23:06:01.741336694 +0000 UTC m=+14652.929449578
I0404 23:06:01.770689 1 controller.go:162] cert-manager/certificates-readiness "msg"="re-queuing item due to optimistic locking on resource" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"opensearch-opni-internalopni\": the object has been modified; please apply your changes to the latest version and try again" "key"="opni/opensearch-opni-internalopni"
I0404 23:06:01.770995 1 conditions.go:192] Found status change for Certificate "opensearch-opni-internalopni" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2023-04-04 23:06:01.770987664 +0000 UTC m=+14652.959100552
I0404 23:06:01.806932 1 controller.go:162] cert-manager/certificates-key-manager "msg"="re-queuing item due to optimistic locking on resource" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"opensearch-opni-opni-indexer\": the object has been modified; please apply your changes to the latest version and try again" "key"="opni/opensearch-opni-opni-indexer"
[23:25:09] INFO tracing feature enabled: false {"controller": "multiclusterrolebinding", "controllerGroup": "logging.opni.io", "controllerKind": "MulticlusterRoleBinding", "MulticlusterRoleBinding": {"name":"opni","namespace":"opni"}, "namespace": "opni", "name": "opni", "reconcileID": "6a6a8b28-6046-4ebc-8772-a1dde8656a40"}
[23:25:09] ERROR Reconciler error {"controller": "multiclusterrolebinding", "controllerGroup": "logging.opni.io", "controllerKind": "MulticlusterRoleBinding", "MulticlusterRoleBinding": {"name":"opni","namespace":"opni"}, "namespace": "opni", "name": "opni", "reconcileID": "6a6a8b28-6046-4ebc-8772-a1dde8656a40", "error": "failed to create rolesmapping: [500 Internal Server Error] {\"status\":\"INTERNAL_SERVER_ERROR\",\"message\":\"Security index not initialized\"}"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:235
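The reconciler error above wraps an OpenSearch response body inside the error string. As a rough sketch for log triage (not a description of how opni handles this), the embedded JSON can be pulled out to tell "Security index not initialized" apart from other 500 responses. The helper names `opensearch_error_message` and `is_security_index_uninitialized` are hypothetical, and the code assumes the body is the first `{...}` in the string and runs to its end, as it does in the log line above.

```python
import json
from typing import Optional


def opensearch_error_message(reconcile_error: str) -> Optional[str]:
    """Return the "message" field of the JSON body embedded in the error.

    Assumption: the JSON body starts at the first '{' and runs to the end
    of the string, as in the reconciler error quoted in this issue.
    """
    start = reconcile_error.find("{")
    if start == -1:
        return None
    try:
        body = json.loads(reconcile_error[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(body, dict):
        return None
    return body.get("message")


def is_security_index_uninitialized(reconcile_error: str) -> bool:
    """True when the error reports an uninitialized security index."""
    return opensearch_error_message(reconcile_error) == "Security index not initialized"


# The "error" value from the reconciler log line, with JSON escaping removed.
err = (
    "failed to create rolesmapping: [500 Internal Server Error] "
    '{"status":"INTERNAL_SERVER_ERROR","message":"Security index not initialized"}'
)

if __name__ == "__main__":
    print(is_security_index_uninitialized(err))
```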
MulticlusterRoleBinding "opni" (logging.opni.io):
status:
  state: Error
spec:
  opensearch:
    name: opni
    namespace: opni
  opensearchConfig:
    indexRetention: 7d
Observed behaviour
The opni-dashboards pods are unable to access OpenSearch due to an invalid auth setup.
Expected behaviour
The opni-dashboards pods are able to access OpenSearch.
Steps to reproduce
Attempted fix
The logging.opni.io CRDs were deleted.