Closed Gupta-Amrit closed 3 years ago
Hello @Gupta-Amrit, which version of the operator do you use? Which command/YAML did you use to deploy the data-index? Could you provide logs from the operator?
@Gupta-Amrit Can you please also share the steps you followed to deploy the travel-agency example?
Below are the images used by each operator:
| Operator | Image |
| --- | --- |
| infinispan | jboss/infinispan-operator:1.1.1.Final |
| kogito | quay.io/kiegroup/kogito-cloud-operator:0.16.0 |
| strimzi | strimzi/operator:0.17.0 |
| data-index | quay.io/kiegroup/kogito-data-index:0.16 |
Steps followed
$ cd ~/kogito-cloud-operator-0.16.0
$ export NAMESPACE=kogito
$ kubectl create ns $NAMESPACE
$ ./hack/install.sh
$ cd ~/kogito-cloud-operator-0.16.0
$ ./examples/kubernetes/travel-agency/deploy.sh
NOTE: The deploy.sh script does not work properly because EXAMPLES_DIR is set to an incorrect path. I have raised PR #608 to fix it. Also, the kogito-travels and kogito-visas images are no longer public and require authorization. That is fine, since the source code is available and I can build the images from it, but I wanted to let you know.
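One common way to make a path variable such as EXAMPLES_DIR independent of the caller's working directory is to derive it from the script's own location. The following is a minimal sketch of that general Bash technique; it is not the change made in PR #608, and `script_dir` is a hypothetical helper name introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: script_dir is a hypothetical helper, not part of deploy.sh.

# Print the absolute directory that contains the given script path,
# regardless of the directory the caller is currently in.
script_dir() {
  # Run cd in a subshell so the caller's working directory is untouched.
  (cd "$(dirname "$1")" && pwd)
}

# Inside a script, this would typically be used as:
#   EXAMPLES_DIR="$(script_dir "${BASH_SOURCE[0]}")"
```

With this approach the script resolves the same directory whether it is started from the repository root or from anywhere else.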
BTW, I am following this documentation: https://docs.jboss.org/kogito/release/latest/html_single/#proc-kogito-deploying-on-kubernetes_kogito-deploying-on-openshift
Kogito operator logs
{"level":"info","T":"2020-10-09T07:54:56.036Z","logger":"kogitodataindex_controller","msg":"Injecting Data Index URL into KogitoRuntime services in the namespace 'kogito'","Request.Namespace":"kogito","Request.Name":"data-index"}
{"level":"info","T":"2020-10-09T07:54:56.099Z","logger":"services_definition","msg":"Updating status for Kogito Service data-index"}
{"level":"info","T":"2020-10-09T07:54:56.117Z","logger":"services_definition","msg":"Successfully reconciled Kogito Service data-index"}
{"level":"info","T":"2020-10-09T07:54:56.117Z","logger":"kogitodataindex_controller","msg":"Reconciling KogitoDataIndex","Request.Namespace":"kogito","Request.Name":"data-index"}
{"level":"info","T":"2020-10-09T07:54:56.117Z","logger":"kogitodataindex_controller","msg":"Injecting Data Index URL into KogitoRuntime services in the namespace 'kogito'","Request.Namespace":"kogito","Request.Name":"data-index"}
{"level":"info","T":"2020-10-09T07:54:56.175Z","logger":"services_definition","msg":"Updating status for Kogito Service data-index"}
{"level":"info","T":"2020-10-09T07:54:56.187Z","logger":"services_definition","msg":"Successfully reconciled Kogito Service data-index"}
infinispan operator logs
{"level":"info","ts":1602254293.1302795,"logger":"controller_infinispan","msg":"Reconciling Infinispan","Request.Namespace":"kogito","Request.Name":"kogito-infinispan"}
{"level":"info","ts":1602254293.130337,"logger":"controller_infinispan","msg":"Configuring the StatefulSet","Request.Namespace":"kogito","Request.Name":"kogito-infinispan"}
{"level":"error","ts":1602254293.138723,"logger":"controller_infinispan","msg":"failed to update Infinispan Spec","Request.Namespace":"kogito","Request.Name":"kogito-infinispan","error":"Infinispan.infinispan.org \"kogito-infinispan\" is invalid: [spec.expose.type: Required value,spec.logging.categories: Invalid value: \"null\": spec.logging.categories in body must be of type object: \"null\", spec.service.sites.locations: Invalid value: \"null\": spec.service.sites.locations in body must be of type array: \"null\", spec.service.sites.local.expose.type: Required value]","stacktrace":"github.com/go-logr/zapr.(zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128\ngithub.com/infinispan/infinispan-operator/pkg/controller/infinispan.updateSecurity\n\t/infinispan-operator/pkg/controller/infinispan/infinispan_controller.go:625\ngithub.com/infinispan/infinispan-operator/pkg/controller/infinispan.(ReconcileInfinispan).Reconcile\n\t/infinispan-operator/pkg/controller/infinispan/infinispan_controller.go:217\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.8/pkg/internal/controller/controller.go:213\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.8/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20181126123746-eddba98df674/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20181126123746-eddba98df674/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20181126123746-eddba98df674/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1602254293.1388295,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"infinispan-controller","request":"kogito/kogito-infinispan","error":"Infinispan.infinispan.org \"kogito-infinispan\" is invalid: [spec.expose.type: Required value, spec.logging.categories: Invalid value: \"null\": spec.logging.categories in body must be of type object: \"null\", spec.service.sites.locations: Invalid value: \"null\": spec.service.sites.locations in body must be of type array: \"null\", spec.service.sites.local.expose.type: Required value]","stacktrace":"github.com/go-logr/zapr.(zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.8/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.8/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20181126123746-eddba98df674/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20181126123746-eddba98df674/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20181126123746-eddba98df674/pkg/util/wait/wait.go:88"}
@Gupta-Amrit Thanks for sharing all the info. The travel agency example was unfortunately not updated to reflect infrastructure changes done for 0.16. I have reported https://issues.redhat.com/browse/KOGITO-3579 to adjust the example.
The current scripts should be compatible with version 0.15 of the operator. If you would like to try it without waiting for the fix, please install the 0.15 operator (either from the 0.15.x branch of this repository or by using OLM).
Hey @sutaakar, I tried using the 0.15.0 operator but I am getting the error below.
{"level":"error","ts":1602338878.2972353,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"kogitodataindex-controller","request":"kogito/data-index","error":"resource name may not be empty","stacktrace":"github.com/go-logr/zapr.(zapLogger).Error\n\t/home/jenkins/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).reconcileHandler\n\t/home/jenkins/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/home/jenkins/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).worker\n\t/home/jenkins/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/home/jenkins/go/pkg/mod/k8s.io/apimachinery@v0.18.3/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/home/jenkins/go/pkg/mod/k8s.io/apimachinery@v0.18.3/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/jenkins/go/pkg/mod/k8s.io/apimachinery@v0.18.3/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/home/jenkins/go/pkg/mod/k8s.io/apimachinery@v0.18.3/pkg/util/wait/wait.go:90"}
Could you please help me? Am I missing a step, or should I also try with OLM?
@Gupta-Amrit When I applied the changes from https://github.com/kiegroup/kogito-cloud-operator/pull/609 (and fixed the Data Index image version), Data Index was successfully deployed using Kogito operator 0.16. Please try it too.
@sutaakar I am still getting the same error. Could you please share the version and steps you followed to deploy?
@Gupta-Amrit Here are the steps I used, running against KOPS 1.17:
Check out operator branch 0.16.x (https://github.com/kiegroup/kogito-cloud-operator/tree/0.16.x)
Cherry-pick both commits from https://github.com/kiegroup/kogito-cloud-operator/pull/609 (will be backported to 0.16.x once merged)
Cherry-pick your fix https://github.com/kiegroup/kogito-cloud-operator/commit/528cb4443ebfea3c1d6185ecd76dc4c3e745428b
export NAMESPACE=kogito
kubectl create ns $NAMESPACE
./hack/install.sh
kubectl apply -f deploy/operator.yaml -n "${NAMESPACE}"
(Needed as for some reason the operator deployment creation is not triggered on my machine, to be investigated)
kubectl get pods -n kogito
(Waiting until operator is up and running)
./examples/kubernetes/travel-agency/deploy.sh
Wait until Kafka is fully initialized, then delete the Data Index pod so that it respins (this is a bug: Data Index is created even when the infrastructure is not fully initialized yet; it will be fixed soon)
The result is a running Data index pod.
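The final "wait for Kafka, then respin Data Index" step could be scripted roughly as follows. This is only a sketch under assumptions that are not confirmed in this thread: `respin_data_index` is a hypothetical helper, the Kafka resource name and the pod label selector passed to it are guesses that should be checked with `kubectl get kafka` and `kubectl get pods --show-labels`, and the Data Index pod is assumed to be recreated by its controller after deletion.

```shell
#!/usr/bin/env bash
# Sketch only: respin_data_index is a hypothetical helper.
# Arguments: <namespace> <kafka-resource-name> <data-index-pod-label-selector>
respin_data_index() {
  local namespace="$1" kafka_name="$2" selector="$3"

  # Block until the Strimzi Kafka custom resource reports condition Ready,
  # or give up after 10 minutes without touching the Data Index pod.
  kubectl wait "kafka/${kafka_name}" --for=condition=Ready \
    --timeout=600s -n "${namespace}" || return 1

  # Delete the Data Index pod(s); the owning controller is expected to
  # start a fresh pod that connects to the now-initialized infrastructure.
  kubectl delete pod -l "${selector}" -n "${namespace}"
}

# Example call (resource name and label are assumptions, verify them first):
#   respin_data_index kogito kogito-kafka app=data-index
```

Returning early when the wait fails avoids deleting the pod while Kafka is still starting, which would only trigger another crash loop.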
@sutaakar Thank you for sharing the steps. I followed them and data-index is running fine. I also found one more issue. I am not sure whether you have encountered it, but I am using an AKS cluster (Kubernetes 1.17.1) and the kogito-infinispan StatefulSet pod was throwing the error below:
13:17:00,346 FATAL (main) [org.infinispan.SERVER] ISPN080028: Infinispan Server failed to start java.util.concurrent.ExecutionException: org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheConfigurationException: ISPN000512: Cannot acquire lock '/opt/infinispan/server/data/global.lck' for persistent global state
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
    at org.infinispan.server.Bootstrap.runInternal(Bootstrap.java:140)
    at org.infinispan.server.tool.Main.run(Main.java:98)
    at org.infinispan.server.Bootstrap.main(Bootstrap.java:40)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.infinispan.server.loader.Loader.run(Loader.java:76)
    at org.infinispan.server.loader.Loader.main(Loader.java:39)
Caused by: org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheConfigurationException: ISPN000512: Cannot acquire lock '/opt/infinispan/server/data/global.lck' for persistent global state
    at org.infinispan.manager.DefaultCacheManager.internalStart(DefaultCacheManager.java:751)
    at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:717)
    at org.infinispan.server.SecurityActions.lambda$startCacheManager$1(SecurityActions.java:64)
    at org.infinispan.security.Security.doPrivileged(Security.java:46)
    at org.infinispan.server.SecurityActions.doPrivileged(SecurityActions.java:36)
    at org.infinispan.server.SecurityActions.startCacheManager(SecurityActions.java:67)
    at org.infinispan.server.Server.run(Server.java:332)
    ... 9 more
Caused by: org.infinispan.commons.CacheConfigurationException: ISPN000512: Cannot acquire lock '/opt/infinispan/server/data/global.lck' for persistent global state
    at org.infinispan.globalstate.impl.GlobalStateManagerImpl.acquireGlobalLock(GlobalStateManagerImpl.java:87)
    at org.infinispan.globalstate.impl.GlobalStateManagerImpl.start(GlobalStateManagerImpl.java:64)
    at org.infinispan.globalstate.impl.CorePackageImpl$1.start(CorePackageImpl.java:34)
    at org.infinispan.globalstate.impl.CorePackageImpl$1.start(CorePackageImpl.java:27)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.invokeStart(BasicComponentRegistryImpl.java:592)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:583)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:552)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.access$700(BasicComponentRegistryImpl.java:30)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:775)
    at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:341)
    at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:237)
    at org.infinispan.manager.DefaultCacheManager.internalStart(DefaultCacheManager.java:746)
    ... 15 more
Caused by: java.io.FileNotFoundException: /opt/infinispan/server/data/global.lck (Permission denied)
    at java.base/java.io.FileOutputStream.open0(Native Method)
    at java.base/java.io.FileOutputStream.open(FileOutputStream.java:298)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:237)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:187)
    at org.infinispan.globalstate.impl.GlobalStateManagerImpl.acquireGlobalLock(GlobalStateManagerImpl.java:81)
    ... 26 more
13:17:00,351 INFO (Thread-0) [org.infinispan.SERVER] ISPN080002: Infinispan Server stopping
13:17:00,356 INFO (Thread-0) [org.infinispan.SERVER] ISPN080003: Infinispan Server stopped
I cross-checked the persistent volume and persistent volume claim; both were created and attached to the pod. On further debugging I found that the container was running with user ID 185, and this user did not have write access to the volume. I changed it to the root user and it worked. Since running as root is not recommended for production, I am looking for some other workaround or solution. Did you also face the same issue?
Hi @Gupta-Amrit
This seems to be a problem related to the Infinispan Operator itself. I haven't seen this error before, but it might be related to this one: https://github.com/infinispan/infinispan-operator/issues/392
Try adding:
- name: MAKE_DATADIR_WRITABLE
  value: "true"
in the env attribute of the Infinispan operator's yaml file.
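If the operator is already deployed, the same environment variable could in principle be set without editing the YAML by hand, using `kubectl set env`. The sketch below assumes the operator runs as a Deployment; `enable_writable_datadir` is a hypothetical helper and the deployment name in the example call is an assumption to be checked with `kubectl get deployments`. Whether this setting resolves the permission error above has not been confirmed in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: enable_writable_datadir is a hypothetical helper.
# Arguments: <namespace> <infinispan-operator-deployment-name>
enable_writable_datadir() {
  local namespace="$1" operator_deployment="$2"

  # Add (or update) the variable on every container of the operator
  # Deployment; changing the pod template triggers a rollout.
  kubectl set env "deployment/${operator_deployment}" \
    MAKE_DATADIR_WRITABLE=true -n "${namespace}"
}

# Example call (the deployment name is an assumption, verify it first):
#   enable_writable_datadir kogito infinispan-operator
```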
Hi @Gupta-Amrit have you managed to have this working?
Hi @ricardozanini, I have not had a chance to try your suggested change for the Infinispan operator; for now I am using the runAsUser (root) attribute in the security context to work around it, and data-index is working now. Since all the required changes are already merged, the only thing still needed to fix the data-index CrashLoopBackOff is to wait until Kafka is fully initialized and then delete the data-index pod so that it respins. By the way, I also faced some other issues with the Kogito management console and Infinispan on Kubernetes. I am still working on them and will try to create a PR once done. Feel free to close this ticket.
Thank you for the support
Looks like the problem is that no TLS is used. See https://github.com/kiegroup/kogito-examples/issues/454
Oh the TLS issue has been fixed in #646
For Infinispan 10.x that would work without TLS.
I am trying to deploy the travel-agency example on Kubernetes, but the data-index pod goes into CrashLoopBackOff and throws the error below:
2020-10-09 12:10:58,277 WARN [io.qua.config] (main) Unrecognized configuration key "quarkus.kafka.bootstrap-servers" was provided; it will be ignored; verify that the dependency extension for this configuration is set or you did not make a typo
2020-10-09 12:10:59,407 INFO [org.inf.HOTROD] (main) ISPN004021: Infinispan version: Infinispan 'Corona Extra' 11.0.3.Final
2020-10-09 12:10:59,439 ERROR [org.inf.HOTROD] (HotRod-client-async-pool-1-1) ISPN004007: Exception encountered. Retry 10 out of 10: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:11222
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
    at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
    at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:672)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:649)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:529)
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Thank you for your work.
Best regards,