eu-nebulous / optimiser-utility-evaluator

Optimiser utility evaluator crashes permanently if it receives an invalid answer from SAL #3

Open jchmielewska opened 2 months ago

jchmielewska commented 2 months ago

During the initial deployment of an app, the optimiser utility evaluator issues a request to SAL. If SAL responds incorrectly to that request (e.g. with a 500 error), the utility evaluator crashes and no longer responds to any other app deployment event.

This issue was reported by Robert and moved from Launchpad 2064650 https://bugs.launchpad.net/nebulous/+bug/2064650

robert-sanfeliu commented 1 month ago

Another instance of the error:

2024-07-19T09:09:39.849Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : Component: /spec/components/0
2024-07-19T09:09:39.849Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : meaning: replicas
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : Adding new variable: spec_components_0_properties_traits_0_properties_replicas for component: /spec/components/0
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : Component: /spec/components/0
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : meaning: memory
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : Component: /spec/components/0
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.u.converter.VariableConverter        : meaning: cpu
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.utilityevaluator.model.Application   : Application message successfully parsed
2024-07-19T09:09:39.850Z  INFO 1 --- [pool-3-thread-2] e.n.u.c.e.DslGenericMessageHandler       : Application e2a0cc58-4947-4755-8437-3510d9e93b09, with name Ubiwhere Test 4, has variables: {/spec/components/0=[eu.nebulous.utilityevaluator.model.VariableDTO@ba67ec3, eu.nebulous.utilityevaluator.model.VariableDTO@5f7b4598, eu.nebulous.utilityevaluator.model.VariableDTO@20c46407]}
2024-07-19T09:09:39.956Z  INFO 1 --- [pool-3-thread-2] e.n.u.c.s.NodeCandidatesFetchingService  : Received a response
2024-07-19T09:09:39.956Z ERROR 1 --- [pool-3-thread-2] e.n.u.c.s.NodeCandidatesFetchingService  : exn-middleware-sal request failed with error code '500' and message ' Request processing failed; nested exception is javax.security.auth.login.LoginException: Incorrect Username&#47;Password</p><p><b>Description</b> The server encountered an unexpected condition that prevented it from fulfilling the request.'
2024-07-19T09:09:39.956Z  INFO 1 --- [pool-3-thread-2] e.n.u.c.s.NodeCandidatesFetchingService  : Correctly return SAL response for component /spec/components/0, payload:
2024-07-19T09:09:39.957Z ERROR 1 --- [pool-3-thread-2] eu.nebulouscloud.exn.core.Manager        : General exception for topic://eu.nebulouscloud.ui.dsl.generic.>

java.lang.NullPointerException: null
        at java.base/java.util.Objects.requireNonNull(Unknown Source) ~[na:na]
        at java.base/java.util.Arrays$ArrayList.<init>(Unknown Source) ~[na:na]
        at java.base/java.util.Arrays.asList(Unknown Source) ~[na:na]
        at eu.nebulous.utilityevaluator.communication.sal.NodeCandidatesFetchingService.getNodeCandidatesViaMiddleware(NodeCandidatesFetchingService.java:57) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.UtilityEvaluatorController.createInitialCostPerformanceIndicators(UtilityEvaluatorController.java:43) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.communication.exnconnector.DslGenericMessageHandler.onMessage(DslGenericMessageHandler.java:49) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.communication.exnconnector.DslGenericMessageHandler$onMessage.call(Unknown Source) ~[na:na]
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47) ~[groovy-3.0.10.jar!/:3.0.10]
        at eu.nebulouscloud.exn.core.Manager$3$onMessage.call(Unknown Source) ~[na:na]
        at eu.nebulouscloud.exn.core.Consumer.onDelivery(Consumer.groovy:77) ~[exn-connector-java-1.0-SNAPSHOT.jar!/:na]
        at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) ~[na:na]
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[na:na]
        at java.base/java.lang.reflect.Method.invoke(Unknown Source) ~[na:na]
        at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:193) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.call(PogoMetaMethodSite.java:73) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148) ~[groovy-3.0.10.jar!/:3.0.10]
        at eu.nebulouscloud.exn.core.Manager$4.run(Manager.groovy:174) ~[exn-connector-java-1.0-SNAPSHOT.jar!/:na]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
        at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
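
The top frames of this trace (`Objects.requireNonNull` inside `Arrays.asList`) are what the JDK produces when `Arrays.asList` is handed a null array, which would fit a failed SAL request that carries no payload. The project source is not quoted in this thread, so the following is only a minimal sketch of a null guard, assuming the payload arrives as a possibly-null array; `SalResponseGuard` and `toListOrEmpty` are hypothetical names, not part of the code base.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not taken from optimiser-utility-evaluator. */
final class SalResponseGuard {

    /**
     * Converts a possibly-null payload array into a list.
     * Arrays.asList throws NullPointerException for a null array,
     * so a missing payload is mapped to an empty list instead.
     */
    static <T> List<T> toListOrEmpty(T[] payload) {
        if (payload == null) {
            return Collections.emptyList();
        }
        return Arrays.asList(payload);
    }

    public static void main(String[] args) {
        String[] failed = null;              // e.g. SAL answered with a 500
        String[] ok = {"node-a", "node-b"};  // illustrative values only

        System.out.println(toListOrEmpty(failed)); // prints []
        System.out.println(toListOrEmpty(ok));     // prints [node-a, node-b]
    }
}
```

Returning an empty list is one possible choice; the caller could equally skip the component or report the failure upstream once the error code has been logged.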

robert-sanfeliu commented 1 month ago

@martaroz: consider adding a liveness probe so that Kubernetes can restart the pod if needed. https://www.baeldung.com/ops/kubernetes-livenessprobe-readinessprobe

robert-sanfeliu commented 1 month ago

It also crashes permanently if it receives an invalid app deployment message.

 Received by custom handler ui_generic_message => eu.nebulouscloud.ui.dsl.generic.> = {when=2024-07-30T12:24:57.703192397Z}
2024-07-30T12:24:58.116Z  INFO 1 --- [pool-3-thread-2] e.n.u.c.e.DslGenericMessageHandler       : Body={when=2024-07-30T12:24:57.703192397Z}
2024-07-30T12:24:58.116Z ERROR 1 --- [pool-3-thread-2] e.n.utilityevaluator.model.Application   : Could not read app creation message

java.lang.IllegalArgumentException: argument "content" is null
        at com.fasterxml.jackson.databind.ObjectMapper._assertNotNull(ObjectMapper.java:5054) ~[jackson-databind-2.16.1.jar!/:2.16.1]
        at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:3276) ~[jackson-databind-2.16.1.jar!/:2.16.1]
        at eu.nebulous.utilityevaluator.external.KubevelaAnalyzer.parseKubevela(KubevelaAnalyzer.java:195) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.model.Application.<init>(Application.java:53) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.communication.exnconnector.DslGenericMessageHandler.onMessage(DslGenericMessageHandler.java:45) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.communication.exnconnector.DslGenericMessageHandler$onMessage.call(Unknown Source) ~[na:na]
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47) ~[groovy-3.0.10.jar!/:3.0.10]
        at eu.nebulouscloud.exn.core.Manager$3$onMessage.call(Unknown Source) ~[na:na]
        at eu.nebulouscloud.exn.core.Consumer.onDelivery(Consumer.groovy:77) ~[exn-connector-java-1.0-SNAPSHOT.jar!/:na]
        at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) ~[na:na]
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[na:na]
        at java.base/java.lang.reflect.Method.invoke(Unknown Source) ~[na:na]
        at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:193) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.call(PogoMetaMethodSite.java:73) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148) ~[groovy-3.0.10.jar!/:3.0.10]
        at eu.nebulouscloud.exn.core.Manager$4.run(Manager.groovy:174) ~[exn-connector-java-1.0-SNAPSHOT.jar!/:na]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
        at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]

2024-07-30T12:24:58.120Z ERROR 1 --- [pool-3-thread-2] eu.nebulouscloud.exn.core.Manager        : General exception for topic://eu.nebulouscloud.ui.dsl.generic.>

java.lang.NullPointerException: Cannot invoke "Object.toString()" because the return value of "eu.nebulous.utilityevaluator.model.Application.getVariables()" is null
        at eu.nebulous.utilityevaluator.communication.exnconnector.DslGenericMessageHandler.onMessage(DslGenericMessageHandler.java:46) ~[classes!/:0.0.1-SNAPSHOT]
        at eu.nebulous.utilityevaluator.communication.exnconnector.DslGenericMessageHandler$onMessage.call(Unknown Source) ~[na:na]
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47) ~[groovy-3.0.10.jar!/:3.0.10]
        at eu.nebulouscloud.exn.core.Manager$3$onMessage.call(Unknown Source) ~[na:na]
        at eu.nebulouscloud.exn.core.Consumer.onDelivery(Consumer.groovy:77) ~[exn-connector-java-1.0-SNAPSHOT.jar!/:na]
        at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) ~[na:na]
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[na:na]
        at java.base/java.lang.reflect.Method.invoke(Unknown Source) ~[na:na]
        at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:193) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.call(PogoMetaMethodSite.java:73) ~[groovy-3.0.10.jar!/:3.0.10]
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148) ~[groovy-3.0.10.jar!/:3.0.10]
        at eu.nebulouscloud.exn.core.Manager$4.run(Manager.groovy:174) ~[exn-connector-java-1.0-SNAPSHOT.jar!/:na]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
        at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
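
In both traces the exception leaves `DslGenericMessageHandler.onMessage` and is reported by the connector as a "General exception". Whether that is what stops later messages from being processed is not confirmed in this thread. As a rough sketch of one defensive option, the handler could reject an incomplete body up front and contain any remaining runtime failure per message. Everything below is hypothetical: the class, the method names and the `REQUIRED_KEY` value are placeholders, since the real message layout is not shown here.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names come from the project. */
final class SafeMessageHandling {

    /** Placeholder for the field holding the app definition; the real key may differ. */
    static final String REQUIRED_KEY = "definition";

    /** Returns why the body should be rejected, or empty if it looks usable. */
    static Optional<String> rejectReason(Map<String, ?> body) {
        if (body == null) {
            return Optional.of("body is null");
        }
        Object value = body.get(REQUIRED_KEY);
        if (!(value instanceof String) || ((String) value).isBlank()) {
            return Optional.of("missing or empty '" + REQUIRED_KEY + "'");
        }
        return Optional.empty();
    }

    /** Handles one message; reports failure by returning false instead of throwing. */
    static boolean handle(Map<String, ?> body, Consumer<Map<String, ?>> processor) {
        Optional<String> reason = rejectReason(body);
        if (reason.isPresent()) {
            System.err.println("Skipping message: " + reason.get());
            return false;
        }
        try {
            processor.accept(body);
            return true;
        } catch (RuntimeException e) {
            // Contain the failure so it stays local to this message.
            System.err.println("Processing failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Same shape as the body logged above: only a timestamp.
        Map<String, String> incomplete = Map.of("when", "2024-07-30T12:24:57.703192397Z");
        System.out.println(handle(incomplete, b -> { })); // prints false
    }
}
```

A check like `rejectReason` is only a stand-in for proper validation of the app creation message; it covers presence of one field and nothing more.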

rudi commented 1 month ago

That's a different bug -- please post the invalid message and I'll fix and add to unit tests. (I'd love to have a JSON Schema definition of what's a valid app creation message so I could test everything up front instead of piecemeal :/ )

robert-sanfeliu commented 1 month ago

OK, I have opened a separate bug for that: https://github.com/eu-nebulous/optimiser-utility-evaluator/issues/5