vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Vespa Session Not Getting Activated #24245

Closed 107dipan closed 2 years ago

107dipan commented 2 years ago

We are getting an error related to session activation while deploying our Vespa application. We recently updated Vespa to version 7.589. I tried to find out which hostname triggers this exception, but I could not determine that from the logs. Please let me know if any more logs are required.

Error message in the deployment response:

INFO : configserver Container.com.yahoo.vespa.config.server.http.HttpErrorResponse Returning response with response code 500, error-code:INTERNAL_SERVER_ERROR, message=Timed out waiting for peer config servers to complete operation (waited for barrier /config/v2/tenants/default/sessions/7/activeBarrier).Got response from [], but need response from at least 1 server(s). Timeout passed as argument was 119994 ms

I found the following logs on the config server:

[2022-09-28 10:35:38.997] INFO : configserver Container.com.yahoo.vespa.config.server.ApplicationRepository Session 7 prepared successfully. [2022-09-28 10:35:39.827] WARNING : configserver Container.com.yahoo.vespa.config.server.session.SessionStateWatcher Error handling session change to ACTIVATE for session 7\nexception=\njava.lang.IllegalArgumentException: hostname length must be at least '1' and at most '64', but got: '74'\n\tat ai.vespa.validation.Validation.require(Validation.java:69)\n\tat ai.vespa.validation.Validation.requireInRange(Validation.java:62)\n\tat ai.vespa.validation.Validation.requireLength(Validation.java:34)\n\tat com.yahoo.config.provision.HostName.(HostName.java:17)\n\tat com.yahoo.config.provision.HostName.of(HostName.java:24)\n\tat java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)\n\tat java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)\n\tat java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)\n\tat java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)\n\tat java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)\n\tat java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)\n\tat java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)\n\tat java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)\n\tat java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)\n\tat java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)\n\tat com.yahoo.vespa.service.duper.DuperModel.updateHostnameVsIdMaps(DuperModel.java:125)\n\tat com.yahoo.vespa.service.duper.DuperModel.add(DuperModel.java:100)\n\tat com.yahoo.vespa.service.duper.DuperModelManager$1.lambda$applicationActivated$0(DuperModelManager.java:85)\n\tat 
com.yahoo.vespa.service.duper.DuperModelManager.lockedRunnable(DuperModelManager.java:223)\n\tat com.yahoo.vespa.service.duper.DuperModelManager$1.applicationActivated(DuperModelManager.java:85)\n\tat com.yahoo.vespa.config.server.SuperModelManager.lambda$configActivated$1(SuperModelManager.java:104)\n\tat java.base/java.util.ArrayList.forEach(ArrayList.java:1541)\n\tat com.yahoo.vespa.config.server.SuperModelManager.configActivated(SuperModelManager.java:104)\n\tat com.yahoo.vespa.config.server.SuperModelRequestHandler.reloadConfig(SuperModelRequestHandler.java:52)\n\tat com.yahoo.vespa.config.server.rpc.RpcServer.reloadSuperModel(RpcServer.java:267)\n\tat com.yahoo.vespa.config.server.rpc.RpcServer.configActivated(RpcServer.java:262)\n\tat com.yahoo.vespa.config.server.application.TenantApplications.notifyReloadListeners(TenantApplications.java:226)\n\tat com.yahoo.vespa.config.server.application.TenantApplications.activateApplication(TenantApplications.java:243)\n\tat com.yahoo.vespa.config.server.session.SessionRepository.activate(SessionRepository.java:448)\n\tat com.yahoo.vespa.config.server.session.SessionStateWatcher.sessionStatusChanged(SessionStateWatcher.java:62)\n\tat com.yahoo.vespa.config.server.session.SessionStateWatcher.lambda$nodeChanged$0(SessionStateWatcher.java:102)\n\tat com.yahoo.concurrent.StripedExecutor.runAll(StripedExecutor.java:68)\n\tat com.yahoo.concurrent.StripedExecutor.lambda$execute$0(StripedExecutor.java:50)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\n [2022-09-28 10:37:40.076] WARNING : configserver Container.com.yahoo.vespa.config.server.http.v2.SessionActiveHandler Unexpected exception handling a config server request\nexception=\ncom.yahoo.vespa.curator.CompletionTimeoutException: Timed out waiting for peer config servers to 
complete operation (waited for barrier /config/v2/tenants/default/sessions/7/activeBarrier).Got response from [], but need response from at least 1 server(s). Timeout passed as argument was 119994 ms\n\tat com.yahoo.vespa.curator.CuratorCompletionWaiter.awaitCompletion(CuratorCompletionWaiter.java:51)\n\tat com.yahoo.vespa.config.server.ApplicationRepository$Activation.awaitCompletion(ApplicationRepository.java:1207)\n\tat com.yahoo.vespa.config.server.deploy.Deployment.waitForActivation(Deployment.java:151)\n\tat com.yahoo.vespa.config.server.deploy.Deployment.activate(Deployment.java:136)\n\tat com.yahoo.vespa.config.server.ApplicationRepository.activate(ApplicationRepository.java:450)\n\tat com.yahoo.vespa.config.server.http.v2.SessionActiveHandler.handlePUT(SessionActiveHandler.java:50)\n\tat com.yahoo.vespa.config.server.http.HttpHandler.handle(HttpHandler.java:44)\n\tat com.yahoo.container.jdisc.ThreadedHttpRequestHandler.handle(ThreadedHttpRequestHandler.java:78)\n\tat com.yahoo.container.jdisc.ThreadedHttpRequestHandler.handleRequest(ThreadedHttpRequestHandler.java:89)\n\tat com.yahoo.container.jdisc.ThreadedRequestHandler$RequestTask.processRequest(ThreadedRequestHandler.java:191)\n\tat com.yahoo.container.jdisc.ThreadedRequestHandler$RequestTask.run(ThreadedRequestHandler.java:185)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\n [2022-09-28 10:37:40.080] INFO : configserver Container.com.yahoo.vespa.config.server.http.HttpErrorResponse Returning response with response code 500, error-code:INTERNAL_SERVER_ERROR, message=Timed out waiting for peer config servers to complete operation (waited for barrier /config/v2/tenants/default/sessions/7/activeBarrier).Got response from [], but need response from at least 1 server(s). Timeout passed as argument was 119994 ms

kkraune commented 2 years ago

hostname length must be at least '1' and at most '64', but got: '74'

This most likely indicates that at least one of the hosts in hosts.xml has a hostname that is too long (74 characters, where the limit is 64)?
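
One way to find the offending host is to scan hosts.xml for names outside the 1 to 64 character range quoted in the exception. The following is a minimal sketch, not part of Vespa: the function name `find_long_hostnames` and the sample data are hypothetical, and it assumes the usual hosts.xml layout where each `<host>` element carries the hostname in its `name` attribute.

```python
import xml.etree.ElementTree as ET

# Limit taken from the exception message: "at least '1' and at most '64'".
MAX_HOSTNAME_LENGTH = 64


def find_long_hostnames(hosts_xml, limit=MAX_HOSTNAME_LENGTH):
    """Return (hostname, length) pairs for hosts outside the range 1..limit.

    Assumes each <host> element stores the hostname in a 'name' attribute.
    """
    root = ET.fromstring(hosts_xml)
    offenders = []
    for host in root.iter("host"):
        name = host.get("name", "")
        if not 1 <= len(name) <= limit:
            offenders.append((name, len(name)))
    return offenders


# Hypothetical sample: the second hostname is 74 characters long.
SAMPLE = """<hosts>
  <host name="node0.example.com"><alias>node0</alias></host>
  <host name="{long_name}"><alias>node1</alias></host>
</hosts>""".format(long_name="a" * 70 + ".com")

for name, length in find_long_hostnames(SAMPLE):
    print(f"{name}: {length} characters")
```

To run it against a real application package, read the file contents first (for example `open("hosts.xml").read()`) and pass them to `find_long_hostnames`.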

107dipan commented 2 years ago

Hey @kkraune, I have confirmed that the length of the hostname is indeed 74. I am wondering whether this validation is a new addition, because we used the same hostname with a previous version of Vespa and never got this exception.

kkraune commented 2 years ago

7.589 is from 2022-05-24

https://github.com/vespa-engine/vespa/blame/master/config-provisioning/src/main/java/com/yahoo/config/provision/HostName.java (find the filenames in the stack trace and look them up in the open source repo) says this check was added 2022-04-09 - so if your previous version is from before that, it explains why you did not see the exception earlier
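
Because the config server log writes the stack trace on one line with literal `\n` and `\t` sequences, picking out the filenames by eye is tedious. The following is a rough sketch of how the Vespa frames could be extracted; the helper name `vespa_frames` and the package prefixes are assumptions made for illustration, not a Vespa tool.

```python
import re

# Matches frames such as "at com.yahoo.config.provision.HostName.of(HostName.java:24)".
# JDK frames with a module prefix ("java.base/...") do not match, which is fine here.
FRAME = re.compile(r"^\s+at ([\w.$<>]+)\(([\w.]+):(\d+)\)", re.MULTILINE)


def vespa_frames(escaped_trace, prefixes=("com.yahoo.", "ai.vespa.")):
    """Return (method, filename, line) for frames in the given packages.

    The input is a log message in which newlines and tabs appear as the
    two-character sequences backslash-n and backslash-t.
    """
    text = escaped_trace.replace("\\n", "\n").replace("\\t", "\t")
    frames = []
    for match in FRAME.finditer(text):
        method, filename, line = match.groups()
        if method.startswith(prefixes):
            frames.append((method, filename, int(line)))
    return frames


# Shortened excerpt of the trace in this issue (raw string keeps the backslashes).
TRACE = (
    r"exception=\njava.lang.IllegalArgumentException: hostname length must be "
    r"at least '1' and at most '64', but got: '74'"
    r"\n\tat ai.vespa.validation.Validation.require(Validation.java:69)"
    r"\n\tat com.yahoo.config.provision.HostName.of(HostName.java:24)"
    r"\n\tat java.base/java.util.ArrayList.forEach(ArrayList.java:1541)"
)

for method, filename, line in vespa_frames(TRACE):
    print(f"{filename}:{line}  {method}")
```

The filenames in the output (here `Validation.java` and `HostName.java`) can then be searched for in the repository.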

A random page from the internet: https://learningnetwork.cisco.com/s/question/0D53i00000KsyjTCAR/hostname-length

If you are upgrading anyway, we strongly recommend using the most recent Vespa version, currently 8.57 - see https://docs.vespa.ai/en/vespa8-release-notes.html - good luck!