Comcast / cmb

This project is no longer actively supported. It is made available as read-only. A highly available, horizontally scalable queuing and notification service compatible with AWS SQS and SNS
Apache License 2.0

High availability setup? #51

Open matti opened 8 years ago

matti commented 8 years ago

Are there any docs on how to set up multiple cmb frontends? I have a working Cassandra cluster, and cmb works when there is only one instance. Scaling to multiple instances gives the following exceptions:

localgyver-cmb-sqs-3 | 2016-09-20 07:27:33,503 [CNSEPJobProducer-9] [b72a12c7-a70e-4dc9-85b2-0172dcbb9f48] ERROR CNSEndpointPublisherJobProducer - event=job_producer_failure
localgyver-cmb-sqs-3 | com.amazonaws.services.sqs.model.QueueDoesNotExistException: The supplied queue with url http://localhost:6059/474356230103/PublishJobQ_1 doesn't exist (Service: AmazonSQS; Status Code: 400; Error Code: AWS.SimpleQueueService.NonExistentQueue; Request ID: 5625f563-6d50-4324-b9b1-c1964a0cfd27)
localgyver-cmb-sqs-4 |  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
localgyver-cmb-sqs-3 |  at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2339)
localgyver-cmb-sqs-3 |  at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:1072)
localgyver-cmb-sqs-3 |  at com.comcast.cns.tools.CQSHandler.receiveMessage(CQSHandler.java:147)
localgyver-cmb-sqs-3 |  at com.comcast.cns.tools.CNSEndpointPublisherJobProducer.run(CNSEndpointPublisherJobProducer.java:129)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
localgyver-cmb-sqs-3 |  at com.comcast.cns.tools.CNSPublisherJobThread.run(CNSPublisherJobThread.java:43)
localgyver-cmb-sqs-3 | 2016-09-20 07:27:33,537 [pool-22-thread-58] [862e2209-86ff-4bc4-8e4a-1cb23ce9eb1f] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=null Action=GetQueueUrl Version=2012-11-05 QueueName=PublishJobQ_0 user=cns_internal resp_ms=7 cass_ms=6 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=0 asyncq_ms=1 auth_ms=0
localgyver-cmb-sqs-4 |  at java.lang.Thread.run(Thread.java:745)
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,530 [pool-22-thread-169] [26ba5d11-2b08-471e-b37b-621072597880] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=/ Action=GetQueueUrl Version=2012-11-05 QueueName=PublishJobQ_0 user=cns_internal resp_ms=26 cass_ms=4 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=16 asyncq_ms=3 auth_ms=0
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,585 [pool-22-thread-171] [646da8f2-28f4-414c-8a91-69bf8782ba10] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=/ Action=GetQueueUrl Version=2012-11-05 QueueName=PublishJobQ_1 user=cns_internal resp_ms=17 cass_ms=4 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=5 asyncq_ms=1 auth_ms=0
localgyver-cmb-sqs-3 | 2016-09-20 07:27:33,620 [pool-22-thread-115] [85871c38-0912-4db6-803b-f75cac56e280] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=/ Action=GetQueueUrl Version=2012-11-05 QueueName=PublishJobQ_1 user=cns_internal resp_ms=34 cass_ms=33 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=1 asyncq_ms=7 auth_ms=0
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,626 [pool-22-thread-168] [43d22b25-25ae-4f99-b33d-cb24d1226b97] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=/ Action=GetQueueUrl Version=2012-11-05 QueueName=EndpointPublishQ_0 user=cns_internal resp_ms=14 cass_ms=8 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=1 asyncq_ms=1 auth_ms=0
localgyver-cmb-sqs-3 | 2016-09-20 07:27:33,665 [pool-22-thread-233] [d241113f-547d-4230-85e5-46eb2dd60b7a] ERROR CMBControllerServlet - event=req status=failed client=127.0.0.1 queue=/474356230103/PublishJobQ_0 Action=ReceiveMessage MaxNumberOfMessages=1 Version=2012-11-05 WaitTimeSeconds=20 MessageAttributeName.1=All user=cns_internal resp_ms=7 cass_ms=0 cass_num_rd=0 cass_num_wr=0 redis_ms=0 io_ms=0 asyncq_ms=10 auth_ms=6
localgyver-cmb-sqs-3 | com.comcast.cmb.common.util.PersistenceException: The supplied queue with url http://localhost:6059/474356230103/PublishJobQ_0 doesn't exist
localgyver-cmb-sqs-3 |  at com.comcast.cqs.controller.CQSCache.getCachedQueue(CQSCache.java:119)
localgyver-cmb-sqs-3 |  at com.comcast.cqs.controller.CQSControllerServlet.handleAction(CQSControllerServlet.java:220)
localgyver-cmb-sqs-3 |  at com.comcast.cmb.common.controller.CMBControllerServlet.handleRequest(CMBControllerServlet.java:511)
localgyver-cmb-sqs-3 |  at com.comcast.cmb.common.controller.CMBControllerServlet.access$200(CMBControllerServlet.java:65)
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,708 [pool-22-thread-43] [0e6c7f0e-3b97-4996-b41e-c70cf42e8fe0] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=/ Action=GetQueueUrl Version=2012-11-05 QueueName=EndpointPublishQ_1 user=cns_internal resp_ms=32 cass_ms=27 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=0 asyncq_ms=3 auth_ms=0
localgyver-cmb-sqs-3 |  at com.comcast.cmb.common.controller.CMBControllerServlet$2.run(CMBControllerServlet.java:796)
localgyver-cmb-sqs-3 |  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,733 [pool-22-thread-3] [f7cf7ae1-7bbf-4921-9994-1cabd4f94be0] ERROR CMBControllerServlet - event=req status=failed client=127.0.0.1 queue=/474356230103/PublishJobQ_1 Action=ReceiveMessage MaxNumberOfMessages=1 Version=2012-11-05 WaitTimeSeconds=20 MessageAttributeName.1=All user=cns_internal resp_ms=0 cass_ms=0 cass_num_rd=0 cass_num_wr=0 redis_ms=0 io_ms=0 asyncq_ms=1 auth_ms=0
localgyver-cmb-sqs-4 | com.comcast.cmb.common.util.PersistenceException: The supplied queue with url http://localhost:6059/474356230103/PublishJobQ_1 doesn't exist
localgyver-cmb-sqs-4 |  at com.comcast.cqs.controller.CQSCache.getCachedQueue(CQSCache.java:119)
localgyver-cmb-sqs-4 |  at com.comcast.cqs.controller.CQSControllerServlet.handleAction(CQSControllerServlet.java:220)
localgyver-cmb-sqs-4 |  at com.comcast.cmb.common.controller.CMBControllerServlet.handleRequest(CMBControllerServlet.java:511)
localgyver-cmb-sqs-4 |  at com.comcast.cmb.common.controller.CMBControllerServlet.access$200(CMBControllerServlet.java:65)
localgyver-cmb-sqs-3 |  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
localgyver-cmb-sqs-3 |  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
localgyver-cmb-sqs-4 |  at com.comcast.cmb.common.controller.CMBControllerServlet$2.run(CMBControllerServlet.java:796)
localgyver-cmb-sqs-4 |  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
localgyver-cmb-sqs-4 |  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
localgyver-cmb-sqs-3 |  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
localgyver-cmb-sqs-3 |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
localgyver-cmb-sqs-3 |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
localgyver-cmb-sqs-4 |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
localgyver-cmb-sqs-4 |  at java.lang.Thread.run(Thread.java:745)
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,776 [CNSEPJobConsumer-15] [cb31eff2-e5d7-4112-acd0-ed20da225c14] ERROR CNSEndpointPublisherJobConsumer - event=job_consumer_failure
localgyver-cmb-sqs-4 | com.amazonaws.services.sqs.model.QueueDoesNotExistException: The supplied queue with url http://localhost:6059/474356230103/EndpointPublishQ_3 doesn't exist (Service: AmazonSQS; Status Code: 400; Error Code: AWS.SimpleQueueService.NonExistentQueue; Request ID: c3527c55-e128-4764-b51e-a087e0fc8a6c)
localgyver-cmb-sqs-3 |  at java.lang.Thread.run(Thread.java:745)
localgyver-cmb-sqs-3 | 2016-09-20 07:27:33,693 [CNSEPJobProducer-2] [7d6dd038-403b-45d4-805b-9be13b7ff455] ERROR CNSEndpointPublisherJobProducer - event=job_producer_failure
localgyver-cmb-sqs-3 | com.amazonaws.services.sqs.model.QueueDoesNotExistException: The supplied queue with url http://localhost:6059/474356230103/PublishJobQ_0 doesn't exist (Service: AmazonSQS; Status Code: 400; Error Code: AWS.SimpleQueueService.NonExistentQueue; Request ID: 9eef9470-5054-4b8d-8910-5ce1bc212784)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
localgyver-cmb-sqs-4 |  at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1077)
localgyver-cmb-sqs-4 |  at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:725)
localgyver-cmb-sqs-4 |  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
localgyver-cmb-sqs-4 |  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
localgyver-cmb-sqs-4 |  at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2339)
localgyver-cmb-sqs-4 |  at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:1072)
localgyver-cmb-sqs-4 |  at com.comcast.cns.tools.CQSHandler.receiveMessage(CQSHandler.java:147)
localgyver-cmb-sqs-4 |  at com.comcast.cns.tools.CNSEndpointPublisherJobConsumer.run(CNSEndpointPublisherJobConsumer.java:245)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
localgyver-cmb-sqs-4 |  at com.comcast.cns.tools.CNSPublisherJobThread.run(CNSPublisherJobThread.java:43)
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,803 [pool-22-thread-85] [664f66f7-cac5-40ff-b815-e77ee1724bc8] INFO  CMBControllerServlet - event=req status=ok client=127.0.0.1 queue=/ Action=GetQueueUrl Version=2012-11-05 QueueName=EndpointPublishQ_2 user=cns_internal resp_ms=11 cass_ms=7 cass_num_rd=1 cass_num_wr=0 redis_ms=0 io_ms=1 asyncq_ms=17 auth_ms=0
localgyver-cmb-sqs-4 | 2016-09-20 07:27:33,840 [CNSEPJobProducer-9] [1a376522-5fd2-4401-a3e6-50bfefc9e979] ERROR CNSEndpointPublisherJobProducer - event=job_producer_failure
localgyver-cmb-sqs-4 | com.amazonaws.services.sqs.model.QueueDoesNotExistException: The supplied queue with url http://localhost:6059/474356230103/PublishJobQ_1 doesn't exist (Service: AmazonSQS; Status Code: 400; Error Code: AWS.SimpleQueueService.NonExistentQueue; Request ID: 24dc204e-b772-44d4-99f1-0e503bfc8f98)
localgyver-cmb-sqs-3 |  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
localgyver-cmb-sqs-3 |  at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2339)
localgyver-cmb-sqs-3 |  at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:1072)

boriwo commented 8 years ago

You need to adjust your cmb.properties file on all cmb machines in a high-availability setup. Queue URLs shouldn't use localhost, except in a single-instance deployment for testing.

You typically want to have your cmb instances behind a load balancer and then configure the DNS name of the load balancer as the service endpoint:

https://github.com/Comcast/cmb/blob/master/config/cmb.properties#L10-#L19
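As a rough way to check each machine's configuration for this problem, the sketch below scans properties-style text for URL values that still point at a loopback host. It is only an illustration: the property keys and the load balancer hostname in the sample are hypothetical placeholders, not the actual names used in cmb.properties, and only port 6059 is taken from the logs above.

```python
from urllib.parse import urlparse

# Hosts that only make sense for a single-instance test deployment.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def find_loopback_urls(properties_text):
    """Return (key, value) pairs whose value is a URL with a loopback host."""
    hits = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and properties-file comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        value = value.strip()
        # hostname is None for values that are not URLs.
        host = urlparse(value).hostname
        if host in LOOPBACK_HOSTS:
            hits.append((key.strip(), value))
    return hits


# Hypothetical sample; the keys below are placeholders, not real cmb.properties keys.
SAMPLE = """
# service endpoints
example.cqs.service.url=http://localhost:6059/
example.cns.service.url=http://cmb-lb.example.com:6061/
example.some.number=10
"""

if __name__ == "__main__":
    for key, value in find_loopback_urls(SAMPLE):
        print(f"{key} still points at a loopback host: {value}")
```

Any key this reports is a candidate for replacement with the load balancer's DNS name, so that every instance resolves the same queue URLs.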

matti commented 8 years ago

Thank you!