mesosphere-backup / dcos-cassandra-service

DEPRECATED—Open source Apache Cassandra running on DC/OS is now replaced by mesosphere/dcos-commons/frameworks/cassandra. This repository will be deleted at the end of 2017.
Apache License 2.0

Cassandra Framework Deployment on Specific Nodes Issue #408

Open Nash897 opened 7 years ago

Nash897 commented 7 years ago

I am using the open-source version of DC/OS 1.7 with 50 slave nodes, and I am having trouble deploying the Cassandra framework on Mesos onto specific nodes.

The scheduler cannot satisfy the CPU requirement and keeps retrying in a loop, even though DC/OS still has enough free CPUs.

Logs:

WARN  [2017-02-24 22:06:08,074] org.apache.mesos.curator.CuratorStateStore: No TaskInfo found for the requested name: node-0-task-template at: /dcos-service-abc/Tasks/node-0-task-template/TaskStatus
WARN  [2017-02-24 22:06:08,074] org.apache.mesos.scheduler.DefaultTaskKiller: Attempted to kill unknown task: node-0-task-template
WARN  [2017-02-24 22:06:08,075] org.apache.mesos.offer.OfferEvaluator: Failed to satisfy resource requirement: name: "cpus" type: SCALAR scalar { value: 2.0 } role: "abc-role" reservation { principal: "abc-principal" labels { labels { key: "resource_id" value: "f37c63bb-f355-4512-9b65-29df1ee56574" } } }
WARN  [2017-02-24 22:06:08,075] org.apache.mesos.scheduler.plan.DefaultPlanScheduler: Unable to find any offers which fulfill requirement provided by block node-0: org.apache.mesos.offer.OfferRequirement@761c3c7c[taskType=CASSANDRA_DAEMON,taskRequirements=[org.apache.mesos.offer.TaskRequirement@1820b4e6[taskInfo=name: "node-0"
task_id {
  value: "node-0__ed2e1e21-15a2-4aa0-9a2f-bfc567240585"
}

Logs for the version-parsing warning:

WARN  [2017-02-24 22:02:59,936] org.apache.mesos.dcos.Capabilities: Unable to parse version string for Named Vip
! java.lang.NumberFormatException: For input string: "7-open"
! at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[na:1.8.0_74]
! at java.lang.Integer.parseInt(Integer.java:580) ~[na:1.8.0_74]
! at java.lang.Integer.parseInt(Integer.java:615) ~[na:1.8.0_74]
! at org.apache.mesos.dcos.DcosVersion$Elements.getSecondElement(DcosVersion.java:46) ~[dcos-commons-0.7.12.jar:na]
! at org.apache.mesos.dcos.Capabilities.supportsNamedVips(Capabilities.java:26) ~[dcos-commons-0.7.12.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraDaemonTask.getDiscoveryInfo(CassandraDaemonTask.java:322) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraDaemonTask.<init>(CassandraDaemonTask.java:164) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraDaemonTask.<init>(CassandraDaemonTask.java:179) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraDaemonTask.<init>(CassandraDaemonTask.java:44) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraDaemonTask$Factory.create(CassandraDaemonTask.java:85) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.config.ConfigurationManager.createDaemon(ConfigurationManager.java:51) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraState.createDaemon(CassandraState.java:210) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraState.getOrCreateDaemon(CassandraState.java:337) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.common.tasks.CassandraState.getOrCreateContainer(CassandraState.java:190) [cassandra-commons-1.0.16.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.plan.CassandraDaemonBlock.<init>(CassandraDaemonBlock.java:114) [cassandra-scheduler.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.plan.CassandraDaemonBlock.create(CassandraDaemonBlock.java:100) [cassandra-scheduler.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.plan.CassandraDaemonPhase.createBlocks(CassandraDaemonPhase.java:47) [cassandra-scheduler.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.plan.CassandraDaemonPhase.create(CassandraDaemonPhase.java:73) [cassandra-scheduler.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.plan.DeploymentManager.<init>(DeploymentManager.java:54) [cassandra-scheduler.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.plan.DeploymentManager.create(DeploymentManager.java:33) [cassandra-scheduler.jar:na]
! at com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler.registered(CassandraScheduler.java:152) [cassandra-scheduler.jar:na]
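
The trace above shows Integer.parseInt rejecting the version element "7-open", presumably the second element of an open-source version string such as "1.7-open" (the full string is an assumption; only "7-open" appears in the log). A minimal sketch of a more tolerant parse follows. The class and method names are hypothetical and this is not the dcos-commons implementation; it only illustrates taking the leading digits of an element before converting it.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of dcos-commons.
final class LenientVersionSketch {
    // Leading run of digits in a version element, e.g. "7" in "7-open".
    private static final Pattern LEADING_DIGITS = Pattern.compile("^(\\d+)");

    /**
     * Returns the numeric prefix of the element at the given index of a
     * dot-separated version string, or empty if there is no such element
     * or it does not start with a digit.
     */
    static OptionalInt element(String version, int index) {
        String[] parts = version.split("\\.");
        if (index < 0 || index >= parts.length) {
            return OptionalInt.empty();
        }
        Matcher m = LEADING_DIGITS.matcher(parts[index]);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            // Digits only, but too large for an int.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // The strict parse fails the same way as in the log.
        try {
            Integer.parseInt("7-open");
        } catch (NumberFormatException e) {
            System.out.println("strict parse failed: " + e.getMessage());
        }
        // The lenient parse ignores the "-open" suffix.
        // "1.7-open" is an assumed example of the full version string.
        System.out.println("lenient parse: " + element("1.7-open", 1));
    }
}
```

Note that the message is logged at WARN level, so it may be unrelated to the failed resource requirement.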

The scheduler also reports the task as lost or unknown after trying to recreate it in the next scheduling iteration.

INFO  [2017-02-24 22:05:44,543] com.mesosphere.dcos.cassandra.common.tasks.CassandraState: Unable to store status. Reason: 
! org.apache.mesos.state.StateStoreException: Task ID 'node-0__d8db7f26-895e-40cb-91ff-8ab3622010a6' of updated status doesn't match Task ID 'node-0__976be4b3-40b7-452c-84a7-f4514b03d1fe' of current TaskInfo. Task IDs must exactly match before status may be updated. NewTaskStatus[task_id {
!   value: "node-0__d8db7f26-895e-40cb-91ff-8ab3622010a6"
! }
! state: TASK_LOST
! message: "Reconciliation: Task is unknown"
! timestamp: 1.487973944545495E9
! source: SOURCE_MASTER
! reason: REASON_RECONCILIATION
! ] CurrentTaskInfo[Optional[name: "node-0"
! task_id {
!   value: "node-0__976be4b3-40b7-452c-84a7-f4514b03d1fe"
! }
! slave_id {
!   value: "3108ec77-775b-4adb-881f-5366b7c242c8-S9"
! }
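
The exception above states the rule being enforced: a status update is stored only when its task ID exactly matches the task ID of the current TaskInfo. The sketch below illustrates that rule with the two IDs from the log. The class and method names are hypothetical, and splitting the ID on "__" to recover the task name is an assumption based only on the ID format visible in the log, not on the state store's actual code.

```java
import java.util.Objects;

// Hypothetical illustration of the matching rule described in the exception.
final class TaskIdMatchSketch {
    /** A status is applicable only if its task ID equals the stored one exactly. */
    static boolean statusApplies(String statusTaskId, String storedTaskId) {
        return Objects.equals(statusTaskId, storedTaskId);
    }

    /** Assumed ID layout "<name>__<uuid>"; returns the part before the last "__". */
    static String taskName(String taskId) {
        int sep = taskId.lastIndexOf("__");
        return sep < 0 ? taskId : taskId.substring(0, sep);
    }

    public static void main(String[] args) {
        // Both IDs are copied from the log above.
        String fromStatus = "node-0__d8db7f26-895e-40cb-91ff-8ab3622010a6";
        String fromTaskInfo = "node-0__976be4b3-40b7-452c-84a7-f4514b03d1fe";

        System.out.println("same task name: "
                + taskName(fromStatus).equals(taskName(fromTaskInfo)));
        System.out.println("status applies: "
                + statusApplies(fromStatus, fromTaskInfo));
    }
}
```

Under that assumption, both IDs belong to node-0 but carry different UUIDs, which would be consistent with the TASK_LOST status referring to an earlier launch attempt than the one currently stored.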

Is there a way to deal with this, or a way to deploy the Cassandra framework on specific nodes?

Any help is appreciated. Thanks in advance.

tookko commented 7 years ago

+1 facing the same issue here.

triclambert commented 6 years ago

This repo is deprecated and will be archived in one week. Please see the latest version of Cassandra or DSE for DC/OS:

https://docs.mesosphere.com/service-docs/cassandra/
https://docs.mesosphere.com/service-docs/dse/ (enterprise-only)