elouanKeryell-Even closed this issue 9 years ago
OS: CentOS 7
chronos-2.3.4-1.0.81.el7.x86_64.rpm
mesos-0.22.1-1.0.centos701406.x86_64.rpm
mesosphere-zookeeper-3.4.6-0.1.20141204175332.centos7.x86_64.rpm
When starting chronos:
$ systemctl start chronos
it crashes. Here are the logs:
Jul 6 19:33:17 master-1 systemd: Stopping Chronos...
Jul 6 19:33:17 master-1 systemd: Starting Chronos...
Jul 6 19:33:17 master-1 systemd: Started Chronos.
Jul 6 19:33:17 master-1 chronos: + cmd=(run_jar)
Jul 6 19:33:17 master-1 chronos: + local cmd
Jul 6 19:33:17 master-1 chronos: + [[ -s /etc/mesos/zk ]]
Jul 6 19:33:17 master-1 chronos: + cmd+=(--zk_hosts "$(cut -d / -f 3 /etc/mesos/zk)" --master "$(cat /etc/mesos/zk)")
Jul 6 19:33:17 master-1 chronos: ++ cut -d / -f 3 /etc/mesos/zk
Jul 6 19:33:17 master-1 chronos: ++ cat /etc/mesos/zk
Jul 6 19:33:17 master-1 chronos: + [[ -d /etc/chronos/conf ]]
Jul 6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul 6 19:33:17 master-1 chronos: ++ cd /etc/chronos/conf
Jul 6 19:33:17 master-1 chronos: ++ find . -type f -not -name '.*' -print0
Jul 6 19:33:17 master-1 chronos: + local name=zk_path
Jul 6 19:33:17 master-1 chronos: + element_in --zk_path
Jul 6 19:33:17 master-1 chronos: + local e
Jul 6 19:33:17 master-1 chronos: + return 1
Jul 6 19:33:17 master-1 chronos: + case "$name" in
Jul 6 19:33:17 master-1 chronos: + cmd+=("--$name" "$(< "$conf_dir/$name")")
Jul 6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul 6 19:33:17 master-1 chronos: + local name=hostname
Jul 6 19:33:17 master-1 chronos: + element_in --hostname
Jul 6 19:33:17 master-1 chronos: + local e
Jul 6 19:33:17 master-1 chronos: + return 1
Jul 6 19:33:17 master-1 chronos: + case "$name" in
Jul 6 19:33:17 master-1 chronos: + cmd+=("--$name" "$(< "$conf_dir/$name")")
Jul 6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul 6 19:33:17 master-1 chronos: + local name=http_port
Jul 6 19:33:17 master-1 chronos: + element_in --http_port
Jul 6 19:33:17 master-1 chronos: + local e
Jul 6 19:33:17 master-1 chronos: + return 1
Jul 6 19:33:17 master-1 chronos: + case "$name" in
Jul 6 19:33:17 master-1 chronos: + cmd+=("--$name" "$(< "$conf_dir/$name")")
Jul 6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul 6 19:33:17 master-1 chronos: + logged chronos run_jar --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
Jul 6 19:33:17 master-1 chronos: + local 'token=chronos[6064]'
Jul 6 19:33:17 master-1 chronos: + shift
Jul 6 19:33:17 master-1 chronos: + exec
Jul 6 19:33:17 master-1 chronos: + exec
Jul 6 19:33:18 master-1 chronos: ++ exec logger -p user.info -t 'chronos[6064]'
Jul 6 19:33:18 master-1 chronos: ++ exec logger -p user.notice -t 'chronos[6064]'
Jul 6 19:33:18 master-1 chronos[6064]: + run_jar --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
Jul 6 19:33:18 master-1 chronos[6064]: + local 'log_format=%2$s %5$s%6$s%n'
Jul 6 19:33:18 master-1 chronos[6064]: ++ ulimit -n
Jul 6 19:33:18 master-1 chronos[6064]: + '[' 0 -eq 0 -a 1024 -lt 8192 ']'
Jul 6 19:33:18 master-1 chronos[6064]: + ulimit -n 8192
Jul 6 19:33:18 master-1 chronos[6064]: + export PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Jul 6 19:33:18 master-1 chronos[6064]: + PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Jul 6 19:33:18 master-1 chronos[6064]: + vm_opts=(-Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib -Djava.util.logging.SimpleFormatter.format="$log_format")
Jul 6 19:33:18 master-1 chronos[6064]: + local vm_opts
Jul 6 19:33:18 master-1 chronos[6064]: + for j_opt in '${JAVA_OPTS:-"-Xmx512m"}'
Jul 6 19:33:18 master-1 chronos[6064]: + vm_opts+=(${j_opt})
Jul 6 19:33:18 master-1 chronos[6064]: + exec java -Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib '-Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n' -Xmx512m -cp /usr/bin/chronos org.apache.mesos.chronos.scheduler.Main --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
Jul 6 19:33:18 master-1 chronos[6064]: [2015-07-06 19:33:18,314] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:26)
Jul 6 19:33:18 master-1 chronos[6064]: [2015-07-06 19:33:18,316] INFO Initializing chronos. (org.apache.mesos.chronos.scheduler.Main$:27)
Jul 6 19:33:18 master-1 chronos[6064]: [2015-07-06 19:33:18,318] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:28)
Jul 6 19:33:20 master-1 chronos[6064]: [2015-07-06 19:33:20,512] INFO Wiring up the application (org.apache.mesos.chronos.scheduler.config.MainModule:38)
Jul 6 19:33:20 master-1 chronos[6064]: #
Jul 6 19:33:20 master-1 chronos[6064]: # A fatal error has been detected by the Java Runtime Environment:
Jul 6 19:33:20 master-1 chronos[6064]: #
Jul 6 19:33:20 master-1 chronos[6064]: # SIGSEGV (0xb) at pc=0x00007f7c54ddf56c, pid=6064, tid=140171988526848
Jul 6 19:33:20 master-1 chronos[6064]: #
Jul 6 19:33:20 master-1 chronos[6064]: # JRE version: OpenJDK Runtime Environment (7.0_75-b13) (build 1.7.0_75-mockbuild_2015_01_21_05_53-b00)
Jul 6 19:33:20 master-1 chronos[6064]: # Java VM: OpenJDK 64-Bit Server VM (24.75-b04 mixed mode linux-amd64 compressed oops)
Jul 6 19:33:20 master-1 chronos[6064]: # Derivative: IcedTea 2.5.4
Jul 6 19:33:20 master-1 chronos[6064]: # Distribution: Built on CentOS Linux release 7.0.1406 (Core) (Wed Jan 21 05:53:48 UTC 2015)
Jul 6 19:33:20 master-1 chronos[6064]: # Problematic frame:
Jul 6 19:33:20 master-1 chronos[6064]: # C [libc.so.6+0x8056c] cfree+0x1c
Jul 6 19:33:20 master-1 chronos[6064]: #
Jul 6 19:33:20 master-1 chronos[6064]: # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Jul 6 19:33:20 master-1 chronos[6064]: #
Jul 6 19:33:20 master-1 chronos[6064]: # An error report file with more information is saved as:
Jul 6 19:33:20 master-1 chronos[6064]: # /tmp/jvm-6064/hs_error.log
Jul 6 19:33:20 master-1 chronos[6064]: #
Jul 6 19:33:20 master-1 chronos[6064]: # If you would like to submit a bug report, please include
Jul 6 19:33:20 master-1 chronos[6064]: # instructions on how to reproduce the bug and visit:
Jul 6 19:33:20 master-1 chronos[6064]: # http://icedtea.classpath.org/bugzilla
Jul 6 19:33:20 master-1 chronos[6064]: #
Here is the generated error report file: https://gist.github.com/WinstonSureChill/a17a344b091ea5ee7ede
This is the top of the stacktrace as found in the generated error file:
Stack: [0x00007f7c55856000,0x00007f7c55957000], sp=0x00007f7c55952808, free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x8056c] cfree+0x1c
[error occurred during error reporting (printing native stack), id 0xb]
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j org.apache.mesos.state.ZooKeeperState.initialize(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;Ljava/lang/String;)V+0
j org.apache.mesos.state.ZooKeeperState.<init>(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;Ljava/lang/String;)V+11
j org.apache.mesos.chronos.scheduler.config.ZookeeperModule.provideState()Lorg/apache/mesos/state/State;+40
v ~StubRoutines::call_stub
[...]
The last java calls seem to be Zookeeper related (file https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/state/ZooKeeperState.java), so I'm thinking maybe I have a problem with my zookeeper configuration? Or does someone see an obvious error in the parameters passed to chronos:
Jul 6 19:33:18 master-1 chronos[6064]: + run_jar --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
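For anyone wanting to check where those parameters come from: judging by the xtrace in the logs, the wrapper takes --zk_hosts and --master from /etc/mesos/zk and turns each file in /etc/chronos/conf into a flag named after the file. Below is a rough sketch of that derivation, not the actual wrapper script. It writes stand-in files to a scratch directory (demo_root, zk_file and conf_dir are names made up for this example) and uses positional parameters where the real script uses a bash array.

```shell
#!/bin/sh
# Sketch only: rebuilds the Chronos flags the way the xtrace suggests.
# The paths below are throwaway stand-ins for /etc/mesos/zk and
# /etc/chronos/conf; nothing under /etc is read or modified.
set -eu

demo_root="$(mktemp -d)"      # hypothetical scratch directory
zk_file="$demo_root/zk"       # stands in for /etc/mesos/zk
conf_dir="$demo_root/conf"    # stands in for /etc/chronos/conf
mkdir -p "$conf_dir"

printf '%s\n' 'zk://10.10.3.65:2181/mesos' > "$zk_file"
printf '%s\n' '/chronos' > "$conf_dir/zk_path"
printf '%s\n' 'master-1' > "$conf_dir/hostname"
printf '%s\n' '8081'     > "$conf_dir/http_port"

set -- run_jar
if [ -s "$zk_file" ]; then
  # Splitting "zk://host:port/mesos" on "/" gives "zk:", "", "host:port", "mesos",
  # so field 3 is the host:port part.
  zk_hosts="$(cut -d / -f 3 "$zk_file")"
  set -- "$@" --zk_hosts "$zk_hosts" --master "$(cat "$zk_file")"
fi

# One flag per config file: the file name is the flag, the content is the value.
for name in zk_path hostname http_port; do
  set -- "$@" "--$name" "$(cat "$conf_dir/$name")"
done

flags="$*"
echo "$flags"
rm -rf "$demo_root"
```

With the values from my setup this prints the same argument list as the run_jar line above, so the flags themselves appear to be derived as intended.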
Here is my zookeeper config: https://gist.github.com/WinstonSureChill/b402d07f0bbffe9b035e
And the Mesos config I set up with the config files:
mesos/
  zk: zk://10.10.3.65:2181/mesos
  master: 10.10.3.65
mesos-master/
  hostname: f1.linuxrt
  ip: 10.10.3.65
  quorum: 1
  work_dir: /var/lib/mesos
Also, Mesos works fine on its own (without Chronos).
My problem looks like this one (marathon+mesos): https://github.com/mesosphere/marathon/issues/1352
I was using OpenJDK 1.7. I upgraded to 1.8, reinstalled and reconfigured ZooKeeper and Chronos, and now everything works fine.
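If you hit the same crash, it may be worth checking which Java the service actually picks up before reinstalling anything. A small sketch, assuming the usual `java -version` output with a quoted version string; java_major is a helper name made up for this example, and I have not confirmed that the Java version alone was the cause here.

```shell
#!/bin/sh
# Sketch: print the major Java version from a version string.
# Handles the legacy "1.N.x" scheme (1.7.0_75 -> 7) and plain "N.x" strings.
java_major() {
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;   # drop the leading "1.", keep up to the next dot
    *)   echo "${1%%[.-]*}" ;;             # keep everything before the first "." or "-"
  esac
}

# Only query the local JVM if one is on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; pull out the quoted version string.
  ver="$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
  echo "java major version: $(java_major "$ver")"
fi
```

For the JRE in the crash log above (1.7.0_75) the helper prints 7; after the upgrade it should print 8.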