Closed amoskong closed 4 years ago
This blocked nonroot install testing. /CC @slivne @roydahan
I tried to change the scylla-jmx path in nonroot.conf:ExecStart to /home/scylla-test/install_root/jmx/symlinks/scylla-jmx, but it still failed.
● scylla-jmx.service - Scylla JMX
Loaded: loaded (/home/scylla-test/.config/systemd/user/../../../install_root/etc/systemd/scylla-jmx.service; linked; vendor preset: enabled)
Drop-In: /home/scylla-test/.config/systemd/user/scylla-jmx.service.d
└─nonroot.conf
Active: failed (Result: exit-code) since Tue 2020-08-25 22:54:07 UTC; 9s ago
Process: 2567 ExecStart=/home/scylla-test/install_root/jmx/symlinks/scylla-jmx $SCYLLA_JMX_PORT $SCYLLA_API_PORT $SCYLLA_API_ADDR $SCYLLA_JMX_ADDR $SCYLLA_JMX_FILE $SCYLLA_JMX_LOCAL $SCYLLA_JMX_REMOTE $SCYLLA_JMX_DEBUG (code=exited, status=200/CHDIR)
Main PID: 2567 (code=exited, status=200/CHDIR)
Aug 25 22:54:07 artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 systemd[1620]: Started Scylla JMX.
Aug 25 22:54:07 artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 systemd[2567]: scylla-jmx.service: Changing to the requested working directory failed: No such file or directory
Aug 25 22:54:07 artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 systemd[2567]: scylla-jmx.service: Failed at step CHDIR spawning /home/scylla-test/install_root/jmx/symlinks/scylla-jmx: No such file or directory
Aug 25 22:54:07 artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 systemd[1620]: scylla-jmx.service: Main process exited, code=exited, status=200/CHDIR
Aug 25 22:54:07 artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 systemd[1620]: scylla-jmx.service: Failed with result 'exit-code'.
[scylla-test@artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 ~]$ ls -l /home/scylla-test/install_root/jmx/symlinks/scylla-jmx
lrwxrwxrwx. 1 scylla-test scylla-test 13 Aug 25 08:25 /home/scylla-test/install_root/jmx/symlinks/scylla-jmx -> /usr/bin/java
[scylla-test@artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 ~]$ /home/scylla-test/install_root/jmx/symlinks/scylla-jmx -version
openjdk version "1.8.0_262"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)
However, I can successfully start scylla-jmx from cmdline:
[scylla-test@artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 ~]$ /home/scylla-test/install_root/jmx/symlinks/scylla-jmx -Xmx256m -XX:+UseSerialGC -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.host=localhost -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=7199 -Djava.rmi.server.hostname=localhost -Dcom.sun.management.jmxremote.rmi.port=7199 -Djavax.management.builder.initial=com.scylladb.jmx.utils.APIBuilder -jar /home/scylla-test/install_root/jmx/scylla-jmx-1.0.jar
Connecting to http://localhost:10000
Starting the JMX server
JMX is enabled to receive remote connections on port: 7199
[scylla-test@artifacts-centos8-jenkins-db-node-b7a4fdf8-0-1 ~]$ install_root/share/cassandra/bin/nodetool status
Using /home/scylla-test/install_root/etc/scylla/scylla.yaml as the config file
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 127.0.0.1 981.38 KB 256 ? 379f7230-d7a6-4f8d-bce9-d1bc852e5389 rack1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
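The `status=200/CHDIR` result in the log above means systemd could not change into the unit's working directory before running ExecStart, which would explain why the same binary starts fine from a shell. A minimal sketch for checking this, assuming the unit and drop-in files are readable as plain text. The helper names `effective_workdir` and `check_workdir` are hypothetical, and the parsing only looks at plain `WorkingDirectory=` lines:

```shell
#!/bin/sh
# Print the last WorkingDirectory= value found in the given unit files.
# Files are read in the order given, so list the drop-in after the main unit.
# Hypothetical helper; it does not handle line continuations or specifiers.
effective_workdir() {
    value=""
    for f in "$@"; do
        [ -r "$f" ] || continue
        while IFS= read -r line || [ -n "$line" ]; do
            case "$line" in
                WorkingDirectory=*) value="${line#WorkingDirectory=}" ;;
            esac
        done < "$f"
    done
    printf '%s\n' "$value"
}

# Report whether the configured directory exists.
check_workdir() {
    dir=$(effective_workdir "$@")
    # A leading "-" tells systemd to tolerate a missing directory.
    dir="${dir#-}"
    if [ -z "$dir" ]; then
        echo "unset"
    elif [ -d "$dir" ]; then
        echo "ok: $dir"
    else
        echo "missing: $dir"
    fi
}
```

On a live system, `systemctl --user cat scylla-jmx` prints the unit together with its drop-ins, which shows the same information.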
This issue is solved by setting WorkingDirectory to empty in nonroot.conf. I posted a fix: https://github.com/scylladb/scylla-jmx/pull/131
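A minimal sketch of what such an override could look like, assuming the drop-in lives in a `scylla-jmx.service.d` directory as in the status output earlier in this thread. The function name `write_workdir_override` and the file contents are illustrative and are not taken from the actual PR. As far as I know from systemd's documentation, an empty assignment resets the setting, and an unset WorkingDirectory defaults to the user's home directory when the service runs under a user manager.

```shell
#!/bin/sh
# Write a drop-in that clears WorkingDirectory for the unit.
# $1 is the drop-in directory; the file name nonroot.conf follows the thread.
# Note: this overwrites the file, so on a real install the line would have to
# be merged into the existing nonroot.conf instead.
write_workdir_override() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/nonroot.conf" <<'EOF'
[Service]
# An empty assignment resets the value inherited from the main unit file.
WorkingDirectory=
EOF
}

# Example: write into a scratch directory rather than the live configuration.
scratch=$(mktemp -d)
write_workdir_override "$scratch/scylla-jmx.service.d"
cat "$scratch/scylla-jmx.service.d/nonroot.conf"
```

To apply a change like this for real, the drop-in would go under `~/.config/systemd/user/scylla-jmx.service.d/`, followed by `systemctl --user daemon-reload`.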
@amoskong WorkingDirectory was set so that the heap dump is written to that directory: https://github.com/scylladb/scylla-jmx/commit/be8f1ac511416e1d8997f27a68fa17f5131debda Where does the heap dump go when WorkingDirectory is empty?
Coredump setup requires privilege, we can't do that for nonroot install.
We can also set the WorkingDirectory to $prefix/ or $prefix/var/lib/scylla. Currently the $prefix/var/lib/scylla directory won't be created after installation.
No, I mean the JVM heap dump, not a coredump of native code, which is handled by the Linux kernel. Related: scylladb/scylla-enterprise#1469
In that issue we found that without WorkingDirectory, scylla-jmx.service runs with PWD="/", so when the JVM tried to write a heap dump it failed with "Permission denied". That is why we changed the default WorkingDirectory to /var/lib/scylla: https://github.com/scylladb/scylla-jmx/commit/be8f1ac511416e1d8997f27a68fa17f5131debda
My question is: in --user mode with WorkingDirectory="", does the JVM have enough permission to write the dump, and where does it go? If it's $HOME, I think it should be okay.
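One way to check the question above is to test whether a candidate directory is writable by the user running the service. With `-XX:+HeapDumpOnOutOfMemoryError` and no explicit path, HotSpot writes `java_pid<pid>.hprof` into the process's working directory. A minimal sketch, assuming a POSIX shell; the function name `heap_dump_dir_ok` and the probe file name are hypothetical:

```shell
#!/bin/sh
# Return 0 if a file could be created in directory $1 by the current user.
heap_dump_dir_ok() {
    dir=$1
    [ -d "$dir" ] || return 1
    # Try to create and remove a probe file; this also catches read-only
    # mounts, which a plain permission test can miss.
    probe="$dir/.heapdump_probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Compare the two directories discussed in the thread: $HOME and "/".
for candidate in "$HOME" "/"; do
    if heap_dump_dir_ok "$candidate"; then
        echo "$candidate: writable"
    else
        echo "$candidate: not writable"
    fi
done
```

If the default directory turns out not to be writable, adding `-XX:HeapDumpPath=<dir>` to the JVM options is an alternative to changing WorkingDirectory.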
> We can also set the WorkingDirectory to $prefix/ or $prefix/var/lib/scylla. Currently the $prefix/var/lib/scylla directory won't be created after installation.
Right.
> Currently the $prefix/var/lib/scylla directory won't be created after installation.
That's because we use $prefix as the data directory in nonroot mode: https://github.com/scylladb/scylla/blob/master/install.sh#L374 So scylla will create commitlog/, data/, hints/ and view_hints/ under $prefix (if it doesn't work like that, it should be a bug).
version: unified-package-0.20200824.9636a3399.tar.gz
Install steps:
pwd
/install_root
start scylla
systemctl --user start scylla-server
systemctl --user status scylla-jmx -f |less
/CC @syuu1228 @roydahan