Open kubaseai opened 6 years ago
The basic attach mechanism works. I verified by attaching and getting the system properties. This may be an issue with RMI.
@kubaseai Have you tested this using OpenJDK with Hotspot? Also, are you running jconsole in the docker container?
Hello Peter, I would like to describe a specific use case: 100 Docker containers with Tibco BusinessWorks 5.13 (the process server of an Enterprise Service Bus) running on OpenJ9. BW exposes its own bean via JMX to monitor running processes. With more than 100 containers running with replicas, the standard way of enabling JMX would require a dedicated TCP/IP port per container instance. With the patch from my first comment this is possible, but it is not very efficient to administer. So I thought about a specialized RMISocketFactory that uses files in the /tmp directory.
Now back to the question. The Sun Attach API implementation uses kill(pid) (as strace shows) while connecting to the target JVM. That doesn't work from the Docker host into a Docker container, because the container's PID doesn't exist in the host's process space (or refers to a totally different process) due to LXC isolation. The IBM implementation uses semaphores, so --ipc=host gives us a workaround. The only remaining issue is the definition of localhost for jconsole as delivered by the JVM in the container.
OpenJ9 needs some small tweaks to be accessible with the Attach API from the Docker host without the need for TCP/IP port mapping. I'm very close to having a working PoC.
JConsole sees stub: ��sr.javax.management.remote.rmi.RMIServerImpl_Stubxrjava.rmi.server.RemoteStub���ɋ�exrjava.rmi.server.RemoteObject�a�� a3xpw UnicastRef2 172.17.0.3sr-sun.management.jmxremote.FileRMISocketFactoryL acceptDirtLjava/lang/String;Lfbcst7Lsun/management/jmxremote/FileBasedCommunicationServer;xpt/tmp/.jmx/25@b336bb3cdfe7pwAga�w����b���[�x
Note that OpenJ9 can only accept changes to OpenJ9 code. RMI issues should be addressed at OpenJDK, and OpenJ9 builds will pick up any OpenJDK updates.
Thanks for the info. I will put the current patches on my GitHub, but for OpenJDK we need a brand new implementation of the Attach API and RMI that works with Docker shared volumes. With more than 100 Docker containers, TCP/IP is hard to accept for JMX monitoring, especially with replicas.
Please note also that attach API is not intended to allow attachments from foreign systems, such as other containers, virtual machines, or hosts.
attach API is not intended to allow attachments from foreign systems, such as other containers, virtual machines, or hosts.
@pdbain-ibm This issue is a proposal to improve on this. Do you have any comments about the getOverridenLocalHostName() proposal, or alternative suggestions?
I will take a look.
I've got a working FileRMISocketFactory and I'm able to connect with jconsole using files in /tmp (a volume). When many containers run from the same image, there is a high probability that the JVM process gets the same PID in every container. I tried to set com.ibm.tools.attach.id per JVM, but when I set -Dcom.ibm.tools.attach.id=bw_time_1 jconsole doesn't see this VirtualMachine.
Nice work.
There are some complications.
The JVM cleans up the attach directory on launch by checking if each advertisement file corresponds to an active process. If there is no active process, it removes the associated file artifacts. VMs on foreign docker images won't be visible and will have their files erased. There is a process to re-create them but it is limited.
The VM ID generation scheme is based on process IDs. The algorithm for resolving collisions is designed for handling files left over from dead processes.
If you list the VMs using the attached code, what do you see? listvms.zip
I suggest we think about the intended use case and work from there. Do you want to be able to attach from a host machine to VMs in docker images? I think we can develop an algorithm which handles that case reliably.
Thanks for the compiled class. I set the ID with -Dcom.ibm.tools.attach.id=container_123456789_bw_time. The active-process check IPC.processExists(pid) can be extended to IPC.exists(pid, directoryEntryName), where directoryEntryName would be container_123456789_bw_time. A Docker container id can be extracted from this string.
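As a rough illustration of the last point, the container id could be pulled out of the directory entry name like this. The `container_<id>_<name>` layout is only inferred from the single example above, and the class and method names are hypothetical, not part of OpenJ9.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not OpenJ9 code: extracts a container id from an
// attach id that follows the assumed "container_<id>_<name>" convention.
final class ContainerAttachId {
    // Assumption: the container id itself contains no underscore.
    private static final Pattern FORMAT = Pattern.compile("container_([^_]+)_(.+)");

    static Optional<String> containerIdOf(String directoryEntryName) {
        Matcher m = FORMAT.matcher(directoryEntryName);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // An id following the convention yields the embedded container id.
        System.out.println(containerIdOf("container_123456789_bw_time"));
        // A plain PID-based id has no container part.
        System.out.println(containerIdOf("16217"));
    }
}
```

An entry that does not follow the convention yields an empty result, so a caller could fall back to the existing PID-based check for such entries.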
java -Dcom.ibm.tools.attach.logging=yes -Dcom.ibm.tools.attach.containerExistsCmd="echo 1" listvms
looking for attach targets
id: container_123456789_bw_time name: container_123456789_bw_time
id: 16217 name: listvms
listVms:exit
1525803415444 16217: 17 [Attach API initializer]: AttachHandler initialize
1525803415450 16217: 17 [Attach API initializer]: IPC Directory=/tmp/.com_ibm_tools_attach
1525803415452 16217: 17 [Attach API initializer]: createDirectoryAndSemaphore /tmp/.com_ibm_tools_attach
1525803415455 16217: 17 [Attach API initializer]: non-blocking locking file /tmp/.com_ibm_tools_attach/_master
1525803415460 16217: 17 [Attach API initializer]: deleteStaleDirectories checking container_123456789_bw_time
1525803415467 16217: 17 [Attach API initializer]: getPidFromFile pid = 27container_123456789_bw_time
1525803415468 16217: 17 [Attach API initializer]: getPidFromFile uid = 1000
1525803415468 16217: 17 [Attach API initializer]: deleteStaleDirectories checking _master
1525803415470 16217: 17 [Attach API initializer]: deleteStaleDirectories checking _notifier
1525803415470 16217: 17 [Attach API initializer]: deleteStaleDirectories checking _attachlock
1525803415471 16217: 17 [Attach API initializer]: AttachHandler obtained master lock
1525803415476 16217: 17 [Attach API initializer]: locking file /tmp/.com_ibm_tools_attach/_attachlock
1525803415481 16217: 17 [Attach API initializer]: createAdvertisementFile /tmp/.com_ibm_tools_attach/16217/attachInfo
1525803415482 16217: 17 [Attach API initializer]: unlocking file /tmp/.com_ibm_tools_attach/_attachlock
1525803415483 16217: 17 [Attach API initializer]: unlocking file /tmp/.com_ibm_tools_attach/_master
1525803415485 16217: 19 [Attach API wait loop]: iteration 0 waitForNotification ignoreNotification entering
1525803415486 16217: 19 [Attach API wait loop]: iteration 0 waitForNotification ignoreNotification entered
1525803415487 16217: 19 [Attach API wait loop]: iteration 0 waitForNotification starting wait
1525803415530 16217: 1 [main]: locking file /tmp/.com_ibm_tools_attach/_master
1525803415622 16217: 1 [main]: containerExists 123456789=true
1525803415624 16217: 1 [main]: unlocking file /tmp/.com_ibm_tools_attach/_master
1525803415643 16217: 18 [Attach API teardown]: shutting down attach API
1525803415644 16217: 18 [Attach API teardown]: AttachHandler terminate: Attach API is being shut down
1525803415645 16217: 18 [Attach API teardown]: AttachHandler terminate removing contents of directory : /tmp/.com_ibm_tools_attach/16217
1525803415646 16217: 18 [Attach API teardown]: deleting my files
1525803415649 16217: 18 [Attach API teardown]: non-blocking locking file /tmp/.com_ibm_tools_attach/_master
1525803415650 16217: 18 [Attach API teardown]: AttachHandler terminate obtained master lock
1525803415651 16217: 18 [Attach API teardown]: notifyVm 3 targets
1525803415652 16217: 18 [Attach API teardown]: unlocking file /tmp/.com_ibm_tools_attach/_master
1525803415653 16217: 18 [Attach API teardown]: AttachHandler terminate released master lock
1525803415653 16217: 19 [Attach API wait loop]: iteration 0 waitForNotification ended wait
1525803415654 16217: 19 [Attach API wait loop]: iteration 0 waitForNotification cancelNotify
1525803415655 16217: 18 [Attach API teardown]: deleting my directory
1525803415656 16217: 18 [Attach API teardown]: AttachHandler closed semaphore
I need to check the JConsole code to see why it doesn't accept my virtual machine with a customized ID.
I want to be able to attach from the host to VMs in Docker containers, and maybe also from one dedicated container, with an extended range of exposed ports, to other containers. Currently we have ServerSocket(0), and we can't expose all 64K ports to the host.
Sun assumed that the vmid must be a PID, so JConsole doesn't support a string there. Fixed. So I've got the first Docker-friendly JDK.
https://dzone.com/articles/codetalk-red-hat-cto-on-jakarta-ee-cloud-native-ku I think we are waiting for the big players to implement Docker/container features. If someone wants to patch OpenJ9 on their own, the concept is described here: https://medium.com/@jakub.jozwicki/docker-friendly-enterprise-java-51cac8417af8.
I recently tried an OpenJDK build for Java 11, and it seems you can connect from the host to a JVM running inside a container using the attach API. There are a couple of related issues in OpenJDK: https://bugs.openjdk.java.net/browse/JDK-8179498 https://bugs.openjdk.java.net/browse/JDK-8193710
I think it would be good to have this kind of support in OpenJ9 as well, as it would help in JVM monitoring in cloud environments.
Currently, for a single Docker container running an OpenJ9 JVM, we can use the attach API if we start the container with the --network=host and --ipc=host options and bind mount the host's /tmp directory.
I think the --ipc=host option would be required anyway, to allow the container to use the host system's IPC namespace for the semaphore used by the attach API. But --network=host is considered insecure as per https://docs.docker.com/engine/reference/run/#network-settings
Note: --network="host" gives the container full access to local system services such as D-bus and is therefore considered insecure.
I think providing complete attach API support for JVMs in containers would require revisiting several aspects: advertisement, discovery and communication.
I did a code review of the attach API and looked for the problems that can arise when connecting from the host to a JVM running in a container. @kubaseai already covered many of these in the comments above. Summarizing the issues here:
1) The current discovery mechanism relies on a common directory (by default /tmp/.com_ibm_tools_attach) accessible to both the target JVM and the client JVM.
During startup, the target JVM creates a directory named after its PID under this common directory and advertises its details in the file /tmp/.com_ibm_tools_attach/PID/attachInfo. The client iterates through all the entries in the common directory and reads each attachInfo file to get the list of target JVMs.
This does not work in containers, because a container's filesystem is separate from the host's, unless the containers bind mount a host directory and the JVMs use it as the common directory.
Even with a bind mount, there is another problem: the PID is used as the VM id, which in turn names the directory for the advertisement file.
Multiple containers may be running on the host, in which case JVMs inside different containers may have the same PID and hence the same VM id. TargetDirectory.createMyDirectory() actually takes care of this situation by appending a counter to the PID to generate a unique VM id (of the form PID_
2) The next problem is on the client side. While discovering all the VMs in the common directory (as in AttachProvider.listVirtualMachines()), the client gets the PID of the target from the advertisement file and checks whether that process exists. Because of the different PID namespaces, the PID of the JVM inside the container is not the same as its PID on the host. We need some mechanism to map the PID in the container to the PID on the host.
3) The next problem is closely related to the previous one. To attach to a specific JVM, the client is given the VM id of the target JVM. The client first gets the list of all JVMs and then uses this VM id to pick out the target JVM. As stated before, by default the VM id is the PID of the JVM. If the client uses the host-side PID of the target JVM as the VM id, it won't be able to locate that JVM in the list, since the JVM's VM id is its PID inside the container.
4) By default, the host and the container use different network namespaces, so using InetAddress.getLoopbackAddress() to talk to a JVM in a container does not work.
5) The attach API uses semaphores to notify the target JVM(s) when a client wants to connect to them. Again, by default the container and the host have different IPC namespaces, so this mechanism does not work between a JVM in a container and a client on the host.
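To make the discovery step in problem 1 concrete, here is a minimal sketch of a client walking the common directory and collecting every entry that contains an attachInfo file. It only mirrors the directory layout described above; the class and method names are made up for illustration and this is not the actual OpenJ9 implementation.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the discovery step; not OpenJ9 code.
final class AdvertisementScan {
    // Returns the names of all entries under commonDir that hold an
    // attachInfo file, i.e. the VM ids a client would see.
    static List<String> listAdvertisedIds(Path commonDir) throws IOException {
        List<String> ids = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(commonDir)) {
            for (Path entry : entries) {
                // Control files such as _master are plain files, so
                // resolving attachInfo below them never matches.
                if (Files.isRegularFile(entry.resolve("attachInfo"))) {
                    ids.add(entry.getFileName().toString());
                }
            }
        }
        Collections.sort(ids);
        return ids;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp/.com_ibm_tools_attach");
        if (Files.isDirectory(dir)) {
            System.out.println(listAdvertisedIds(dir));
        } else {
            System.out.println("no attach directory at " + dir);
        }
    }
}
```

The sketch deliberately skips the process-existence check, which is exactly the part that breaks across PID namespaces (problem 2).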
Workarounds/Solutions
There are some workarounds for these problems, such as using the --network=host docker option to fix problem 4, or using bind mounts, but they don't address problems 2 and 3.
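One possible building block for problems 2 and 3, which is not something proposed in this thread: on Linux 4.1 and later, /proc/PID/status on the host contains an NSpid line listing the process's PID in each nested PID namespace, outermost first. A sketch of reading it follows; the class and method names are hypothetical, and how it would plug into the attach API is left open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper, not OpenJ9 code: reads the PID a process has in its
// innermost PID namespace from the lines of /proc/<hostPid>/status.
final class NsPid {
    static OptionalLong innermostPid(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("NSpid:")) {
                String rest = line.substring("NSpid:".length()).trim();
                if (rest.isEmpty()) {
                    return OptionalLong.empty();
                }
                // Fields are ordered from the outermost namespace inwards.
                String[] fields = rest.split("\\s+");
                return OptionalLong.of(Long.parseLong(fields[fields.length - 1]));
            }
        }
        // Older kernels do not report NSpid at all.
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Sample content; on a real host the lines would come from
        // Files.readAllLines(Paths.get("/proc", hostPid, "status")).
        List<String> sample = Arrays.asList("Name:\tjava", "NSpid:\t16217\t27");
        System.out.println(innermostPid(sample));
    }
}
```

A client on the host could compare this value with the PID found in an advertisement file to decide which host process a containerized JVM corresponds to.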
There is actually a way to look into a container's file system: /proc/PID/root is a symbolic link to the process's root directory. So one way to handle these problems is to scan the entries in the /proc fs and use /proc/PID/root/ instead of / as the root of the common directory when discovering JVMs running inside a container. This way no bind mounting is required. Also, the presence of an entry in /proc should be enough to conclude that the process is active and running, so the client wouldn't need to check explicitly whether the target process exists.
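A rough sketch of that scan, assuming the default common directory /tmp/.com_ibm_tools_attach inside each container. The names are hypothetical, and reading /proc/PID/root of another user's process generally needs elevated privileges, so entries that cannot be read are simply skipped here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not OpenJ9 code: finds attach directories that are
// reachable through /proc/<pid>/root.
final class ProcRootScan {
    // procRoot is a parameter so the logic can be exercised on a fake tree.
    static List<Path> candidateAttachDirs(Path procRoot) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(procRoot)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (!name.matches("\\d+")) {
                    continue; // only numeric entries are processes
                }
                Path dir = entry.resolve("root").resolve("tmp").resolve(".com_ibm_tools_attach");
                // isDirectory returns false when the path is unreadable.
                if (Files.isDirectory(dir)) {
                    result.add(dir);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path proc = Paths.get("/proc");
        if (Files.isDirectory(proc)) {
            candidateAttachDirs(proc).forEach(System.out::println);
        }
    }
}
```

Processes that share a mount namespace will report the same directory several times under different PIDs, so a real client would still need to de-duplicate the results.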
To handle problem 4, I think the changes in comment to add a new property com.ibm.tools.attach.target.hostname and to use InetAddress.getByName() make sense.
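A minimal sketch of how such a property could be consulted, falling back to the loopback address when it is not set. The property name is the one proposed above; the class is hypothetical and this is not the actual patch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch, not the actual patch: picks the address used to
// reach the target JVM.
final class AttachTargetAddress {
    static final String PROPERTY = "com.ibm.tools.attach.target.hostname";

    static InetAddress resolve() throws UnknownHostException {
        String host = System.getProperty(PROPERTY);
        if (host == null || host.isEmpty()) {
            // Current behaviour: target and client share a network namespace.
            return InetAddress.getLoopbackAddress();
        }
        // Accepts a host name or a literal IP such as the container's address.
        return InetAddress.getByName(host);
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(resolve());
    }
}
```

A client on the host would then be started with something like -Dcom.ibm.tools.attach.target.hostname=172.17.0.3 to reach a container on the default bridge network.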
For problem 5, the user would have to start the Docker container with --ipc=host to share the IPC namespace between the host and the container, which should not be much of a concern.
root@user-Aspire-ES1-431:~# docker run -d -p 8080:8080 -e DOCKER_HOST_IP=192.168.100.132 -e IBM_JAVA_OPTIONS="-Dcom.ibm.tools.attach.logging=yes" --ipc=host --net=host -v /tmp:/tmp kubaseai/bw-time
I'm able to connect with jconsole using the Attach API (/tmp directory + semaphore + TCP/IP).
When I remove --net=host, the JVM inside the container has a different meaning of localhost than jconsole. I guess something like this should help:
root@user-Aspire-ES1-431:~# docker run -d -p 8080:8080 -p 10200:10200 -e DOCKER_HOST_IP=192.168.100.132 -e IBM_JAVA_OPTIONS="-Dcom.ibm.tools.attach.logging=yes -Djava.rmi.server.port=10200 -Dcom.sun.management.jmxremote.local.only=false" --ipc=host -v /tmp:/tmp -v /root/tmp/openj9-openjdk-jdk9/build/linux-x86_64-normal-server-release/images/jdk:/opt/java/openjdk/jdk-9 kubaseai/bw-time
3a6e21ca59194231cc5e31d2b291e6f6b03ce3e6b950a12d65c83394d4bd029e
Returned RMI address is of stub form: service:jmx:rmi://127.0.0.1/stub/rO0ABXNyAC5qYXZheC5tYW5hZ2VtZW50LnJlbW90ZS5ybWkuUk1JU2VydmVySW1wbF9TdHViAAAAAAAAAAICAAB4cgAaamF2YS5ybWkuc2VydmVyLlJlbW90ZVN0dWLp/tzJi+FlGgIAAHhyABxqYXZhLnJtaS5zZXJ2ZXIuUmVtb3RlT2JqZWN002G0kQxhMx4DAAB4cHc4AApVbmljYXN0UmVmAA8xOTIuMTY4LjEwMC4xMzIAAKjrq6UxzCqdRssPFdVZAAABYtC1HbGAAgB4
After base64 -d: ��sr.javax.management.remote.rmi.RMIServerImpl_Stubxrjava.rmi.server.RemoteStub���ɋ�exrjava.rmi.server.RemoteObject�a�� a3xpw8 UnicastRef192.168.100.132�뫥1�*�F��Ybе��x
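The same inspection can be done without dumping raw bytes to the terminal. The sketch below decodes the part of a stub-form JMX URL after /stub/ and keeps only the printable ASCII runs, similar to what the strings utility does. The class name is made up, and it does not deserialize the stub, it only shows which host name is embedded in it.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper: lists printable ASCII runs found in the Base64
// payload of a "service:jmx:rmi://host/stub/<base64>" URL.
final class StubStrings {
    static List<String> printableRuns(String jmxUrl, int minLength) {
        String marker = "/stub/";
        int at = jmxUrl.indexOf(marker);
        if (at < 0) {
            throw new IllegalArgumentException("not a stub-form JMX URL");
        }
        byte[] raw = Base64.getDecoder().decode(jmxUrl.substring(at + marker.length()));
        List<String> runs = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (byte b : raw) {
            if (b >= 0x20 && b < 0x7f) {
                run.append((char) b);
            } else {
                // A non-printable byte ends the current run.
                if (run.length() >= minLength) {
                    runs.add(run.toString());
                }
                run.setLength(0);
            }
        }
        if (run.length() >= minLength) {
            runs.add(run.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java StubStrings <service:jmx:rmi://.../stub/...>");
            return;
        }
        printableRuns(args[0], 4).forEach(System.out::println);
    }
}
```

For the URL above, the runs include UnicastRef followed by 192.168.100.132, which is the address the stub tells the client to connect to.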
Would it be possible to pass an object implementing specific access?