Closed Makman2 closed 6 years ago
Currently this is the problem:
/bin/sh: 1: flink/bin/mesos-taskmanager.sh: not found
The working directory seems to be wrong: the sandbox is correctly populated with all needed files, and flink/bin/mesos-taskmanager.sh definitely exists.
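To illustrate the suspicion, here is a minimal sketch of how a relative command such as flink/bin/mesos-taskmanager.sh resolves against the working directory of the shell that launches it. It is not the actual Mesos or Flink launch code. The temporary directories stand in for the sandbox and for some other working directory, and the helper names `make_sandbox` and `run_relative` are made up for this example.

```python
import os
import stat
import subprocess
import tempfile

# Relative command, as it appears in the error message above.
COMMAND = "flink/bin/mesos-taskmanager.sh"


def make_sandbox(root: str) -> None:
    """Create a stand-in sandbox containing an executable flink/bin/mesos-taskmanager.sh."""
    script = os.path.join(root, "flink", "bin", "mesos-taskmanager.sh")
    os.makedirs(os.path.dirname(script))
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)


def run_relative(command: str, cwd: str) -> int:
    """Run `command` through /bin/sh with `cwd` as working directory; return the exit status."""
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


with tempfile.TemporaryDirectory() as sandbox, tempfile.TemporaryDirectory() as elsewhere:
    make_sandbox(sandbox)
    # Same command, same files on disk; only the working directory differs.
    status_in_sandbox = run_relative(COMMAND, cwd=sandbox)
    status_elsewhere = run_relative(COMMAND, cwd=elsewhere)

# 127 is the shell's "command not found" status.
print("from sandbox:", status_in_sandbox)
print("from elsewhere:", status_elsewhere)
```

If the task's shell starts anywhere other than the sandbox directory, the file can exist on disk and the command will still fail with "not found", which matches the symptom.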
We are also hitting this when trying to use the Mesos Docker mode:
-Dmesos.resourcemanager.tasks.container.type=docker
and the default container image.
I0424 17:43:40.646013 17393 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/root","items":[{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c459-stop-zooke_-quorum.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/stop-zookeeper-quorum.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/stop-zookeeper-quorum.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c464-log4j-cli.properties","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/log4j-cli.properties","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/log4j-cli.properties"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c470-logback.xml","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/logback.xml","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/logback.xml"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c480-flink-metr_-1.3.2.jar","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/lib\/flink-metrics-statsd-1.3.2.jar","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/lib\/flink-metrics-statsd-1.3.2.jar"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c462-zookeeper.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/zookeeper.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/zookeeper.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c460-taskmanager.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/taskmanager.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/taskmanager.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c449-mesos-taskmanager.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/mesos-taskmanager.sh","value":"http:\
/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/mesos-taskmanager.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c458-stop-local.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/stop-local.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/stop-local.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c475-flink-dist_-1.3.2.jar","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/lib\/flink-dist_2.11-1.3.2.jar","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/lib\/flink-dist_2.11-1.3.2.jar"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c457-stop-cluster.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/stop-cluster.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/stop-cluster.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c476-flink-pyth_-1.3.2.jar","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/lib\/flink-python_2.11-1.3.2.jar","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/lib\/flink-python_2.11-1.3.2.jar"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c456-start-zook_-quorum.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/start-zookeeper-quorum.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/start-zookeeper-quorum.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c455-start-scala-shell.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/start-scala-shell.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/start-scala-shell.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c453-start-local.bat","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/start-local.bat","value":"http
:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/start-local.bat"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c446-historyserver.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/historyserver.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/historyserver.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c452-start-cluster.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/start-cluster.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/start-cluster.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c471-masters","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/masters","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/masters"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c442-flink","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/flink","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/flink"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c465-log4j-cons_properties","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/log4j-console.properties","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/log4j-console.properties"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c467-log4j.properties","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/log4j.properties","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/log4j.properties"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c468-logback-console.xml","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/logback-console.xml","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/logback-console.xml"}},{"ac
tion":"RETRIEVE_FROM_CACHE","cache_filename":"c469-logback-yarn.xml","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/logback-yarn.xml","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/logback-yarn.xml"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c466-log4j-yarn_properties","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/log4j-yarn-session.properties","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/log4j-yarn-session.properties"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c474-log4j.propertiese","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/log4j.propertiese","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/log4j.propertiese"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c472-slaves","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/slaves","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/slaves"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c478-log4j-1.2.17.jar","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/lib\/log4j-1.2.17.jar","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/lib\/log4j-1.2.17.jar"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c473-zoo.cfg","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/zoo.cfg","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/zoo.cfg"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c479-slf4j-log4_-1.7.7.jar","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/lib\/slf4j-log4j12-1.7.7.jar","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/lib\/slf4j-log4j12-1.7.7.jar"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c463-flink-conf.
yaml","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/conf\/flink-conf.yaml","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/conf\/flink-conf.yaml"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c445-flink.bat","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/flink.bat","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/flink.bat"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c450-pyflink.bat","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/pyflink.bat","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/pyflink.bat"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c441-config.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/config.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/config.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c447-jobmanager.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/jobmanager.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/jobmanager.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c477-flink-shad_-1.3.2.jar","uri":{"cache":true,"executable":false,"extract":false,"output_file":"flink\/lib\/flink-shaded-hadoop2-uber-1.3.2.jar","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/lib\/flink-shaded-hadoop2-uber-1.3.2.jar"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c448-mesos-appmaster.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/mesos-appmaster.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/mesos-appmaster.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c443-flink-console.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/fli
nk-console.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/flink-console.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c461-yarn-session.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/yarn-session.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/yarn-session.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c444-flink-daemon.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/flink-daemon.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/flink-daemon.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c451-pyflink.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/pyflink.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/pyflink.sh"}},{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c454-start-local.sh","uri":{"cache":true,"executable":true,"extract":false,"output_file":"flink\/bin\/start-local.sh","value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92\/flink\/bin\/start-local.sh"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/01354a28-9c65-4f29-8218-b4e5401d5801-S2\/frameworks\/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000\/executors\/taskmanager-00086\/runs\/e4819666-63fe-4a32-9356-34c7909d093c","user":"root"}
I0424 17:43:40.652581 17393 fetcher.cpp:444] Fetching URI 'http://my-host:32887/2a481d2e-3130-4386-8b6f-176247f8ba92/flink/bin/stop-zookeeper-quorum.sh'
I0424 17:43:40.652626 17393 fetcher.cpp:341] Fetching from cache
I0424 17:43:40.655618 17393 fetcher.cpp:207] Copied resource '/tmp/mesos/fetch/root/c459-stop-zooke_-quorum.sh' to '/var/lib/mesos/slave/slaves/01354a28-9c65-4f29-8218-b4e5401d5801-S2/frameworks/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000/executors/taskmanager-00086/runs/e4819666-63fe-4a32-9356-34c7909d093c/flink/bin/stop-zookeeper-quorum.sh'
I0424 17:43:40.655681 17393 fetcher.cpp:582] Fetched 'http://my-host:32887/2a481d2e-3130-4386-8b6f-176247f8ba92/flink/bin/stop-zookeeper-quorum.sh' to '/var/lib/mesos/slave/slaves/01354a28-9c65-4f29-8218-b4e5401d5801-S2/frameworks/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000/executors/taskmanager-00086/runs/e4819666-63fe-4a32-9356-34c7909d093c/flink/bin/stop-zookeeper-quorum.sh'
...
I0424 17:43:40.852995 17393 fetcher.cpp:341] Fetching from cache
I0424 17:43:40.855098 17393 fetcher.cpp:207] Copied resource '/tmp/mesos/fetch/root/c451-pyflink.sh' to '/var/lib/mesos/slave/slaves/01354a28-9c65-4f29-8218-b4e5401d5801-S2/frameworks/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000/executors/taskmanager-00086/runs/e4819666-63fe-4a32-9356-34c7909d093c/flink/bin/pyflink.sh'
I0424 17:43:40.855152 17393 fetcher.cpp:582] Fetched 'http://my-host:32887/2a481d2e-3130-4386-8b6f-176247f8ba92/flink/bin/pyflink.sh' to '/var/lib/mesos/slave/slaves/01354a28-9c65-4f29-8218-b4e5401d5801-S2/frameworks/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000/executors/taskmanager-00086/runs/e4819666-63fe-4a32-9356-34c7909d093c/flink/bin/pyflink.sh'
I0424 17:43:40.855163 17393 fetcher.cpp:444] Fetching URI 'http://my-host:32887/2a481d2e-3130-4386-8b6f-176247f8ba92/flink/bin/start-local.sh'
I0424 17:43:40.855175 17393 fetcher.cpp:341] Fetching from cache
I0424 17:43:40.857486 17393 fetcher.cpp:207] Copied resource '/tmp/mesos/fetch/root/c454-start-local.sh' to '/var/lib/mesos/slave/slaves/01354a28-9c65-4f29-8218-b4e5401d5801-S2/frameworks/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000/executors/taskmanager-00086/runs/e4819666-63fe-4a32-9356-34c7909d093c/flink/bin/start-local.sh'
I0424 17:43:40.857542 17393 fetcher.cpp:582] Fetched 'http://my-host:32887/2a481d2e-3130-4386-8b6f-176247f8ba92/flink/bin/start-local.sh' to '/var/lib/mesos/slave/slaves/01354a28-9c65-4f29-8218-b4e5401d5801-S2/frameworks/6ab6b470-559c-4aab-8a23-1083cc7ca62c-0000/executors/taskmanager-00086/runs/e4819666-63fe-4a32-9356-34c7909d093c/flink/bin/start-local.sh'
I0424 17:43:41.259465 17572 exec.cpp:162] Version: 1.4.2
I0424 17:43:41.264185 17595 exec.cpp:236] Executor registered on agent 01354a28-9c65-4f29-8218-b4e5401d5801-S2
I0424 17:43:41.265388 17604 executor.cpp:120] Registered docker executor on 10.1.10.19
I0424 17:43:41.265950 17597 executor.cpp:160] Starting task taskmanager-00086
/bin/sh: 1: flink/bin/mesos-taskmanager.sh: not found
I0424 17:43:43.164495 17605 process.cpp:1068] Failed to accept socket: future discarded
Same observation as above: the fetcher copies the files from /tmp to the sandbox, but those files do not seem to be mounted in the Docker container, nor does the root path seem to be used correctly.
Using DCOS 1.10 and dcos-flink-service 1.3.1-1.2.1
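For anyone comparing their own agent logs, the following sketch pulls the sandbox directory and the list of fetched files out of a "Fetcher Info" line, so the expected absolute path of the script can be checked against what the container actually sees. The helper names `parse_fetcher_info` and `fetched_files` are made up for this example. The sample line is shortened to a single item in the same shape as the log above, and its `/sandbox` directory is a placeholder, not the real path.

```python
import json
import posixpath


def parse_fetcher_info(log_line: str) -> dict:
    """Return the JSON payload that follows 'Fetcher Info: ' in a fetcher log line."""
    _, _, payload = log_line.partition("Fetcher Info: ")
    if not payload:
        raise ValueError("not a Fetcher Info line")
    return json.loads(payload)


def fetched_files(info: dict) -> dict:
    """Map each sandbox-relative output_file to its 'executable' flag."""
    return {
        item["uri"]["output_file"]: item["uri"]["executable"]
        for item in info["items"]
    }


# Shortened sample; "/sandbox" is a placeholder for the real sandbox_directory.
SAMPLE = (
    r'I0424 17:43:40.646013 17393 fetcher.cpp:533] Fetcher Info: '
    r'{"cache_directory":"\/tmp\/mesos\/fetch\/root","items":['
    r'{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c449-mesos-taskmanager.sh",'
    r'"uri":{"cache":true,"executable":true,"extract":false,'
    r'"output_file":"flink\/bin\/mesos-taskmanager.sh",'
    r'"value":"http:\/\/my-host:32887\/2a481d2e-3130-4386-8b6f-176247f8ba92'
    r'\/flink\/bin\/mesos-taskmanager.sh"}}],'
    r'"sandbox_directory":"\/sandbox","user":"root"}'
)

info = parse_fetcher_info(SAMPLE)
files = fetched_files(info)

# Absolute path where the script should exist if the sandbox is visible to the task.
expected_path = posixpath.join(info["sandbox_directory"], "flink/bin/mesos-taskmanager.sh")

print(files)
print(expected_path)
```

Checking whether `expected_path` exists on the agent host and inside the running container would help separate a missing mount from a wrong working directory.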
I see this issue was closed. What was the resolution? @joerg84 @Makman2
Due to inactivity, migrated to #54
In particular, this means that something is not quite right in the TaskManager spawning routines. Also, users should be able to choose their preferred containerizer.