TOSIT-IO / tdp-collection

Ansible collection to deploy the components of TDP
Apache License 2.0

Ranger YARN plugin config prevents YARN NM from starting #750

Closed ChristopheLRTE closed 6 months ago

ChristopheLRTE commented 1 year ago

Hello,

It seems that installing Ranger causes an issue with the yarn_nm service.

Here is the error log:

2023-04-25 18:55:02,076 WARN  privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(174)) - Shell execution returned exit code: 24. Privileged Execution Operation Stderr: 
Configuration file ../etc/hadoop/container-executor.cfg not found.

Stdout: 
Full command array for failed execution: 
[/opt/tdp/hadoop-3.1.1-TDP-0.1.0-SNAPSHOT/bin/container-executor, --checksetup]
2023-04-25 18:55:02,077 WARN  nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:init(304)) - Exit code from container executor initialization is : 24
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.

        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:206)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:300)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:389)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:929)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:997)
Caused by: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
        at org.apache.hadoop.util.Shell.run(Shell.java:902)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
        ... 6 more
2023-04-25 18:55:02,080 INFO  service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:929)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:997)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:307)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:389)
        ... 3 more
Caused by: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.

        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:206)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:300)
        ... 4 more
Caused by: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
        at org.apache.hadoop.util.Shell.run(Shell.java:902)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
        ... 6 more
2023-04-25 18:55:02,081 ERROR nodemanager.NodeManager (NodeManager.java:initAndStartNodeManager(932)) - Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:391)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:929)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:997)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:307)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:389)
        ... 3 more
Caused by: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.

        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:206)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:300)
        ... 4 more
Caused by: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
        at org.apache.hadoop.util.Shell.run(Shell.java:902)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
        ... 6 more
2023-04-25 18:55:02,083 INFO  nodemanager.NodeManager (LogAdapter.java:info(51)) - SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at bdpnode02.applispfref.sipfref.local/10.132.151.217

The yarn_nm service expects to find container-executor.cfg in /etc/hadoop/conf, which is the directory the /opt/tdp/hadoop-3.1.1-TDP-0.1.0-SNAPSHOT/etc/hadoop symlink points to. However, once the Ranger YARN plugin config task "Create symbolic link from etc/hadoop in {{ hadoop_install_dir }} to actual Resourcemanager config dir" has finished, the symlink points to /etc/hadoop/conf.rm instead.

As a consequence, the command /opt/tdp/hadoop-3.1.1-TDP-0.1.0-SNAPSHOT/bin/container-executor --checksetup no longer finds container-executor.cfg, because the directory it resolves through the symlink is not the one that contains the file.
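To check this on an affected node, a small script along the following lines could help. It is only a sketch: it assumes Python 3 is available on the host, and the function name check_container_executor_cfg is made up for illustration. It resolves the etc/hadoop symlink under the Hadoop install directory and reports whether container-executor.cfg exists in the directory the symlink currently points to.

```python
import os


def check_container_executor_cfg(hadoop_install_dir):
    """Resolve <install_dir>/etc/hadoop and look for container-executor.cfg.

    Returns a tuple (resolved_conf_dir, cfg_exists).
    Hypothetical helper, not part of tdp-collection or Hadoop.
    """
    conf_link = os.path.join(hadoop_install_dir, "etc", "hadoop")
    # Follow the symlink chain to the directory actually in use.
    resolved = os.path.realpath(conf_link)
    cfg_path = os.path.join(resolved, "container-executor.cfg")
    return resolved, os.path.isfile(cfg_path)


if __name__ == "__main__":
    # Install path taken from the error log above.
    install_dir = "/opt/tdp/hadoop-3.1.1-TDP-0.1.0-SNAPSHOT"
    resolved, found = check_container_executor_cfg(install_dir)
    print(f"{install_dir}/etc/hadoop resolves to {resolved}")
    print("container-executor.cfg found" if found else "container-executor.cfg missing")
```

If the analysis above is right, the resolved directory would be /etc/hadoop/conf.rm on a NodeManager host after the Ranger YARN plugin task has run, and the file would be reported as missing.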

Is my analysis correct?

Thank you very much for your advice, C.L.

ChristopheLRTE commented 1 year ago

Hello!

Could someone help me, please? This behaviour is reproducible 😞

Thanks, C.L.

rpignolet commented 6 months ago

#837 fixes the symlink for HDFS and YARN.