linkedin / dr-elephant

Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Apache License 2.0

./compile.sh: line 53: play: command not found #700

Open vempadh opened 4 years ago

vempadh commented 4 years ago

[root@azitshdpmst01p dr-elephant-master]# ./compile.sh
Checking for required programs...
Program requirement is fulfilled!
Using the default configuration
Hadoop Version : 2.3.0
Spark Version : 1.4.0
Other opts set :
############################################################################
npm installation found, we'll compile with the new user interface
############################################################################

Additional error details: Since bower is a user command, there is no need to execute it with superuser permissions. If you're having permission errors when using bower without sudo, please spend a few minutes learning more about how your system should work and make any necessary repairs.

http://www.joyent.com/blog/installing-node-and-npm https://gist.github.com/isaacs/579814

You can however run a command with sudo using "--allow-root" option

vempadh commented 4 years ago

Hi guys, please help me with this. When I run ./compile.sh, I get an error from the play command.

# Run the main command along with the extra commands passed as arguments to compile.sh
echo "Command is: play $OPTS clean compile test $extra_commands"
play_command $OPTS clean compile test $extra_commands
if [ $? -ne 0 ]; then
  echo "Build failed..."
  exit 1;
fi

nelhaj commented 4 years ago

Hi,

You must have play or activator command installed. Please follow Dr-Elephant Setup Instructions : Quick-Setup-Instructions-(Must-Read)
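A quick way to confirm this before re-running compile.sh is to check which launcher is on the PATH. The function below is only a sketch: find_play_launcher is a hypothetical helper name and is not part of compile.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of compile.sh): print the first build
# launcher found on PATH, or fail with a hint on stderr.
find_play_launcher() {
  for candidate in play activator; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "Neither play nor activator is on PATH; install one and add it to PATH" >&2
  return 1
}
```

Calling find_play_launcher in the shell that runs compile.sh shows whether the "play: command not found" error should be expected.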

PS: this is a duplicate of issue 44.

vempadh commented 4 years ago

Hi, the play command works after installation, but I am now facing a different issue.

[info] Compiling 98 Scala sources and 160 Java sources to /usr/local/dr-elephant-master/target/scala-2.10/classes...
[warn] /usr/local/dr-elephant-master/app/org/apache/spark/deploy/history/SparkDataCollection.scala:313: abstract type pattern T is unchecked since it is eliminated by erasure
[warn] seq.foreach { case (item: T) => list.add(item)}
[warn] ^
[warn] one warning found
java.io.IOException: Cannot run program "javac" (in directory "/usr/local/dr-elephant-master"): error=2, No such file or directory

Can you please help me with this?

vempadh commented 4 years ago

[error] (compile:compileIncremental) java.io.IOException: Cannot run program "javac" (in directory "/usr/local/dr-elephant-master"): error=2, No such file or directory
[error] Total time: 22 s, completed Sep 26, 2020 3:51:53 AM
Build failed..

vempadh commented 4 years ago

Can anyone help me with this? When I run the compile.sh script, it throws the error below, which points at the SparkDataCollection file.

[warn] /usr/local/dr-elephant-master/app/org/apache/spark/deploy/history/SparkDataCollection.scala:313: abstract type pattern T is unchecked since it is eliminated by erasure
[warn] seq.foreach { case (item: T) => list.add(item)}
[warn] ^
[warn] one warning found
java.io.IOException: Cannot run program "javac" (in directory "/usr/local/dr-elephant-master"): error=2, No such file or directory

ShubhamGupta29 commented 4 years ago

@vempadh the real issue is

java.io.IOException: Cannot run program "javac" (in directory "/usr/local/dr-elephant-master"): error=2, No such file or directory

For SparkDataCollection there is only a warning, nothing else. I have never faced this issue myself, so I am not sure how to resolve it. From the message, it seems that javac is not available on the PATH. Kindly check by compiling and running any simple Java program.
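The message says the build could not start javac, which ships with a JDK rather than with a runtime-only Java install. A minimal pre-flight check might look like the sketch below; check_javac is a hypothetical helper name, and the hint about JAVA_HOME assumes a conventional JDK layout.

```shell
#!/bin/sh
# Hypothetical pre-flight check: the build needs the JDK compiler (javac),
# not only the java runtime.
check_javac() {
  if command -v javac >/dev/null 2>&1; then
    echo "javac found: $(command -v javac)"
    return 0
  fi
  # Assumption: a JDK exposes javac under $JAVA_HOME/bin.
  echo "javac not found on PATH; install a JDK and add its bin directory to PATH" >&2
  return 1
}
```

Running check_javac as the same user, in the same shell that launches compile.sh, matters because the PATH can differ between users and login sessions.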

vempadh commented 4 years ago

@ShubhamGupta29 : Thanks for your reply.

We resolved the issue and started Dr.Elephant, and we can see the Web UI as well. But we don't find any jobs in the UI. How do we get Spark and Hive jobs to show up in the Dr.Elephant web UI?

Any suggestions on this?

vempadh commented 4 years ago

I am facing the error below, and I am unable to see the Spark and Hive jobs in the UI.

2020-09-29 09:00:33,745 - [ERROR] - from play.nettyException in New I/O worker #7
Exception caught in Netty
java.lang.IllegalArgumentException: invalid version format: H￲ᆪテ&>Aᅢ￰L뫼 ᅧPノ;}ᅵEWテ￲*)+ᅯ￷ᄅDテ>Eニワ(E￐￷E￞ ￀+￀/￀,￀0ᅩ로ᄄ￀￀ワン/5モ▒
at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:102) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:189) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:101) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[io.netty.netty-3.8.0.Final.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_262]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_262]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_262]

vempadh commented 4 years ago

Hi all, can you please help me with the issue below?

09-29-2020 12:40:51 ERROR [dr-el-executor-thread-0] com.linkedin.drelephant.ElephantRunner : Failed to analyze TEZ application_1599981649784_4661 java.io.FileNotFoundException: http://azitshdpmst03p.ecolab.com:8188/ws/v1/timeline/TEZ_APPLICATION/tez_application_1599981649784_4661

Caused by: java.net.UnknownHostException: null

ShubhamGupta29 commented 4 years ago

@vempadh it seems that the endpoint Dr.Elephant is trying to call is not available. Kindly check whether any information exists at the link shown above as the failure reason.
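One way to run that check from the Dr.Elephant host is to request the same timeline URL and look at the HTTP status. This is a sketch only: tez_timeline_url and probe_timeline are hypothetical helper names, the URL layout is copied from the log line above, and the probe assumes curl is installed.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the timeline URL seen in the error log.
# $1 = timeline server host:port, $2 = YARN application id
tez_timeline_url() {
  echo "http://$1/ws/v1/timeline/TEZ_APPLICATION/tez_$2"
}

# Hypothetical probe: print only the HTTP status code for that URL.
# Assumes curl is available on the Dr.Elephant host.
probe_timeline() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(tez_timeline_url "$1" "$2")"
}
```

For example, probe_timeline azitshdpmst03p.ecolab.com:8188 application_1599981649784_4661 prints the status code the timeline server returns for the application named in the log.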

ShubhamGupta29 commented 4 years ago

Regarding the Netty "invalid version format" error that @vempadh posted above:

What was the reason for this issue, and how did you resolve it? It would be good if you could mention the details, so that any other user who faces this issue is unblocked by your resolution. Thanks.

vempadh commented 4 years ago

@ShubhamGupta29 : This issue is still not resolved. Can you please help me with it? I checked all the related GitHub links and didn't find any solution.

[error] p.nettyException - Exception caught in Netty
java.lang.IllegalArgumentException: invalid version format: ¥Vᄡᅡラ'Nᄎマ■ᅣ9▒%ᅵᆬVラᅦ"￀+￀/￀,￀0ᅩ로ᄄ￀￀ワン/5
at org.jboss.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:102) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:189) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:101) ~[io.netty.netty-3.8.0.Final.jar:na]
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500) ~[io.netty.netty-3.8.0.Final.jar:na]
[error] p.nettyException - Exception caught in Netty
java.lang.IllegalArgumentException: invalid version format: ᅨ;ᅧ○¥ᄅ궈AK2�/"ᄎᄎ￀+￀/￀,￀0ᅩ로ᄄ￀￀ワン/5

ShubhamGupta29 commented 4 years ago

@vempadh check whether you are calling Dr.Elephant's REST endpoint over HTTPS. If so, try plain HTTP instead.
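The unreadable bytes after "invalid version format" are consistent with a client starting a TLS handshake against a plain-HTTP port, since a TLS handshake record begins with the bytes 0x16 0x03. The sketch below checks a captured request for that prefix. looks_like_tls is a hypothetical helper, and it assumes the first bytes of an incoming request have already been saved to a file (how they are captured is outside this sketch).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file in $1 starts like a TLS
# handshake record (content type 0x16, protocol major version 0x03).
looks_like_tls() {
  # od prints the first two bytes as hex; tr strips spaces and newlines.
  first_bytes=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
  [ "$first_bytes" = "1603" ]
}
```

If looks_like_tls succeeds on a request that hit the Dr.Elephant port, some client (a browser, a monitoring probe, a proxy) is probably using https:// against an HTTP-only listener.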

vempadh commented 4 years ago

@ShubhamGupta29 : I have logged in to the Dr.Elephant web UI over HTTP only.
Dr.Elephant does not work over HTTPS.

Can you please provide more information about the Dr.Elephant REST API, and do I need to make any configuration changes in Dr.Elephant?

ShubhamGupta29 commented 4 years ago

Kindly attach dr-elephant/logs/application.log and dr-elephant/dr.log here. Also mention the changes you made to Dr.Elephant, especially if you changed the Play version. I am still of the opinion that Dr.Elephant is receiving HTTPS requests instead of plain HTTP.

vempadh commented 4 years ago

@ShubhamGupta29 : I have attached the requested logs. I didn't make any HTTPS or HTTP changes in Dr.Elephant. dr.log application.log

vempadh commented 4 years ago

image

ShubhamGupta29 commented 4 years ago

Your UI is loading, so you don't need to worry about the Netty logs for now. The main issue is that no jobs or applications are shown in Dr.Elephant. Check the logs at ../dr-elephant/logs/elephant/dr_elephant.log. If Dr.Elephant is processing applications, you will see log lines of the pattern: Analysis of application_12345565_4546456 took N.
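A rough way to scan the log for that pattern is to count matching lines. The sketch below also counts the "Failed to analyze" lines that appear in the errors reported earlier in this thread. summarize_drelephant_log is a hypothetical helper name, and the exact log wording is assumed from the messages quoted here.

```shell
#!/bin/sh
# Hypothetical helper: count successful and failed analyses in a
# dr_elephant.log file. $1 = path to the log file.
summarize_drelephant_log() {
  log_file=$1
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  analyzed=$(grep -c 'Analysis of application_' "$log_file" || true)
  failed=$(grep -c 'Failed to analyze' "$log_file" || true)
  echo "analyzed=$analyzed failed=$failed"
}
```

A result with analyzed=0 suggests Dr.Elephant is not picking up applications at all, while a high failed count points at the fetchers instead.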

vempadh commented 4 years ago

@ShubhamGupta29 : Thank you! Yes, Dr.Elephant is processing applications. But we are facing this issue: Failed to analyze SPARK application_1599981649784_3819

I have attached the dr_elephant.log file. dr_elephant.log

ShubhamGupta29 commented 4 years ago

@vempadh it is quite evident that Dr.E could not get the Spark History Server URL from SPARK_HOME or SPARK_CONF_DIR (assuming they are set properly). Kindly check for spark.yarn.historyServer.address in the Spark configs. Also, for the TEZ URL mentioned in the logs, check why it does not exist.
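To see what value is actually configured, the property can be read straight out of spark-defaults.conf. This is a sketch: spark_history_address is a hypothetical helper name, it assumes the whitespace-separated "key value" form of spark-defaults.conf, and the /etc/spark2/conf fallback is only an example default.

```shell
#!/bin/sh
# Hypothetical helper: print the configured Spark History Server address.
# $1 = optional path to spark-defaults.conf; otherwise SPARK_CONF_DIR is
# used, with /etc/spark2/conf as an assumed fallback.
spark_history_address() {
  conf_file=${1:-${SPARK_CONF_DIR:-/etc/spark2/conf}/spark-defaults.conf}
  # Assumes "key value" lines separated by whitespace; comment lines
  # starting with "#" never match because their first field is "#".
  awk '$1 == "spark.yarn.historyServer.address" { print $2 }' "$conf_file"
}
```

An empty result means the property is missing from the file that was read, which is worth comparing against the SPARK_CONF_DIR seen by the user running Dr.Elephant.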

vempadh commented 4 years ago

@ShubhamGupta29 : The parameter you mentioned is configured in the spark-defaults.conf file as shown below.

spark.yarn.historyServer.address azitshdpgtw01p.ecolab.com:18081

Is it correct?

vempadh commented 4 years ago

@ShubhamGupta29 : Yes, I have set both SPARK_HOME and SPARK_CONF_DIR properly.
[root@azitshdpmst01p ~]# echo $SPARK_CONF_DIR
/etc/spark2/conf
[root@azitshdpmst01p ~]# echo $SPARK_HOME
/usr/hdp/current/spark2-client

I have attached a few XML files, including the fetcher XML file.

XML files.zip

vempadh commented 4 years ago

image

vempadh commented 4 years ago

@shahrukhkhan489 : We are using timeline service version 1.5.

vempadh commented 4 years ago

@ShubhamGupta29 : In YARN, we configured the parameters below. image

Can you please help me with this?