linkedin / dr-elephant

Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Apache License 2.0

Compiling with CentOS 7, Hadoop 2.7.3 and Spark 1.6.2 doesn't work #435

Open abishek-sampath opened 6 years ago

abishek-sampath commented 6 years ago

I used compile.conf with the Hadoop version set to 2.7.3 and the Spark version set to 1.6.1. Compilation fails because some of the tests fail.
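For reference, the compile.conf I used looked roughly like this (variable names assumed to follow the default file shipped with the repo; adjust if your copy differs):

```shell
# compile.conf -- build versions (names are an assumption based on the
# defaults in the dr-elephant repo, which ship as hadoop_version=2.3.0
# and spark_version=1.4.0)
hadoop_version=2.7.3
spark_version=1.6.1
```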

SparkUtilsTest:
[info] SparkUtils
[info]   .fileSystemAndPathForEventLogDir
[info]   - returns a filesystem + path based on uri from fetcherConfg *** FAILED ***
[info]     java.lang.IllegalArgumentException: java.net.UnknownHostException: nn1.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:63)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$4$$anon$1.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:41)
[info]     ...
[info]     Cause: java.net.UnknownHostException: nn1.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:63)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$4$$anon$1.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:41)
[info]     ...
[info]   - returns a webhdfs filesystem + path based on spark.eventLog.dir when it is a webhdfs URL *** FAILED ***
[info]     java.lang.IllegalArgumentException: java.net.UnknownHostException: nn1.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:68)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$5$$anon$2.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:57)
[info]     ...
[info]     Cause: java.net.UnknownHostException: nn1.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:68)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$5$$anon$2.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:57)
[info]     ...
[info]   - returns a webhdfs filesystem + path based on spark.eventLog.dir when it is an hdfs URL *** FAILED ***
[info]     java.lang.IllegalArgumentException: java.net.UnknownHostException: nn1.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:70)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$6$$anon$3.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:71)
[info]     ...
[info]     Cause: java.net.UnknownHostException: nn1.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:70)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$6$$anon$3.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:71)
[info]     ...
[info]   - returns a webhdfs filesystem + path based on dfs.nameservices and spark.eventLog.dir when the latter is a path and the dfs.nameservices is configured and available *** FAILED ***
[info]     java.lang.IllegalArgumentException: java.net.UnknownHostException: sample-ha2.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:77)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$7$$anon$4.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:91)
[info]     ...
[info]     Cause: java.net.UnknownHostException: sample-ha2.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:77)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$7$$anon$4.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:91)
[info]     ...
[info]   - returns a webhdfs filesystem + path based on dfs.nameservices and spark.eventLog.dir when the latter is a path and the dfs.nameservices is configured but unavailable *** FAILED ***
[info]     java.lang.IllegalArgumentException: java.net.UnknownHostException: sample.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:77)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$8$$anon$5.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:117)
[info]     ...
[info]     Cause: java.net.UnknownHostException: sample.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:77)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$8$$anon$5.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:117)
[info]     ...
[info]   - returns a webhdfs filesystem + path based on dfs.namenode.http-address and spark.eventLog.dir when the latter is a path and dfs.nameservices is not configured *** FAILED ***
[info]     java.lang.IllegalArgumentException: java.net.UnknownHostException: sample.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:77)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$9$$anon$6.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:139)
[info]     ...
[info]     Cause: java.net.UnknownHostException: sample.grid.example.com
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
[info]     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:395)
[info]     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:178)
[info]     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
[info]     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
[info]     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
[info]     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
[info]     at com.linkedin.drelephant.util.SparkUtils$class.fileSystemAndPathForEventLogDir(SparkUtils.scala:77)
[info]     at com.linkedin.drelephant.util.SparkUtilsTest$$anonfun$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$9$$anon$6.fileSystemAndPathForEventLogDir(SparkUtilsTest.scala:139)
[info]     ...
[info]   - throws an exception when spark.eventLog.dir is a path and no namenode is configured at all
[info]   .pathAndCodecforEventLog
[info]   - returns the path and codec for the event log, given the base path and app/attempt information
[info]   - returns the path and codec for the event log, given the base path and appid. Extracts attempt and codec from path
[info]   .withEventLog
[info]   - loans the input stream for the event log
[info] com.linkedin.drelephant.util.SparkUtilsTest
[info] x SparkUtils .fileSystemAndPathForEventLogDir returns a filesystem + path based on uri from fetcherConfg
[info] x SparkUtils .fileSystemAndPathForEventLogDir returns a webhdfs filesystem + path based on spark.eventLog.dir when it is a webhdfs URL
[info] x SparkUtils .fileSystemAndPathForEventLogDir returns a webhdfs filesystem + path based on spark.eventLog.dir when it is an hdfs URL
[info] x SparkUtils .fileSystemAndPathForEventLogDir returns a webhdfs filesystem + path based on dfs.nameservices and spark.eventLog.dir when the latter is a path and the dfs.nameservices is configured and available
[info] x SparkUtils .fileSystemAndPathForEventLogDir returns a webhdfs filesystem + path based on dfs.nameservices and spark.eventLog.dir when the latter is a path and the dfs.nameservices is configured but unavailable
[info] x SparkUtils .fileSystemAndPathForEventLogDir returns a webhdfs filesystem + path based on dfs.namenode.http-address and spark.eventLog.dir when the latter is a path and dfs.nameservices is not configured
[info] + SparkUtils .fileSystemAndPathForEventLogDir throws an exception when spark.eventLog.dir is a path and no namenode is configured at all
[info] + SparkUtils .pathAndCodecforEventLog returns the path and codec for the event log, given the base path and app/attempt information
[info] + SparkUtils .pathAndCodecforEventLog returns the path and codec for the event log, given the base path and appid. Extracts attempt and codec from path
[info] + SparkUtils .withEventLog loans the input stream for the event log
[info] 
[info] 
[info] Total for test com.linkedin.drelephant.util.SparkUtilsTest
[info] Finished in 0.007 seconds
[info] 10 tests, 6 failures, 0 errors

......

Tests: succeeded 141, failed 6, canceled 0, ignored 0, pending 0
[info] *** 6 TESTS FAILED ***
[error] Failed: Total 465, Failed 6, Errors 0, Passed 453, Skipped 6
[error] Failed tests:
[error]     com.linkedin.drelephant.util.SparkUtilsTest
[error] (test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 52 s, completed Sep 18, 2018 1:20:42 PM
+ cd target/universal
./compile.sh: line 152: cd: target/universal: No such file or directory
++ /bin/ls '*.zip'
/bin/ls: cannot access *.zip: No such file or directory
+ ZIP_NAME=
+ unzip
UnZip 6.00 of 20 April 2009, by Info-ZIP.  Maintained by C. Spieler.  Send
bug reports using http://www.info-zip.org/zip-bug.html; see README for details.

Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
  Default action is to extract files in list, except those in xlist, to exdir;
  file[.zip] may be a wildcard.  -Z => ZipInfo mode ("unzip -Z" for usage).

  -p  extract files to pipe, no messages     -l  list files (short format)
  -f  freshen existing files, create none    -t  test compressed archive data
  -u  update files, create if necessary      -z  display archive comment only
  -v  list verbosely/show version info       -T  timestamp archive to latest
  -x  exclude files that follow (in xlist)   -d  extract files into exdir
modifiers:
  -n  never overwrite existing files         -q  quiet mode (-qq => quieter)
  -o  overwrite files WITHOUT prompting      -a  auto-convert any text files
  -j  junk paths (do not make directories)   -aa treat ALL files as text
  -U  use escapes for all non-ASCII Unicode  -UU ignore any Unicode fields
  -C  match filenames case-insensitively     -L  make (some) names lowercase
  -X  restore UID/GID info                   -V  retain VMS version numbers
  -K  keep setuid/setgid/tacky permissions   -M  pipe through "more" pager
  -O CHARSET  specify a character encoding for DOS, Windows and OS/2 archives
  -I CHARSET  specify a character encoding for UNIX and other archives

See "unzip -hh" or unzip.txt for more help.  Examples:
  unzip data1 -x joe   => extract all files except joe from zipfile data1.zip
  unzip -p foo | more  => send contents of foo.zip via pipe into program more
  unzip -fo foo ReadMe => quietly replace existing ReadMe if archive file newer
+ rm
rm: missing operand
Try 'rm --help' for more information.
+ DIST_NAME=
+ chmod +x /bin/dr-elephant
chmod: cannot access ‘/bin/dr-elephant’: No such file or directory
+ sed -i.bak '/declare -r app_classpath/s/.$/:`hadoop classpath`:${ELEPHANT_CONF_DIR}"/' /bin/dr-elephant
sed: can't read /bin/dr-elephant: No such file or directory
+ cp /home/hadoopuser/Work/dr-elephant/scripts/start.sh /bin/
cp: cannot create regular file ‘/bin/start.sh’: Permission denied
+ cp /home/hadoopuser/Work/dr-elephant/scripts/stop.sh /bin/
cp: cannot create regular file ‘/bin/stop.sh’: Permission denied
+ cp -r /home/hadoopuser/Work/dr-elephant/app-conf
cp: missing destination file operand after ‘/home/hadoopuser/Work/dr-elephant/app-conf’
Try 'cp --help' for more information.
+ mkdir /scripts/
mkdir: cannot create directory ‘/scripts/’: File exists
+ cp -r /home/hadoopuser/Work/dr-elephant/scripts/pso /scripts/
cp: cannot create regular file ‘/scripts/pso/pso_param_generation.py’: Permission denied
cp: cannot create regular file ‘/scripts/pso/restartable_pso.py’: Permission denied
+ zip -r .zip

zip error: Nothing to do! (.zip)
+ mv .zip /home/hadoopuser/Work/dr-elephant/dist/
mv: cannot stat ‘.zip’: No such file or directory

But then I tried compiling without compile.conf, i.e. with the default versions (Hadoop 2.3.0 and Spark 1.4.0), and it ran without any errors.

Is it possible that some of the newer Hadoop/Spark versions are not supported? If so, I think it should be mentioned in the README.
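For what it's worth, every failure above is a java.net.UnknownHostException on placeholder hostnames such as nn1.grid.example.com, so the unit tests seem to assume those names at least resolve. One unverified workaround might be mapping them to loopback in /etc/hosts (hostnames copied from the failures above; requires root, and I have not confirmed this makes the tests pass):

```
127.0.0.1 nn1.grid.example.com sample-ha2.grid.example.com sample.grid.example.com
```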

omicron8 commented 5 years ago

As a workaround, edit compile.sh and remove "test $extra_commands" from the following instruction:

play_command $OPTS clean compile test $extra_commands

It should then read:

play_command $OPTS clean compile
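A sketch of that edit as a one-liner (the sed pattern is an assumption about the exact text in compile.sh, so keep the .bak backup):

```shell
# Demonstrate the edit on the line in question before touching the real file.
line='play_command $OPTS clean compile test $extra_commands'
fixed=$(printf '%s\n' "$line" | sed 's/ test \$extra_commands$//')
printf '%s\n' "$fixed"   # → play_command $OPTS clean compile

# On the real file (keeps a compile.sh.bak backup):
# sed -i.bak 's/ test \$extra_commands$//' compile.sh
```

This only skips running the unit tests; it does not fix their DNS-resolution assumptions.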