martinprobson / vagrant-hadoop-hive-spark

Vagrant project to spin up a single node VM running current versions of Hadoop, Hive and Spark
Apache License 2.0

File does not exist: hdfs://node1:8020/user/tez/share/tez.tar.gz #5

Closed YANGLLI closed 5 years ago

YANGLLI commented 6 years ago

When I try to run the wordcount example with hadoop jar, it reports the error in the title.

martinprobson commented 6 years ago

Hi YANGLLI,

1) The Tez jars should be in HDFS under /user/tez. Can you run the following in the VM and check?

    hdfs dfs -ls /user/tez

You should see a whole bunch of Tez jars in that location.

2) In the meantime, as a workaround, can you change the following config entry in the file /usr/local/hadoop-2.7.3/etc/hadoop/mapred-site.xml in the VM from

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn-tez</value> 
    </property>

to

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

Then restart YARN and HDFS (or run vagrant halt followed by vagrant up to restart the VM) and re-run wordcount. Please let me know the results.

Regards,
Martin
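The config change described above can be sketched as a small shell function. This is only an illustration and not part of the project: `use_plain_yarn` is a hypothetical helper name, and the sketch assumes the file contains the `<value>yarn-tez</value>` element exactly as shown above. It keeps a `.bak` copy of the file before rewriting it.

```shell
#!/usr/bin/env bash
# Sketch only: use_plain_yarn is a hypothetical helper, not shipped with the project.
# It rewrites <value>yarn-tez</value> to <value>yarn</value> in the given file.
# Assumption: yarn-tez appears only as the value of mapreduce.framework.name.
use_plain_yarn() {
    local conf="$1"

    # Keep a backup so the original setting can be restored later.
    cp "$conf" "$conf.bak" || return 1

    # Read from the backup and write the modified copy over the original.
    sed 's#<value>yarn-tez</value>#<value>yarn</value>#' "$conf.bak" > "$conf"
}
```

Inside the VM this could be called as `use_plain_yarn /usr/local/hadoop-2.7.3/etc/hadoop/mapred-site.xml` by a user who can write to that file, followed by the restart. Copying the `.bak` file back restores the Tez setting.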

martinprobson commented 6 years ago

I just rebuilt my box and ran a query on Tez with no issues:

vagrant@node1:~$ hive
ls: cannot access '/usr/local/spark/lib/spark-assembly-*.jar': No such file or directory

Logging initialized using configuration in jar:file:/usr/local/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/apache-tez-0.8.5-bin/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive> set hive.execution.engine=tez;
hive> insert into foo values ("goo");
Query ID = vagrant_20180517171523_833944e9-037d-4ec4-8409-167e085928ef
Total jobs = 1
Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1526575764311_0002)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.81 s     
--------------------------------------------------------------------------------
Loading data to table default.foo
Table default.foo stats: [numFiles=2, numRows=2, totalSize=8, rawDataSize=6]
OK
Time taken: 16.67 seconds
hive> 

Can you please check and see if this is still an issue for you?
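One way to check is to test directly for the archive named in the error message. The following is a minimal sketch, assuming the `hdfs` command is on the PATH inside the VM; `tez_archive_present` is a hypothetical helper name, and the default path is simply the one from the issue title.

```shell
#!/usr/bin/env bash
# Sketch only: tez_archive_present is a hypothetical helper name.
# The default path is taken from the error message in the issue title.
tez_archive_present() {
    local path="${1:-/user/tez/share/tez.tar.gz}"

    # `hdfs dfs -test -e PATH` exits with status 0 when PATH exists in HDFS.
    if hdfs dfs -test -e "$path"; then
        echo "found: $path"
    else
        echo "missing: $path"
        return 1
    fi
}
```

Running `tez_archive_present` with no arguments checks the default location. If it reports the archive as missing, listing the directory with `hdfs dfs -ls /user/tez` shows what is actually there.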

YANGLLI commented 6 years ago

Yeah, this issue seems to be solved by your first solution. Thanks!

martinprobson commented 5 years ago

np

martinprobson commented 5 years ago

closed