hbutani / SQLWindowing

SQL Windowing Functions for Hadoop

Parse Error: line 1:15 required (...)+ loop did not match anything at input 'partition' in statement #13

Closed by panfei 12 years ago

panfei commented 12 years ago

Hi guys, what might cause this problem? Thank you.

select * from windowing;
OK
1   cs   1000.0
2   cs   2000.0
3   cs   4000.0
4   ds   3000.0
5   ds   1000.0
6   ds   500.0
7   cs   8000.0
Time taken: 0.731 seconds
hive> from windowing partition by dep order by salary with count() as c select dep, salary, c;
FAILED: Parse Error: line 2:0 required (...)+ loop did not match anything at input 'partition' in statement

panfei commented 12 years ago

[root@test00 ~]# hive --service windowingCli -w /tmp/com.sap.hadoop.windowing-0.0.1-SNAPSHOT-jar-with-dependencies.jar /usr/lib/hadoop/bin/hadoop
Hive history file=/tmp/root/hive_job_log_root_201203271754_343497963.txt
hive-log4j.properties not found
hive> desc test;
OK
id   int
dep   string
salary   float
Time taken: 2.824 seconds
hive> select * from test;
OK
1   cs   1000.0
2   cs   2000.0
3   cs   4000.0
4   ds   3000.0
5   ds   1000.0
6   ds   500.0
7   cs   8000.0
Time taken: 0.482 seconds
hive> wmode windowing;
hive> from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r;
12/03/27 17:55:37 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
12/03/27 17:55:37 INFO metastore.ObjectStore: ObjectStore, initialize called
Failed windowing query from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r
com.sap.hadoop.windowing.WindowingException: javax.jdo.JDOFatalInternalException: Unexpected exception caught.
NestedThrowables: java.lang.reflect.InvocationTargetException
hive>

hbutani commented 12 years ago

Hi,

regards, Harish.

panfei commented 12 years ago

Hi hbutani, thank you for your reply. I started the Hive metastore server as you said, and that did change things, but I seem to have run into another issue:

[root@test00 ~]# hive --service windowingCli -w /tmp/com.sap.hadoop.windowing-0.0.1-SNAPSHOT-jar-with-dependencies.jar
Hive history file=/tmp/root/hive_job_log_root_201203281655_2044112710.txt
hive-log4j.properties not found
hive> desc test;
OK
id   int
dep   string
salary   float
Time taken: 0.602 seconds
hive> select * from test;
OK
1   cs   1000.0
2   cs   2000.0
3   cs   4000.0
4   ds   3000.0
5   ds   1000.0
6   ds   500.0
7   cs   8000.0
Time taken: 0.55 seconds
hive> wmode windowing;
hive> from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r;
12/03/28 16:55:47 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
12/03/28 16:55:47 INFO hive.metastore: Connected to metastore.
12/03/28 16:55:47 INFO hive.log: DDL: struct test { i32 id, string dep, float salary}
Failed windowing query from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r
java.lang.NullPointerException
hive>

panfei commented 12 years ago

This is the stack trace:

2012-03-28 17:31:10,748 ERROR CliDriver (SessionState.java:printError(365)) - Failed windowing query from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r
java.lang.NullPointerException
com.sap.hadoop.windowing.WindowingException: java.lang.NullPointerException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:77)
    at org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrapNoCoerce.callConstructor(ConstructorSite.java:102)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:190)
    at com.sap.hadoop.windowing.cli.WindowingClient.executeQuery(WindowingClient.groovy:61)
    at com.sap.hadoop.windowing.cli.WindowingClient$executeQuery.call(Unknown Source)
    at com.sap.hadoop.windowing.WindowingHiveCliDriver.processCmd(WindowingHiveCliDriver.groovy:132)
    at com.sap.hadoop.windowing.WindowingHiveCliDriver$processCmd.callCurrent(Unknown Source)
    at com.sap.hadoop.windowing.WindowingHiveCliDriver.processLine(WindowingHiveCliDriver.groovy:85)
    at com.sap.hadoop.windowing.WindowingHiveCliDriver$processLine.call(Unknown Source)
    at com.sap.hadoop.windowing.WindowingHiveCliDriver.main(WindowingHiveCliDriver.groovy:255)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

hbutani commented 12 years ago

Hi,

Thanks for testing. The issue in the previous query was that you didn't have an output clause. You need to say something like:

from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r into path='/tmp/wout';

The output of the query needs to be an HDFS path or an existing Hive table (or partition).
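As an aside, what this ranking query computes over the sample data in the thread can be sketched in plain Python. This is purely illustrative and is not the project's implementation; since there are no salary ties in the sample, a row-number-style rank coincides with rank() here.

```python
from itertools import groupby

# sample rows from the thread: (id, dep, salary)
rows = [(1, "cs", 1000.0), (2, "cs", 2000.0), (3, "cs", 4000.0),
        (4, "ds", 3000.0), (5, "ds", 1000.0), (6, "ds", 500.0),
        (7, "cs", 8000.0)]

def rank_by_dep(rows):
    """Partition by dep, order by salary desc, attach a per-partition rank."""
    out = []
    # groupby needs its input sorted by the grouping key first
    ordered = sorted(rows, key=lambda r: (r[1], -r[2]))
    for dep, grp in groupby(ordered, key=lambda r: r[1]):
        # with ties in salary, true rank() semantics would need extra handling
        for rank, (id_, dep_, salary) in enumerate(grp, start=1):
            out.append((dep_, id_, salary, rank))
    return out
```

For the sample data, the top row of each partition is the highest salary in that dep with rank 1, matching the output the windowing query writes to /tmp/wout.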

But you should have gotten a better error message. I've added a fix for this; it is checked in to the src tree. I'm in the middle of a major enhancement and will publish another jar file in a few days.

regards, Harish.

panfei commented 12 years ago

Hi hbutani, thank you very much for your help, it almost works! But there seems to be a hadoop-lzo classpath problem. I configured hadoop-lzo when I set up the test cluster and it works well with lzo-compressed files, but the data in the test table is plain text without any compression, so I think this is some kind of classpath problem. What I want to know is: where do I set the classpath for the windowing extension? I set the Groovy classpath in hadoop-env.sh and that works, but setting hadoop-lzo.jar in hadoop-env.sh does not work. Thank you.

[root@test00 ~]# hive --service windowingCli -w /tmp/com.sap.hadoop.windowing-0.0.1-SNAPSHOT-jar-with-dependencies.jar
Hive history file=/tmp/root/hive_job_log_root_201203291034_1297189402.txt
hive-log4j.properties not found
hive> wmode windowing;
hive> from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r
    > into path='/tmp/wout';
12/03/29 10:34:51 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
12/03/29 10:34:51 INFO hive.metastore: Connected to metastore.
12/03/29 10:34:52 INFO hive.log: DDL: struct test { i32 id, string dep, float salary}
12/03/29 10:34:52 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
12/03/29 10:34:52 INFO hive.metastore: Connected to metastore.
Failed windowing query from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r into path='/tmp/wout'
com.sap.hadoop.windowing.WindowingException: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
hive>

hbutani commented 12 years ago

So I think this is what is going on: the windowing extension spawns its own server process, so jars added in hadoop-env.sh don't end up on its classpath.

Can you try the following: append the extra jars to the -w argument as a colon-separated list, i.e. hive --service windowingCli -w /tmp/com.sap.hadoop.windowing-0.0.1-SNAPSHOT-jar-with-dependencies.jar:<extra jars>

For e.g.: hive --service windowingCli -w /tmp/com.sap.hadoop.windowing-0.0.1-SNAPSHOT-jar-with-dependencies.jar:$HADOOP_HOME/lib/hadoop-lzo-lib.jar

This will spawn the windowing server process with the LZO lib in the classpath.

panfei commented 12 years ago

Hi hbutani, that really works! But there are some other exceptions about lzo:

> from test partition by dep order by dep, salary desc with rank() as r select dep, id, salary, r
> into path='/tmp/wout';

12/03/29 12:32:27 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
12/03/29 12:32:27 INFO hive.metastore: Connected to metastore.
12/03/29 12:32:27 INFO hive.log: DDL: struct test { i32 id, string dep, float salary}
12/03/29 12:32:27 INFO hive.metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
12/03/29 12:32:27 INFO hive.metastore: Connected to metastore.
12/03/29 12:32:27 INFO mapred.FileInputFormat: Total input paths to process : 1
12/03/29 12:32:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/29 12:32:28 INFO mapred.FileInputFormat: Total input paths to process : 1
12/03/29 12:32:28 INFO mapred.JobClient: Running job: job_201203291008_0004
12/03/29 12:32:29 INFO mapred.JobClient:  map 0% reduce 0%
12/03/29 12:32:36 INFO mapred.JobClient:  map 50% reduce 0%
12/03/29 12:32:43 INFO mapred.JobClient: Task Id : attempt_201203291008_0004_m_000000_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:387)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.jav
12/03/29 12:32:46 INFO mapred.JobClient:  map 100% reduce 0%
12/03/29 12:32:55 INFO mapred.JobClient:  map 100% reduce 33%
12/03/29 12:32:58 INFO mapred.JobClient:  map 100% reduce 66%
12/03/29 12:33:02 INFO mapred.JobClient: Task Id : attempt_201203291008_0004_r_000000_0, Status : FAILED
12/03/29 12:33:03 INFO mapred.JobClient:  map 100% reduce 0%
12/03/29 12:33:19 INFO mapred.JobClient:  map 100% reduce 100%
12/03/29 12:33:19 INFO mapred.JobClient: Job complete: job_201203291008_0004
12/03/29 12:33:19 INFO mapred.JobClient: Counters: 27
12/03/29 12:33:19 INFO mapred.JobClient:   Job Counters
12/03/29 12:33:19 INFO mapred.JobClient:     Launched reduce tasks=2
hive>
12/03/29 12:33:19 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=18974
12/03/29 12:33:19 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/03/29 12:33:19 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/03/29 12:33:19 INFO mapred.JobClient:     Launched map tasks=3
12/03/29 12:33:19 INFO mapred.JobClient:     Data-local map tasks=3
12/03/29 12:33:19 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=30418
12/03/29 12:33:19 INFO mapred.JobClient:   FileSystemCounters
12/03/29 12:33:19 INFO mapred.JobClient:     FILE_BYTES_READ=122
12/03/29 12:33:19 INFO mapred.JobClient:     HDFS_BYTES_READ=336
12/03/29 12:33:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=289919
12/03/29 12:33:19 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=97
12/03/29 12:33:19 INFO mapred.JobClient:   Map-Reduce Framework
12/03/29 12:33:19 INFO mapred.JobClient:     Map input records=7
12/03/29 12:33:19 INFO mapred.JobClient:     Reduce shuffle bytes=163
12/03/29 12:33:19 INFO mapred.JobClient:     Spilled Records=14
12/03/29 12:33:19 INFO mapred.JobClient:     Map output bytes=132
12/03/29 12:33:19 INFO mapred.JobClient:     CPU time spent (ms)=7410
12/03/29 12:33:19 INFO mapred.JobClient:     Total committed heap usage (bytes)=587857920
12/03/29 12:33:19 INFO mapred.JobClient:     Map input bytes=83
12/03/29 12:33:19 INFO mapred.JobClient:     Combine input records=0
12/03/29 12:33:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=210
12/03/29 12:33:19 INFO mapred.JobClient:     Reduce input records=7
12/03/29 12:33:19 INFO mapred.JobClient:     Reduce input groups=2
12/03/29 12:33:19 INFO mapred.JobClient:     Combine output records=0
12/03/29 12:33:19 INFO mapred.JobClient:     Physical memory (bytes) snapshot=585056256
12/03/29 12:33:19 INFO mapred.JobClient:     Reduce output records=7
12/03/29 12:33:19 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=7643238400
12/03/29 12:33:19 INFO mapred.JobClient:     Map output records=7

hbutani commented 12 years ago

Looks like the first map attempt failed but succeeded on the next attempt. Doesn't look like a windowing issue. Can you try out more examples?

panfei commented 12 years ago

OK, thank you for your help!