Open raymanrt opened 13 years ago
This is not stupid at all and is probably a real bug. I have not had the time to run it on Pig 0.9.x yet, and it seems that the schema handling changed a bit between 0.8 and 0.9; see for instance this message:
I would suggest you try Pig 0.8 in the meantime. Please leave this issue open while I find the time to fix the schema so that it runs on 0.9 as well.
subscribe (in the meantime, thanks for the Pig 0.8 workaround)
BTW, I would be pleased to merge a pull request if you can make it work on more recent versions of Pig.
@renaud @raymanrt could you check whether @maxjakob's fixes (now merged into master) resolve your issues?
I have a problem with the execution of the following script:
../../../pig-0.9.0/bin/pig -x local -p PIGNLPROC_JAR=target/pignlproc-0.1.0-SNAPSHOT.jar -p LANG=it -p INPUT=wikipedia-xml-chunks/chunk-0001.xml -p OUTPUT=workspace examples/ner-corpus/01_extract_sentences_with_links.pig
where the paths are right and the exception generated is:
2011-08-30 17:47:38,867 [main] INFO org.apache.pig.Main - Logging error messages to: /home/rayman/workspace/pignlproc/pignlproc/pig_1314719258864.log
2011-08-30 17:47:39,006 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2011-08-30 17:47:39,376 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2011-08-30 17:47:39,391 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
Details at logfile: /home/rayman/workspace/pignlproc/pignlproc/pig_1314719258864.log
The log file reports:
Pig Stack Trace
ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Invalid resource schema: bag schema must have tuple as its field
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1652)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1597)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:583)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
    at org.apache.pig.Main.run(Main.java:553)
    at org.apache.pig.Main.main(Main.java:108)
Caused by: Failed to parse: Pig script failed to parse: <file examples/ner-corpus/01_extract_sentences_with_links.pig, line 20, column 30> Failed to generate logical plan. Nested exception: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: <file examples/ner-corpus/01_extract_sentences_with_links.pig, line 15, column 9> Cannot get schema from loadFunc pignlproc.storage.ParsingWikipediaLoader
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:178)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1644)
    ... 9 more
Caused by: <file examples/ner-corpus/01_extract_sentences_with_links.pig, line 20, column 30> Failed to generate logical plan. Nested exception: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: <file examples/ner-corpus/01_extract_sentences_with_links.pig, line 15, column 9> Cannot get schema from loadFunc pignlproc.storage.ParsingWikipediaLoader
    at org.apache.pig.parser.LogicalPlanGenerator.alias_col_ref(LogicalPlanGenerator.java:12992)
    at org.apache.pig.parser.LogicalPlanGenerator.col_ref(LogicalPlanGenerator.java:12854)
    at org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:7789)
    at org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:7549)
    at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:6959)
    at org.apache.pig.parser.LogicalPlanGenerator.cond(LogicalPlanGenerator.java:5894)
    at org.apache.pig.parser.LogicalPlanGenerator.filter_clause(LogicalPlanGenerator.java:5556)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1062)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:638)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:459)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:357)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:171)
    ... 10 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: <file examples/ner-corpus/01_extract_sentences_with_links.pig, line 15, column 9> Cannot get schema from loadFunc pignlproc.storage.ParsingWikipediaLoader
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:154)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
    at org.apache.pig.parser.LogicalPlanGenerator.alias_col_ref(LogicalPlanGenerator.java:12990)
    ... 21 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
    at org.apache.pig.ResourceSchema$ResourceFieldSchema.throwInvalidSchemaException(ResourceSchema.java:213)
    at org.apache.pig.impl.logicalLayer.schema.Schema.getPigSchema(Schema.java:1887)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
    ... 23 more
Unfortunately I don't know anything about pig scripting, and my question may appear a bit stupid. Any help would be appreciated.
Thanks, Riccardo