Closed asfimport closed 8 years ago
Xiang Li: Update parquet-pig/src/main/java/parquet/pig/summary/Summary.java to yield a clearer stack trace.
java.lang.NullPointerException
    at parquet.pig.summary.Summary.setInputSchema(Summary.java:261)
    at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:512)
    at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:113)
    at org.apache.pig.newplan.ReverseDependencyOrderWalkerWOSeenChk.walk(ReverseDependencyOrderWalkerWOSeenChk.java:69)
    at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:807)
    at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:87)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:260)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:295)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
    at org.apache.pig.PigServer.execute(PigServer.java:1364)
    at org.apache.pig.PigServer.access$500(PigServer.java:113)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1689)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:636)
    at parquet.pig.summary.TestSummary.testPigScript(TestSummary.java:139)
Xiang Li: In the Pig code, src/org/apache/pig/EvalFunc.java has a private field, inputSchemaInternal, that represents the schema. A setter and a getter are also provided:
    private Schema inputSchemaInternal=null;

    /**
     * This method is for internal use. It is called by Pig core in both front-end
     * and back-end to setup the right input schema for EvalFunc
     */
    public void setInputSchema(Schema input){
        this.inputSchemaInternal=input;
    }

    /**
     * This method is intended to be called by the user in {@link EvalFunc} to get the input
     * schema of the EvalFunc
     */
    public Schema getInputSchema(){
        return this.inputSchemaInternal;
    }
In practice, however, the setter is overridden. parquet-mr/parquet-pig/src/main/java/parquet/pig/summary/Summary.java keeps the schema in a separate field called inputSchema (as opposed to inputSchemaInternal) and overrides setInputSchema(), but does not override getInputSchema():
51 public class Summary extends EvalFunc<String> implements Algebraic {
54 private Schema inputSchema;
257 @Override
258 public void setInputSchema(Schema input) {
259 try {
260 // relation.bag.tuple
261 this.inputSchema=input.getField(0).schema.getField(0).schema;
262 saveSchemaToUDFContext();
263 } catch (FrontendException e) {
264 throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from " + input, e);
265 } catch (RuntimeException e) {
266 throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from "+input, e);
267 }
268 }
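The mismatch described above can be reproduced in isolation. The following is only a minimal sketch: StubSchema, StubEvalFunc, ShadowingSummary and ownCopy() are hypothetical stand-ins invented for illustration, not the real Pig or Parquet classes. It shows that when a subclass overrides the setter without calling super and stores the value in its own field, the getter inherited from the parent keeps returning null.

```java
// Hypothetical stand-in for Pig's Schema; illustration only.
class StubSchema {
    final String name;

    StubSchema(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for the relevant part of EvalFunc:
// the parent class owns the schema field, the setter and the getter.
class StubEvalFunc {
    private StubSchema inputSchemaInternal = null;

    public void setInputSchema(StubSchema input) {
        this.inputSchemaInternal = input;
    }

    public StubSchema getInputSchema() {
        return this.inputSchemaInternal;
    }
}

// Mimics the pattern in Summary: a second field plus an overriding setter
// that never calls super.setInputSchema().
class ShadowingSummary extends StubEvalFunc {
    private StubSchema inputSchema;

    @Override
    public void setInputSchema(StubSchema input) {
        this.inputSchema = input; // the parent's field is never updated
    }

    // Hypothetical accessor, added only so the demo can inspect the subclass field.
    StubSchema ownCopy() {
        return inputSchema;
    }
}

class ShadowingDemo {
    public static void main(String[] args) {
        ShadowingSummary f = new ShadowingSummary();
        f.setInputSchema(new StubSchema("relation"));
        System.out.println(f.ownCopy() != null);        // prints "true"
        System.out.println(f.getInputSchema() == null); // prints "true"
    }
}
```

Any code that reads the schema through getInputSchema() therefore sees null, even though the setter was called with a non-null value.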
Daniel Dai / @daijyc: The input schema is maintained by Pig inside EvalFunc. There is no need to maintain it on the Parquet side. Patch attached.
Xiang Li: Thanks, Daniel, for taking care of this! +1 for the patch; it is more reasonable to fix this on the Parquet side. The unit tests passed on Parquet 1.8.0.
Hi Julien, could you please give a review?
Thomas Friedrich / @tfriedr: I updated the patch from Daniel: I removed the private inputSchema variable and instead call the getInputSchema method of the parent class. Otherwise inputSchema was always null. @julienledem, can you please review my pull request with the patch?
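The approach described in this comment (no schema field in the subclass, reading through the parent's getter) might look roughly like the sketch below. This is not the actual patch: DemoSchema, DemoEvalFunc, DemoSummary and describeInput() are hypothetical names made up for illustration, and the explicit null check is an assumption about how a clearer error could be reported.

```java
// Hypothetical stand-in for Pig's Schema; illustration only.
class DemoSchema {
    final String name;

    DemoSchema(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for the relevant part of EvalFunc.
class DemoEvalFunc {
    private DemoSchema inputSchemaInternal = null;

    public void setInputSchema(DemoSchema input) {
        this.inputSchemaInternal = input;
    }

    public DemoSchema getInputSchema() {
        return this.inputSchemaInternal;
    }
}

// The subclass keeps no schema field of its own and does not override the
// setter, so the parent's field is the single source of truth.
class DemoSummary extends DemoEvalFunc {

    // Hypothetical method: reads the schema through the inherited getter and
    // fails with an explicit message instead of a bare NullPointerException.
    String describeInput() {
        DemoSchema schema = getInputSchema();
        if (schema == null) {
            throw new IllegalStateException("input schema has not been set");
        }
        return schema.name;
    }
}

class FixDemo {
    public static void main(String[] args) {
        DemoSummary s = new DemoSummary();
        s.setInputSchema(new DemoSchema("relation"));
        System.out.println(s.describeInput()); // prints "relation"
    }
}
```

With a single field owned by the parent, the setter and the getter cannot drift apart, and a missing schema surfaces as a descriptive exception at the point of use.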
Julien Le Dem / @julienledem: Issue resolved by pull request 292 https://github.com/apache/parquet-mr/pull/292
Thomas Friedrich / @tfriedr: Thanks, @julienledem. Shouldn't the fix version be a parquet-mr release, not parquet-format?
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias B
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1694)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:636)
    at parquet.pig.summary.TestSummary.testMaxIsZero(TestSummary.java:154)
    ...
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.RuntimeException: Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from null
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:307)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
    at org.apache.pig.PigServer.execute(PigServer.java:1364)
    at org.apache.pig.PigServer.access$500(PigServer.java:113)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1689)
    ... 32 more
Caused by: java.lang.RuntimeException: Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from null
    at parquet.pig.summary.Summary.setInputSchema(Summary.java:266)
    at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:530)
    at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:132)
    at org.apache.pig.newplan.ReverseDependencyOrderWalkerWOSeenChk.walk(ReverseDependencyOrderWalkerWOSeenChk.java:69)
    at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:808)
    at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:87)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:258)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
    ... 37 more
Caused by: java.lang.NullPointerException
    at parquet.pig.summary.Summary.setInputSchema(Summary.java:261)
    ... 46 more
This relates to a change on the Pig side, in pig/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java, introduced by PIG-3294.
Reporter: Xiang Li
Assignee: Thomas Friedrich / @tfriedr
Note: This issue was originally created as PARQUET-334. Please see the migration documentation for further details.