apache / parquet-java

Apache Parquet Java
https://parquet.apache.org/
Apache License 2.0

Avoid invoking job.toString() in ParquetLoader #1942

Closed by asfimport 8 years ago

asfimport commented 8 years ago

When run in a Hadoop 2 environment with the log level set to DEBUG, ParquetLoader invokes job.toString() in several methods, which can stop the whole application with:

java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
    at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:283)
    at org.apache.hadoop.mapreduce.Job.toString(Job.java:452)
    at java.lang.String.valueOf(String.java:2847)
    at java.lang.StringBuilder.append(StringBuilder.java:128)
    at org.apache.parquet.pig.ParquetLoader.getSchema(ParquetLoader.java:260)
    at org.apache.parquet.pig.TestParquetLoader.testSchema(TestParquetLoader.java:54)
    ...

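The reason is that in the Hadoop 2.x branch, org.apache.hadoop.mapreduce.Job.toString() added an ensureState(JobState.RUNNING) check; see map-reduce: Job.java#452. As the stack trace shows, ParquetLoader reaches toString() through StringBuilder.append, that is, by concatenating the job object into a message string while the job is still in the DEFINE state. The Hadoop 1.x branch contains no such check, so ParquetLoader works there.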

This ticket simply avoids invoking job.toString() in ParquetLoader.
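
For illustration, here is a minimal, self-contained sketch of the failure and one way to avoid it. It does not use Hadoop: FakeJob is a hypothetical stand-in that only mimics the ensureState check described above, and unsafeMessage / safeMessage are made-up helper names. Logging the job name instead of the job object is an assumption about what a fix could look like, not necessarily what was committed.

```java
// Hypothetical stand-in for org.apache.hadoop.mapreduce.Job (not the real class).
// It only mimics the Hadoop 2.x behaviour where toString() requires RUNNING.
class FakeJob {
    enum JobState { DEFINE, RUNNING }

    private final String name;
    private JobState state = JobState.DEFINE;

    FakeJob(String name) { this.name = name; }

    String getJobName() { return name; }

    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    @Override
    public String toString() {
        ensureState(JobState.RUNNING); // the check that breaks callers in DEFINE
        return "Job: " + name;
    }
}

class JobToStringSketch {
    // Problematic pattern: concatenation implicitly calls job.toString().
    static String unsafeMessage(FakeJob job) {
        return "getSchema with job " + job;
    }

    // Safer pattern (an assumption): reference a plain field, never the job itself.
    static String safeMessage(FakeJob job) {
        return "getSchema with job name " + job.getJobName();
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob("pig-load"); // still in DEFINE state
        System.out.println(safeMessage(job));
        try {
            System.out.println(unsafeMessage(job));
        } catch (IllegalStateException e) {
            System.out.println("unsafe message failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the exception comes from building the message, not from logging it, so any string concatenation involving the job object is affected whenever the job has not started running.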

Reporter: Liwei Lin(Inactive) / @lw-lin
Assignee: Liwei Lin(Inactive) / @lw-lin

Note: This issue was originally created as PARQUET-529. Please see the migration documentation for further details.

asfimport commented 8 years ago

Ryan Blue / @rdblue: I committed the fix. Thanks, @lw-lin!