2021-11-22 12:21:08,011 INFO [main] org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader@7a7471ce
java.lang.NullPointerException
at uk.bl.wa.hadoop.mapreduce.lib.input.ByteBlockRecordReader.close(ByteBlockRecordReader.java:52)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:536)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2075)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:809)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
2021-11-22 12:21:08,012 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2021-11-22 12:21:08,019 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2021-11-22 12:21:08,019 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2021-11-22 12:21:08,027 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
at uk.bl.wa.hadoop.mapreduce.lib.input.ByteBlockRecordReader.initialize(ByteBlockRecordReader.java:77)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:561)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
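The NullPointerException logged first is likely a secondary symptom: MapTask calls closeQuietly after initialize has already failed, so close() probably runs against a field that was never assigned. A minimal sketch of a null-guarded close follows. GuardedReader, its fields and its initialize signature are hypothetical stand-ins using only java.io; this is not the actual ByteBlockRecordReader source.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a record reader; not the real ByteBlockRecordReader. */
class GuardedReader implements Closeable {
    private InputStream in; // remains null if initialize() fails before assigning it

    /** Simulates initialize(): may throw before the stream field is set. */
    void initialize(byte[] data, boolean failEarly) {
        if (failEarly) {
            throw new IllegalStateException("simulated failure during initialize");
        }
        in = new ByteArrayInputStream(data);
    }

    boolean isOpen() {
        return in != null;
    }

    @Override
    public void close() {
        if (in == null) {
            return; // nothing was opened, so there is nothing to close
        }
        try {
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            in = null; // makes close() safe to call more than once
        }
    }

    public static void main(String[] args) {
        GuardedReader reader = new GuardedReader();
        try {
            reader.initialize(new byte[0], true);
        } catch (IllegalStateException e) {
            System.out.println("initialize failed: " + e.getMessage());
        } finally {
            reader.close(); // no NullPointerException even though 'in' was never set
        }
    }
}
```

A guard like this would only silence the first, INFO-level trace; the IncompatibleClassChangeError would still need fixing separately.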
Seen when trying to use the HdfsFileHasher (log above). Presumably, given the indexer runs fine, the simplest approach is to port this to the older org.apache.hadoop.mapred API, rather than the org.apache.hadoop.mapreduce API.
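Before porting, it may help to confirm what the cluster classpath actually provides. This IncompatibleClassChangeError usually means the code was compiled against a Hadoop release in which TaskAttemptContext was a class and is running against one in which it is an interface. The sketch below uses plain reflection to report which of the two is on the classpath; ApiShapeCheck and describe are hypothetical names, and it assumes it is run with the same classpath as the failing task.

```java
/** Hypothetical diagnostic helper; reports whether a type is a class or an interface. */
final class ApiShapeCheck {
    private ApiShapeCheck() {
    }

    /** Returns "interface", "class", or "missing" for the named type on the current classpath. */
    static String describe(String className) {
        try {
            // 'false' avoids running static initializers; only the type's shape is needed.
            Class<?> type = Class.forName(className, false, ApiShapeCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskAttemptContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

If this reports "interface" on the cluster, then rebuilding the job against the cluster's Hadoop version might be an alternative to porting to the older API.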