Avro schemas can request different Java string representations, i.e. Utf8 (the default) or String, via the "avro.java.string" property.
Reading a file whose schema sets "avro.java.string": "String" resulted in the following exception:
Error: cascading.tuple.TupleException: unable to read from input identifier: hdfs://ns1/user/perplexa/log/hourly/2014/08/30/12/log.0.11.989586.7779487783.1409400000000.avro
    at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
    at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
    at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
    at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:127)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.util.Utf8
    at cascading.avro.AvroToCascading.fromAvroMap(AvroToCascading.java:109)
    at cascading.avro.AvroToCascading.fromAvro(AvroToCascading.java:83)
    at cascading.avro.AvroToCascading.fromAvroUnion(AvroToCascading.java:140)
    at cascading.avro.AvroToCascading.fromAvro(AvroToCascading.java:62)
    at cascading.avro.AvroToCascading.parseRecord(AvroToCascading.java:49)
    at cascading.avro.AvroScheme.source(AvroScheme.java:254)
    at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:140)
    at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:120)
    ... 10 more
This patch fixes the issue and makes the code work with both Utf8 and String types.
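
For reviewers, here is a rough sketch of the general idea, not the literal diff. Both java.lang.String and org.apache.avro.util.Utf8 implement CharSequence, so calling toString() instead of casting to Utf8 handles either representation. The class name StringTypeSketch and the helpers normalizeMap and toJavaValue are hypothetical names used only for illustration, and the sketch avoids the Avro dependency so it can be run on its own; any CharSequence stands in for Utf8 here.

```java
import java.util.HashMap;
import java.util.Map;

public class StringTypeSketch {

    // Hypothetical helper: converts an Avro-style map whose keys may be
    // java.lang.String or another CharSequence (such as Avro's Utf8)
    // into a map with plain String keys, without casting to a concrete type.
    static Map<String, Object> normalizeMap(Map<?, ?> avroMap) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<?, ?> entry : avroMap.entrySet()) {
            // toString() is safe for String and for any other CharSequence,
            // whereas a hard cast to one concrete type fails for the other.
            String key = entry.getKey().toString();
            out.put(key, toJavaValue(entry.getValue()));
        }
        return out;
    }

    // Hypothetical helper: string-like values become java.lang.String,
    // everything else is passed through unchanged.
    static Object toJavaValue(Object value) {
        return (value instanceof CharSequence) ? value.toString() : value;
    }
}
```

With this approach the reader no longer depends on which string type the writer's schema requested.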