In AvroSerDe, hive.output.file.extension is forced to ".avro".
if (configuration == null) {
  LOG.info("Configuration null, not inserting schema");
} else {
  // force output files to have a .avro extension
  configuration.set("hive.output.file.extension", ".avro");
  configuration.set(HAIVVREO_SCHEMA, schema.toString(false));
}
If I query an Avro-backed table, or join an Avro-backed table with a non-Avro table, the result is written through TextInputFormat and LazySimpleSerDe. This usually won't cause problems until you set hive.exec.compress.output=true: TextInputFormat uses the file extension to determine the compression codec, so it treats a .avro file as plain text even though the file is actually deflate- or snappy-compressed.
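To make the failure mode concrete, here is a minimal sketch (not the actual Hadoop source) of the CompressionCodecFactory-style lookup that maps a file-name suffix to a codec; the class, map contents, and method names are illustrative. Because ".avro" is not a registered compression suffix, a deflate-compressed file named "000000_0.avro" resolves to no codec and its bytes are read back as uncompressed text, producing the garbage rows shown below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of extension-based codec resolution.
public class ExtensionCodecLookup {
    // Hypothetical suffix-to-codec registry, analogous to what
    // CompressionCodecFactory builds from the configured codecs.
    private static final Map<String, String> SUFFIX_TO_CODEC = new LinkedHashMap<>();
    static {
        SUFFIX_TO_CODEC.put(".deflate", "DeflateCodec");
        SUFFIX_TO_CODEC.put(".snappy", "SnappyCodec");
        SUFFIX_TO_CODEC.put(".gz", "GzipCodec");
    }

    static String codecFor(String fileName) {
        for (Map.Entry<String, String> e : SUFFIX_TO_CODEC.entrySet()) {
            if (fileName.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        // No matching suffix: the reader assumes the bytes are uncompressed.
        return null;
    }

    public static void main(String[] args) {
        // A compressed file with its natural suffix is decoded correctly...
        System.out.println(codecFor("000000_0.deflate")); // DeflateCodec
        // ...but the forced ".avro" suffix hides the compression from the reader.
        System.out.println(codecFor("000000_0.avro"));    // null
    }
}
```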
Load some data into the table, then run the following commands:
hive> set hive.output.file.extension=false;
hive> select user_name, age from haivvreo_players;
....
OK
john 34
ben 15
jean 17
Time taken: 8.886 seconds
hive> set hive.output.file.extension=true;
hive> select user_name, age from haivvreo_players;
...
OK
x� NULL
x�����c46�JJ�c44��JM��\L�� NULL
Time taken: 9.247 seconds