databricks / spark-avro

Avro Data Source for Apache Spark
http://databricks.com/
Apache License 2.0

Do not ignore avro files without extensions by default #286

Closed MaxGekk closed 6 years ago

MaxGekk commented 6 years ago

What changes were proposed in this pull request?

This PR changes the default behaviour of the Avro data source, which currently ignores files without the .avro extension when reading. It sets the default value of avro.mapred.ignore.inputs.without.extension to false when the parameter is not set by the user.
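To illustrate the intended behaviour, here is a minimal, self-contained sketch of how the flag could be resolved. It is not the code from this patch: the object `AvroExtensionFilter`, its methods, and the use of a plain `Map` in place of the Hadoop configuration are all hypothetical. Only the property name comes from the description above.

```scala
// Hypothetical sketch, not the spark-avro implementation.
object AvroExtensionFilter {
  // Property name as given in the PR description.
  val IgnoreKey = "avro.mapred.ignore.inputs.without.extension"

  // Resolve the flag from a key/value config (a stand-in for the Hadoop conf).
  // If the user has not set the property, fall back to `false`,
  // which is the new default proposed in this PR.
  def ignoreFilesWithoutExtension(conf: Map[String, String]): Boolean =
    conf.get(IgnoreKey).map(_.trim.toBoolean).getOrElse(false)

  // Keep every path unless the flag is explicitly enabled,
  // in which case only paths ending in ".avro" are kept.
  def selectFiles(paths: Seq[String], conf: Map[String, String]): Seq[String] =
    if (ignoreFilesWithoutExtension(conf)) paths.filter(_.endsWith(".avro"))
    else paths
}
```

With an empty config, `selectFiles(Seq("part-0.avro", "part-1"), Map.empty)` keeps both paths. Setting the property to `"true"` restores the previous behaviour and drops `part-1`.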

How was this patch tested?

Added a test file in Avro format without an extension, and a new test that reads the file both with and without a specified schema.

codecov-io commented 6 years ago

Codecov Report

Merging #286 into master will increase coverage by 0.33%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #286      +/-   ##
==========================================
+ Coverage   92.23%   92.56%   +0.33%     
==========================================
  Files           5        5              
  Lines         322      323       +1     
  Branches       43       41       -2     
==========================================
+ Hits          297      299       +2     
+ Misses         25       24       -1

gengliangwang commented 6 years ago

Left some comments in https://github.com/apache/spark/pull/21769 .

MaxGekk commented 6 years ago

@gengliangwang I ported the commit https://github.com/apache/spark/commit/ba437fc5c73b95ee4c59327abf3161c58f64cb12 . Please take another look at the PR.

gengliangwang commented 6 years ago

Thanks, merging to master.