zalando-incubator / spark-json-schema

JSON schema parser for Apache Spark
MIT License

Add types #49

Open hesserp opened 3 years ago

hesserp commented 3 years ago

Add support for decimal type #43 and timestamp type #37

The decimal type takes two optional parameters, precision and range (Spark's defaults are 10 and 0). Example:

"decimal_field": {
  "type": "decimal",
  "precision": 38,
  "range": 18
}
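As a rough sketch of what the example above is expected to produce on the Spark side: the snippet below builds the corresponding field by hand. It assumes that the PR's "range" key plays the role of Spark's decimal scale and that the field is nullable; neither is confirmed by this thread, and the object name DecimalFieldSketch is hypothetical.

```scala
import org.apache.spark.sql.types.{DecimalType, StructField, StructType}

object DecimalFieldSketch {
  // Values taken from the JSON example above.
  val precision = 38
  // Assumption: "range" in the JSON schema corresponds to Spark's scale.
  val range = 18

  // Hand-built schema that the converted JSON schema would presumably match.
  // Assumption: nullable = true; the real converter may decide this differently.
  val expected: StructType = StructType(Seq(
    StructField("decimal_field", DecimalType(precision, range), nullable = true)
  ))

  // Spark's own default when no parameters are given: DecimalType(10, 0).
  val default: DecimalType = DecimalType.USER_DEFAULT
}
```

Comparing such a hand-built StructType with the converter's output is one way to check the mapping in a unit test.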

When using a timestamp field, you may provide a format when loading the data, e.g.:

sparkSession.read.schema(schema).options(Map("timestampFormat" -> "yyyy-MM-dd HH:mm:ss")).json("path/to/data")
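A minimal, self-contained sketch of the same read call, for trying the option out locally. The schema is built by hand here rather than through the converter, and the field name created_at, the sample record, and the object name TimestampFormatSketch are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{StructField, StructType, TimestampType}

object TimestampFormatSketch {
  // Hypothetical schema with a single timestamp column.
  val schema: StructType = StructType(Seq(
    StructField("created_at", TimestampType, nullable = true)
  ))

  // Reads one in-memory JSON record using the custom timestamp format.
  def load(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val lines = Seq("""{"created_at": "2021-03-04 05:06:07"}""").toDS()
    spark.read
      .schema(schema)
      .option("timestampFormat", "yyyy-MM-dd HH:mm:ss")
      .json(lines)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("timestamp-format-sketch")
      .getOrCreate()
    val df = load(spark)
    df.printSchema()
    df.show(truncate = false)
    spark.stop()
  }
}
```

With the default (permissive) JSON parse mode, a value that does not match the given format is generally read as null rather than raising an error, so it is worth checking a few rows after loading.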

codecov[bot] commented 3 years ago

Codecov Report

Merging #49 (08cae3c) into master (d5d8677) will increase coverage by 1.50%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
+ Coverage   91.56%   93.06%   +1.50%     
==========================================
  Files           1        1              
  Lines          83      101      +18     
  Branches        1        1              
==========================================
+ Hits           76       94      +18     
  Misses          7        7              

Impacted Files | Coverage Δ
...org/zalando/spark/jsonschema/SchemaConverter.scala | 93.06% <100.00%> (+1.50%)

zzeekk commented 2 years ago

Support for the timestamp and decimal data types would be very useful. Is there a plan to merge this PR?