apache / drill

Apache Drill is a distributed MPP query layer for self describing data
https://drill.apache.org/
Apache License 2.0

DRILL-8450: Add Data Type Inference to XML Format Plugin #2819

Closed cgivre closed 1 year ago

cgivre commented 1 year ago

Description

This PR adds data type inference to the XML format plugin. As with other plugins, it adds a new configuration parameter, allTextMode, which, when set to true, reads all data as strings. The default is true, so inference only takes effect when allTextMode is set to false. Note that inference is limited to doubles, dates, timestamps, booleans, and strings.
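
To illustrate the behavior described above, here is a minimal Python sketch of string-to-type inference gated by an `all_text_mode` flag. It is not the plugin's Java implementation: the helper name `infer_value`, the order of the checks, and the accepted date and timestamp formats (ISO 8601 only) are all assumptions made for illustration.

```python
from datetime import date, datetime


def infer_value(text, all_text_mode=True):
    """Hypothetical helper: convert an XML text value to an inferred type.

    Mirrors the flag described in this PR: when all_text_mode is True,
    every value is returned unchanged as a string.
    """
    if all_text_mode:
        return text

    stripped = text.strip()

    # Booleans: assumed to be the literals "true"/"false", case-insensitive.
    lowered = stripped.lower()
    if lowered in ("true", "false"):
        return lowered == "true"

    # Doubles: anything Python's float() accepts (an assumption; the
    # plugin's numeric rules may differ).
    try:
        return float(stripped)
    except ValueError:
        pass

    # Dates: assumed ISO 8601 calendar dates such as 2023-01-15.
    try:
        return date.fromisoformat(stripped)
    except ValueError:
        pass

    # Timestamps: assumed ISO 8601 date-times such as 2023-01-15T10:30:00.
    try:
        return datetime.fromisoformat(stripped)
    except ValueError:
        pass

    # Fall back to the original string when nothing else matches.
    return text
```

With the default `all_text_mode=True`, `infer_value("42")` returns the string unchanged; passing `all_text_mode=False` lets the same input come back as a float.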

Documentation

Updated README

Testing

Added unit test.

cgivre commented 1 year ago

Converting to draft. There's a unit test failing in the HTTP plugin.

cgivre commented 1 year ago

@mbeckerle Unit tests fixed. I also added the data type inference for APIs that generate XML.
@jnturton, the CI is still failing with that Kerberos issue.

cgivre commented 1 year ago

@mbeckerle Could you please take another look? I had to fix a few things for a unit test. Thx!

cgivre commented 1 year ago

@mbeckerle @jnturton Are we OK to merge this? I'll add support for arrays in a separate PR.

jnturton commented 1 year ago

LGTM