apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.47k stars 1.01k forks source link

Stop changing the case for COPY TO option values #10853

Open tinfoil-knight opened 2 weeks ago

tinfoil-knight commented 2 weeks ago

Describe the bug

Currently, we're changing the case of every option value passed with the COPY statement:

https://github.com/apache/datafusion/blob/3773fb7fb54419f889e7d18b73e9eb48069eb08e/datafusion/sql/src/statement.rs#L891-L900

This causes an issue in cases where the option values for some external integrations might be case sensitive (eg. like access keys).

@xinlifoobar found this while working on adding an integration to huggingface in https://github.com/apache/datafusion/pull/10792.

See previous discussion here: https://github.com/apache/datafusion/pull/9723#discussion_r1632879391

To Reproduce

No response

Expected behavior

There should be some way to pass option values as is for certain integrations.

Additional context

No response

alamb commented 2 weeks ago

fyi @metesynnada

xinlifoobar commented 2 weeks ago

Sorry if I mislead you. I encountered this issue in the CREATE EXTERNAL TABLE statement as

https://github.com/apache/datafusion/blob/3773fb7fb54419f889e7d18b73e9eb48069eb08e/datafusion/sql/src/statement.rs#L1017-L1025

How has such an issue typically been resolved previously? If adding a ParseOptions, e.g., enable_options_value_normalization, is a reasonable way to fix it, I could help there.

https://github.com/apache/datafusion/blob/47026a2a3dd41a5c87e44ade58d91a89feba147b/datafusion/sql/src/planner.rs#L97-L100

berkaysynnada commented 2 weeks ago

I believe we can remove the lowercase conversion here and instead add it when we set ConfigField's (via alter_with_string_hash_map) for the cases where we can ignore case sensitivity.