trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Problem with case sensitivity in hive.system.sync_partition_metadata #22180

Open mosabua opened 1 month ago

mosabua commented 1 month ago

As reported by @dprophet

The commit https://github.com/trinodb/trino/pull/18241/files#diff-964e30ffd5f7093745c2f198f2688af86d7c2677fcb7f406a75e4c9a59492991L172-L178

broke the procedure hive.system.sync_partition_metadata

https://trino.io/docs/current/connector/hive.html#procedures

Specifically, the line

https://github.com/trinodb/trino/blob/master/plugin/trino-hive/src/main/java/io/trino/plugin/hive/procedure/SyncPartitionMetadataProcedure.java#L148

does not deal with the case sensitivity of the data on S3 versus the lowercase names stored in the Hive metastore.
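
To illustrate the kind of mismatch described here, the following is a minimal sketch and not Trino's implementation. The function name `diff_partitions`, the `case_sensitive` flag as modeled here, and the sample partition paths are all hypothetical. It assumes partition paths of the form `col=value/col=value` and assumes the metastore holds the column names in lowercase.

```python
def diff_partitions(fs_paths, metastore_names, case_sensitive=True):
    """Compare partition paths found on the file system with metastore names.

    Returns (to_add, to_drop). Hypothetical helper for illustration only.
    """
    def key(path):
        # Normalize only the column-name half of each "col=value" segment;
        # partition values keep their original case.
        parts = []
        for segment in path.split("/"):
            name, sep, value = segment.partition("=")
            parts.append((name if case_sensitive else name.lower()) + sep + value)
        return "/".join(parts)

    # Map the comparison key back to the original file system path.
    fs_keys = {key(p): p for p in fs_paths}
    # Assumption: metastore names already use lowercase column names.
    ms_keys = set(metastore_names)

    to_add = sorted(p for k, p in fs_keys.items() if k not in ms_keys)
    to_drop = sorted(n for n in ms_keys if n not in fs_keys)
    return to_add, to_drop
```

With a plain string comparison, a path such as `Year=2024/Month=01` and a metastore entry `year=2024/month=01` are treated as two different partitions, so the same partition shows up both as one to add and as one to drop. Lowercasing the column names before comparing makes them match.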

mosabua commented 1 month ago

I mentioned this to @electrum as the file system lead and author of the offending commit.

dprophet commented 1 month ago

This is a screen capture of the debugger for the original 422 behavior

[Image: hive_sync_case_sensitive_422]

The line of code in question is

https://github.com/trinodb/trino/blob/a258399b8433c9bf03f96033e8d296c74fed6a57/plugin/trino-hive/src/main/java/io/trino/plugin/hive/procedure/SyncPartitionMetadataProcedure.java#L157