prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Always return NULL for columns with uppercase letters #4215

Closed. soulmachine closed this issue 5 years ago

soulmachine commented 8 years ago

I have a Hive external table backed by Parquet files on S3. The table schema is:

CREATE EXTERNAL TABLE IF NOT EXISTS app_logs(
class STRING,
exception_class STRING,
flume_node STRING,
host STRING,
loglevel STRING,
microseconds BIGINT,
parser STRING,
path STRING,
smsdelivery_AccountSid STRING,
smsdelivery_AnsweredBy STRING,
smsdelivery_ApiVersion STRING,
smsdelivery_CallSid STRING,
smsdelivery_CallbackSource STRING,
smsdelivery_Called BIGINT,
smsdelivery_CalledCountry STRING,
smsdelivery_Caller BIGINT,
smsdelivery_CallerCity STRING,
smsdelivery_CallerCountry STRING,
smsdelivery_CallerState STRING,
smsdelivery_CallerZip BIGINT,
smsdelivery_Direction STRING,
smsdelivery_Duration BIGINT,
smsdelivery_From BIGINT,
smsdelivery_FromCity STRING,
smsdelivery_FromCountry STRING,
smsdelivery_FromState STRING,
smsdelivery_FromZip BIGINT,
smsdelivery_SequenceNumber BIGINT,
smsdelivery_SipCallId STRING,
smsdelivery_SipResponseCode BIGINT,
smsdelivery_Timestamp STRING,
smsdelivery_ToCountry STRING,
smsdelivery_altSpn BIGINT,
smsdelivery_attemptNumber BIGINT,
smsdelivery_callDuration BIGINT,
smsdelivery_callStatus STRING,
smsdelivery_class_message STRING,
smsdelivery_countryCode STRING,
smsdelivery_doneDate STRING,
smsdelivery_errorMessage STRING,
smsdelivery_errorReason STRING,
smsdelivery_hlrCountryCode STRING,
smsdelivery_hlrMCCMNC STRING,
smsdelivery_language STRING,
smsdelivery_locale STRING,
smsdelivery_mccmnc STRING,
smsdelivery_messageContent STRING,
smsdelivery_messageId STRING,
smsdelivery_msgType STRING,
smsdelivery_networkName STRING,
smsdelivery_nexmo_call_id STRING,
smsdelivery_nexmo_caller_id BIGINT,
smsdelivery_phoneNumber BIGINT,
smsdelivery_phoneType STRING,
smsdelivery_phonesCount BIGINT,
smsdelivery_platformType STRING,
smsdelivery_price DOUBLE,
smsdelivery_provider STRING,
smsdelivery_providerName STRING,
smsdelivery_providerNumber BIGINT,
smsdelivery_queryStatus BIGINT,
smsdelivery_session_accountid BIGINT,
smsdelivery_session_calledid STRING,
smsdelivery_session_callerid STRING,
smsdelivery_session_parentsessionid STRING,
smsdelivery_session_sessionid STRING,
smsdelivery_session_virtualplatform STRING,
smsdelivery_skippedPhonesCount BIGINT,
smsdelivery_spn BIGINT,
smsdelivery_spnPhoneType STRING,
smsdelivery_state STRING,
smsdelivery_status STRING,
smsdelivery_statusCode STRING,
smsdelivery_submitDate STRING,
smsdelivery_userName STRING,
smsdelivery_validationCode BIGINT,
thread STRING,
time_since_start BIGINT,
`timestamp` BIGINT,
type STRING
)
PARTITIONED BY (year INT, month INT, day INT, hour INT)
STORED AS PARQUET
LOCATION 's3a://chef-logstash-app-structured/new/';

My query is:

select smsdelivery_countryCode,smsdelivery_phoneType from app_logs where year=2015 and month=12 and day=20 and smsdelivery_countryCode='US' limit 10;

This query returns rows when run in the Hive CLI, but returns zero rows when run in the Presto CLI.

I've set hive.parquet.use-column-names=true in file etc/catalog/hive.properties.

I suspect my problem is caused by the issue "Add support for case sensitive identifiers" (#2863).

Am I correct?

saileshmittal commented 8 years ago

@soulmachine Yes. We saw the same problem and had to disable hive.parquet.use-column-names to make it work.

Another case-sensitivity issue, which fails regardless of the flag above, involves nested structs. Consider a column abc STRUCT<withCaps:STRING, without_caps:STRING> in a table. Accessing abc.withCaps from Presto always raises a SemanticException because it cannot resolve the withcaps field.
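
For readers trying to picture the top-level column problem discussed above, here is a minimal Python sketch of the suspected mismatch. It is not Presto's reader code. It assumes that the engine sees lowercased column names (the Hive metastore stores them that way) while the Parquet file keeps the original mixed-case field names. The function names and the sample row values are hypothetical and exist only for illustration.

```python
# Illustrative model only: NOT Presto's actual implementation.
# Assumption: the table's column names reach the engine lowercased, while the
# Parquet file footer preserves the original mixed-case field names.

# Field names as written in the Parquet file (case preserved).
parquet_fields = ["host", "smsdelivery_countryCode", "smsdelivery_phoneType"]

# One row of file data keyed by Parquet field name (hypothetical sample values).
row = {
    "host": "node-1",
    "smsdelivery_countryCode": "US",
    "smsdelivery_phoneType": "mobile",
}

# Column names as the engine sees them, in the same order as the file fields.
table_columns = [name.lower() for name in parquet_fields]


def read_by_name_exact(column):
    """Name-based lookup with an exact, case-sensitive match.

    Returns None (modelling NULL) when no file field has exactly this name.
    """
    return row.get(column)


def read_by_name_ignore_case(column):
    """Name-based lookup that ignores case when matching file fields."""
    for field in parquet_fields:
        if field.lower() == column.lower():
            return row[field]
    return None


def read_by_position(column):
    """Position-based lookup: the column's ordinal in the table picks the field."""
    index = table_columns.index(column)
    return row[parquet_fields[index]]


print(read_by_name_exact("smsdelivery_countrycode"))
print(read_by_name_ignore_case("smsdelivery_countrycode"))
print(read_by_position("smsdelivery_countrycode"))
```

In this model the exact name lookup yields NULL for smsdelivery_countrycode, so a predicate such as smsdelivery_countryCode='US' never matches, which is consistent with the zero rows reported above. The position-based lookup does not depend on names at all, which would explain why disabling hive.parquet.use-column-names works as a workaround, at the cost of requiring the table's column order to match the file's.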

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.