trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Required field 'uncompressed_page_size' was not found in serialized data! Struct: #2256

Closed. sib19 closed this issue 2 weeks ago

sib19 commented 4 years ago

Hi, please see the error below when accessing Parquet data; kindly help. I have attached a sample file from a Kaggle download, and testing it produces the same error in the Presto CLI.

I am using version 326 ...

io.prestosql.spi.PrestoException: can not read class org.apache.parquet.format.PageHeader: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@33fc99f6
        at io.prestosql.plugin.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:167)
        at io.prestosql.spi.block.LazyBlock$LazyData.load(LazyBlock.java:378)
        at io.prestosql.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:357)
        at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:275)
        at io.prestosql.spi.Page.getLoadedPage(Page.java:261)
        at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:290)
        at io.prestosql.operator.Driver.processInternal(Driver.java:379)
        at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
        at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
        at io.prestosql.operator.Driver.processFor(Driver.java:276)
        at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
        at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
        at io.prestosql.$gen.Presto_326____20191205_193016_2.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@33fc99f6
        at org.apache.parquet.format.Util.read(Util.java:216)
        at org.apache.parquet.format.Util.readPageHeader(Util.java:65)
        at io.prestosql.parquet.reader.ParquetColumnChunk.readPageHeader(ParquetColumnChunk.java:57)
        at io.prestosql.parquet.reader.ParquetColumnChunk.readAllPages(ParquetColumnChunk.java:67)
        at io.prestosql.parquet.reader.ParquetReader.readPrimitive(ParquetReader.java:256)
        at io.prestosql.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:310)
        at io.prestosql.parquet.reader.ParquetReader.readBlock(ParquetReader.java:293)
        at io.prestosql.plugin.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:161)
        ... 16 more
Caused by: io.prestosql.hive.$internal.parquet.org.apache.thrift.protocol.TProtocolException: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@33fc99f6
        at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:1055)
        at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:966)
        at org.apache.parquet.format.PageHeader.read(PageHeader.java:843)
        at org.apache.parquet.format.Util.read(Util.java:213)
        ... 23 more
findepi commented 4 years ago

@sib19 sorry for the delay.

Can you provide SHOW CREATE TABLE (from Hive or Presto) for this table? (I know in theory I can derive this information from the Parquet file itself, but I don't have it automated and the file schema is pretty long. Also, I might derive something different from what you actually have in your metastore.)

Can you provide the failing query? Is it SELECT * FROM table or something else?

sib19 commented 4 years ago

Hi Pio

Thanks for your debugging.

I usually try SELECT * FROM the table, or listing all the column names, but I get the same error either way. Please find the CREATE TABLE script below:

presto:default> show create table example_db.example_2;
                         Create Table                        
--------------------------------------------------------------
CREATE TABLE hive.example_db.example_2 (                 
    instanceid_userid integer,                               
    instanceid_objecttype varchar,                           
    instanceid_objectid integer,                             
    audit_pos bigint,                                        
    audit_clienttype varchar,                                
    audit_timestamp bigint,                                  
    audit_timepassed bigint,                                 
    audit_experiment varchar,                                
    audit_resourcetype bigint,                               
    metadata_ownerid integer,                                
    metadata_ownertype varchar,                              
    metadata_createdat bigint,                               
    metadata_authorid integer,                               
    metadata_applicationid bigint,                           
    metadata_numcompanions integer,                          
    metadata_numphotos integer,                              
    metadata_numpolls integer,                               
    metadata_numsymbols integer,                             
    metadata_numtokens integer,                              
    metadata_numvideos integer,                              
    metadata_platform varchar,                               
    metadata_totalvideolength integer,                       
    metadata_options array(varchar),                         
    relationsmask bigint,                                    
    userownercounters_user_feed_remove double,               
    userownercounters_user_profile_view double,              
    userownercounters_vote_poll double,                      
    userownercounters_user_send_message double,              
    userownercounters_user_delete_message double,            
    userownercounters_user_internal_like double,             
    userownercounters_user_internal_unlike double,           
    userownercounters_user_status_comment_create double,     
    userownercounters_photo_comment_create double,           
    userownercounters_movie_comment_create double,           
    userownercounters_user_photo_album_comment_create double,
    userownercounters_comment_internal_like double,          
    userownercounters_user_forum_message_create double,      
    userownercounters_photo_mark_create double,              
    userownercounters_photo_view double,                     
    userownercounters_photo_pin_batch_create double,         
    userownercounters_photo_pin_update double,               
    userownercounters_user_present_send double,              
    userownercounters_unknown double,                        
    userownercounters_create_topic double,                   
    userownercounters_create_image double,                   
    userownercounters_create_movie double,                   
    userownercounters_create_comment double,                 
    userownercounters_create_like double,                    
    userownercounters_text double,                           
    userownercounters_image double,                          
    userownercounters_video double,                          
    ownerusercounters_user_feed_remove double,               
    ownerusercounters_user_profile_view double,              
    ownerusercounters_vote_poll double,                      
    ownerusercounters_user_send_message double,              
    ownerusercounters_user_delete_message double,            
    ownerusercounters_user_internal_like double,             
    ownerusercounters_user_internal_unlike double,           
    ownerusercounters_user_status_comment_create double,     
    ownerusercounters_photo_comment_create double,           
    ownerusercounters_movie_comment_create double,           
    ownerusercounters_user_photo_album_comment_create double,
    ownerusercounters_comment_internal_like double,          
    ownerusercounters_user_forum_message_create double,      
    ownerusercounters_photo_mark_create double,              
    ownerusercounters_photo_view double,                     
    ownerusercounters_photo_pin_batch_create double,         
    ownerusercounters_photo_pin_update double,               
    ownerusercounters_user_present_send double,              
    ownerusercounters_unknown double,                        
    ownerusercounters_create_topic double,                   
    ownerusercounters_create_image double,                   
    ownerusercounters_create_movie double,                   
    ownerusercounters_create_comment double,                 
    ownerusercounters_create_like double,                    
    ownerusercounters_text double,                           
    ownerusercounters_image double,                          
    ownerusercounters_video double,                          
    membership_status varchar,                               
    membership_statusupdatedate bigint,                      
    membership_joindate bigint,                              
    membership_joinrequestdate bigint,                       
    owner_create_date bigint,                                
    owner_birth_date integer,                                
    owner_gender integer,                                    
    owner_status integer,                                    
    owner_id_country bigint,                                 
    owner_id_location integer,                               
    owner_is_active integer,                                 
    owner_is_deleted integer,                                
    owner_is_abused integer,                                 
    owner_is_activated integer,                              
    owner_change_datime bigint,                              
    owner_is_semiactivated integer,                          
    owner_region integer,                                    
    user_create_date bigint,                                 
    user_birth_date integer,                                 
    user_gender integer,                                     
    user_status integer,                                     
    user_id_country bigint,                                  
    user_id_location integer,                                
    user_is_active integer,                                  
    user_is_deleted integer,                                 
    user_is_abused integer,                                  
    user_is_activated integer,                               
    user_change_datime bigint,                               
    user_is_semiactivated integer,                           
    user_region integer,                                     
    date varchar,                                            
    objectid integer,                                        
    auditweights_agems double,                               
    auditweights_closed double,                              
    auditweights_ctr_gender double,                          
    auditweights_ctr_high double,                            
    auditweights_ctr_negative double,                        
    auditweights_dailyrecency double,                        
    auditweights_feedowner_recommended_group double,         
    auditweights_feedstats double,                           
    auditweights_friendcommentfeeds double,                  
    auditweights_friendcommenters double,                    
    auditweights_friendlikes double,                         
    auditweights_friendlikes_actors double,                  
    auditweights_hasdetectedtext double,                     
    auditweights_hastext double,                             
    auditweights_ispymk double,                              
    auditweights_israndom double,                            
    auditweights_likersfeedstats_hyper double,               
    auditweights_likerssvd_prelaunch_hyper double,           
    auditweights_matrix double,                              
    auditweights_notoriginalphoto double,                    
    auditweights_numdislikes double,                         
    auditweights_numlikes double,                            
    auditweights_numshows double,                            
    auditweights_onlinevideo double,                         
    auditweights_partage double,                             
    auditweights_partctr double,                             
    auditweights_partsvd double,                             
    auditweights_processedvideo double,                      
    auditweights_relationmasks double,                       
    auditweights_source_live_top double,                     
    auditweights_source_movie_top double,                    
    auditweights_svd_prelaunch double,                       
    auditweights_svd_spark double,                           
    auditweights_userage double,                             
    auditweights_userowner_create_comment double,            
    auditweights_userowner_create_image double,              
    auditweights_userowner_create_like double,               
    auditweights_userowner_image double,                     
    auditweights_userowner_movie_comment_create double,      
    auditweights_userowner_photo_comment_create double,      
    auditweights_userowner_photo_mark_create double,         
    auditweights_userowner_photo_view double,                
    auditweights_userowner_text double,                      
    auditweights_userowner_unknown double,                   
    auditweights_userowner_user_delete_message double,       
    auditweights_userowner_user_feed_remove double,          
    auditweights_userowner_user_forum_message_create double, 
    auditweights_userowner_user_internal_like double,        
    auditweights_userowner_user_internal_unlike double,      
    auditweights_userowner_user_present_send double,         
    auditweights_userowner_user_profile_view double,         
    auditweights_userowner_user_send_message double,         
    auditweights_userowner_user_status_comment_create double,
    auditweights_userowner_video double,                     
    auditweights_userowner_vote_poll double,                 
    auditweights_x_actorsrelations bigint,                   
    auditweights_likerssvd_spark_hyper double,               
    auditweights_source_promo double                         
)                                                           
WITH (                                                      
    external_location = '/user/hive/example_2',      
    format = 'PARQUET'                                       
)                                                           
(1 row)
findepi commented 4 years ago

I did:

it didn't fail for me.

@sib19,

sib19 commented 4 years ago

Hi Pio

Thanks for your quick reply and testing the data.

This is very interesting. I set up presto-server 326 from the tar.gz distribution, which I hoped included all the dependent jar files. I am not able to build from your commit; kindly provide a tar.gz installation package and I will check again with the SELECT query.

Moreover, many Hive tables have multiple active Parquet files under a single table directory.

Please provide the tar.gz installation format; that would be very helpful.

findepi commented 4 years ago

@sib19 you can build a .tar.gz from source by running this:

./mvnw -pl '!presto-server-rpm,!presto-docs,!presto-proxy,!presto-testing-server-launcher,!presto-verifier' clean install -TC1 -DskipTests
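
Assuming a stock checkout, the built server tarball should then land under presto-server/target/ (named something like presto-server-326.tar.gz, though the exact name depends on the version in the POM).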

> Moreover, many Hive tables have multiple active Parquet files under a single table directory.

Sure, this is typical. Can you somehow identify which file is causing the problem? (Ideally this should be part of the exception message, but it isn't 😞 )

E.g. you can remove half of the files and see if the problem persists; continue bisecting until you identify the one file that's causing the problem.
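
As an alternative to manually halving the directory, a direct per-file scan can pinpoint the broken file. Below is a minimal sketch using pyarrow (an assumption on my part, not a tool used in this thread; the directory path and file pattern are placeholders for the table's external_location):

    # Scan every Parquet file under the table directory and report any that
    # fail to decode. Reading a row group forces each page header (including
    # uncompressed_page_size) to be parsed, unlike a footer-only check.
    import glob
    import pyarrow.parquet as pq

    for path in sorted(glob.glob("/user/hive/example_2/*.parquet")):
        try:
            pf = pq.ParquetFile(path)
            for i in range(pf.num_row_groups):
                pf.read_row_group(i)
            print("OK     ", path)
        except Exception as e:
            print("BROKEN ", path, "-", e)

A file flagged BROKEN here would strongly suggest the corruption is in the data itself rather than in Presto's reader.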

sib19 commented 4 years ago

Pio

Due to my restricted environment I am not able to download the Maven jars for this build, and only after that could we make the tar.gz as per your command.

Please provide an installation file like the ones under prestosql.io/download.html.

Yes, sure, I will try the SELECT against each file.

Thanks

findepi commented 4 years ago

Sure, I made the .tar.gz snapshot build temporarily available for you -- https://www.dropbox.com/sh/ilz4yoqg7wtwg45/AAD_-SPuzPlX8bXvHc43M8eMa?dl=0

sib19 commented 4 years ago

Thanks Pio, you are so great.

Please keep the file available for another 5-6 hours; I will download it from Dropbox on my home network. My company environment is restricted and does not allow Dropbox access. Thanks

ebyhr commented 4 years ago

Unfortunately, I couldn't reproduce this on 326. If you still face the issue with the above .tar.gz, could you share the DDL from Spark (Hive) to confirm the full table properties? Also, the Hadoop and Hive versions may be helpful.

findepi commented 4 years ago

@sib19 I deleted the file you attached previously, in case it contains anything sensitive. In the future, you can share a file confidentially with me over Slack -- https://prestosql.io/slack.html

In the meantime, I believe the issue is not Presto's fault but rather a corrupted Parquet file. Let me close the issue. You can still reopen it or comment here, or we can talk on Slack.

hashhar commented 4 years ago

@sib19 If you're still here, can you please confirm whether the data source you were querying has some ETL job running on it that rewrites files? Or whether you were using some kind of caching layer (e.g. Rubix)?

I ran into this issue today, and in my case it happened (most probably) due to Rubix. After dropping the cached files for that table, the error went away. I ran into this with Presto 333.
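
For context, the cache referred to here is the Rubix-based filesystem cache in the Hive connector; it is toggled per catalog with hive.cache.enabled and keeps its on-disk contents under hive.cache.location (both properties appear in the config posted below), so deleting that directory's contents is the "dropping the cached files" workaround.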

sib19 commented 4 years ago

Hi hashhar, thanks for the information. Yes, I experienced the same kind of problem: the underlying data platform jars did not support reading huge table data. The vendor has fixed the issue; no changes were needed on the Presto side.

erikcw commented 2 years ago

I'm experiencing the same problem on Trino 381 (on Kubernetes, using the community container image). Everything works fine until I activate caching for the Hive connector. Then, after a few successful queries, they start failing with:

io.trino.spi.TrinoException: Failed reading parquet data; source= s3://*************/master-db/master-email-20201002/part-00617-95db1ec9-45be-47c3-9eb8-5d2906ed5389-c000.snappy.parquet; can not read class org.apache.parquet.format.PageHeader: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@2e0bf6f0
    at io.trino.plugin.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:217)
    at io.trino.spi.block.LazyBlock$LazyData.load(LazyBlock.java:400)
    at io.trino.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:379)
    at io.trino.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:286)
    at io.trino.operator.project.DictionaryAwarePageProjection$DictionaryAwarePageProjectionWork.setupDictionaryBlockProjection(DictionaryAwarePageProjection.java:211)
    at io.trino.operator.project.DictionaryAwarePageProjection$DictionaryAwarePageProjectionWork.lambda$getResult$0(DictionaryAwarePageProjection.java:197)
    at io.trino.spi.block.LazyBlock$LazyData.load(LazyBlock.java:400)
    at io.trino.spi.block.LazyBlock$LazyData.getFullyLoadedBlock(LazyBlock.java:379)
    at io.trino.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:286)
    at io.trino.operator.project.PageProcessor$ProjectSelectedPositions.processBatch(PageProcessor.java:345)
    at io.trino.operator.project.PageProcessor$ProjectSelectedPositions.process(PageProcessor.java:208)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils.lambda$flatten$7(WorkProcessorUtils.java:296)
    at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:338)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils.lambda$flatten$7(WorkProcessorUtils.java:296)
    at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:338)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:325)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:240)
    at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$3(WorkProcessorUtils.java:219)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:240)
    at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$4(WorkProcessorUtils.java:234)
    at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:391)
    at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:150)
    at io.trino.operator.Driver.processInternal(Driver.java:410)
    at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
    at io.trino.operator.Driver.tryWithLock(Driver.java:698)
    at io.trino.operator.Driver.process(Driver.java:305)
    at io.trino.operator.Driver.processForDuration(Driver.java:276)
    at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1092)
    at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
    at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:488)
    at io.trino.$gen.Trino_381____20220523_050344_2.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: can not read class org.apache.parquet.format.PageHeader: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@2e0bf6f0
    at org.apache.parquet.format.Util.read(Util.java:365)
    at org.apache.parquet.format.Util.readPageHeader(Util.java:132)
    at org.apache.parquet.format.Util.readPageHeader(Util.java:127)
    at io.trino.parquet.reader.ParquetColumnChunk.readPageHeader(ParquetColumnChunk.java:78)
    at io.trino.parquet.reader.ParquetColumnChunk.readAllPages(ParquetColumnChunk.java:91)
    at io.trino.parquet.reader.ParquetReader.createPageReader(ParquetReader.java:385)
    at io.trino.parquet.reader.ParquetReader.readPrimitive(ParquetReader.java:365)
    at io.trino.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:441)
    at io.trino.parquet.reader.ParquetReader.readBlock(ParquetReader.java:424)
    at io.trino.plugin.hive.parquet.ParquetPageSource$ParquetBlockLoader.load(ParquetPageSource.java:211)
    ... 42 more
Caused by: io.trino.hive.$internal.parquet.org.apache.thrift.protocol.TProtocolException: Required field 'uncompressed_page_size' was not found in serialized data! Struct: org.apache.parquet.format.PageHeader$PageHeaderStandardScheme@2e0bf6f0
    at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:1108)
    at org.apache.parquet.format.PageHeader$PageHeaderStandardScheme.read(PageHeader.java:1019)
    at org.apache.parquet.format.PageHeader.read(PageHeader.java:896)
    at org.apache.parquet.format.Util.read(Util.java:362)
    ... 51 more

Here is my hive catalog config WITHOUT caching enabled (queries work just fine):

    hive.metastore-refresh-interval=50ms
    connector.name=hive-hadoop2
    hive.metastore-cache-ttl=50ms
    hive.non-managed-table-writes-enabled=true
    hive.hdfs.impersonation.enabled=true
    hive.recursive-directories=true
    hive.parquet.use-column-names=true
    hive.metastore=glue

And WITH caching enabled (causes the exception):

    hive.metastore-refresh-interval=50ms
    connector.name=hive-hadoop2
    hive.metastore-cache-ttl=50ms
    hive.non-managed-table-writes-enabled=true
    hive.hdfs.impersonation.enabled=false
    hive.recursive-directories=true
    hive.parquet.use-column-names=true
    hive.metastore=glue
    hive.cache.enabled=true
    hive.cache.location=/mnt/trino-cache
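
Aside from the two hive.cache.* lines, note that this second config also flips hive.hdfs.impersonation.enabled from true to false (presumably a requirement of the caching layer), so the two setups differ in slightly more than just caching.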

And here is the Hive DDL for the table:

CREATE EXTERNAL TABLE `master_email`(
  `email` string COMMENT '', 
  `sha256_lower` string COMMENT '')
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://***********/master-db/master-email-20201002/'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='false', 
  'STATS_GENERATED_VIA_STATS_TASK'='workaround for potential lack of HIVE-12730', 
  'has_encrypted_data'='false', 
  'last_modified_by'='hadoop', 
  'last_modified_time'='1601685042', 
  'numFiles'='0', 
  'numRows'='1700455799', 
  'spark.sql.create.version'='2.2 or prior', 
  'spark.sql.sources.schema.numParts'='1', 
  'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"email\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\"}},{\"name\":\"sha256_lower\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\"}}]}', 
  'totalSize'='0', 
  'transient_lastDdlTime'='1601685042')
And here is the parquet-tools metadata for the file named in the error, fetched locally:

% parquet-tools meta /tmp/part-00617-95db1ec9-45be-47c3-9eb8-5d2906ed5389-c000.snappy.parquet
file:         file:/tmp/part-00617-95db1ec9-45be-47c3-9eb8-5d2906ed5389-c000.snappy.parquet
creator:      parquet-mr version 1.10.1 (build 0f9df43deffb88fd7f8666d63691842330f46bd9)
extra:        org.apache.spark.version = 3.0.0
extra:        org.apache.spark.sql.parquet.row.metadata = {"type":"struct","fields":[{"name":"email","type":"string","nullable":true,"metadata":{}},{"name":"sha256_lower","type":"string","nullable":true,"metadata":{}}]}

file schema:  spark_schema
--------------------------------------------------------------------------------
email:        OPTIONAL BINARY L:STRING R:0 D:1
sha256_lower: OPTIONAL BINARY L:STRING R:0 D:1

row group 1:  RC:1574299 TS:149868008 OFFSET:4
--------------------------------------------------------------------------------
email:         BINARY SNAPPY DO:0 FPO:4 SZ:30340825/42801253/1.41 VC:1574299 ENC:PLAIN,BIT_PACKED,RLE ST:[min: "2357259a5a25d8f40e6ae87224a1c13f *******@******.com ""[vp[e>0\'_f`j(+{{$lt@zd_s6&rir", max: ******@******m.net, num_nulls: 0]
sha256_lower:  BINARY SNAPPY DO:0 FPO:30340829 SZ:103870684/107066755/1.03 VC:1574299 ENC:PLAIN,BIT_PACKED,RLE ST:[min: **********, max: ********.us", num_nulls: 0]

row group 2:  RC:1259161 TS:117656247 OFFSET:134211513
--------------------------------------------------------------------------------
email:         BINARY SNAPPY DO:0 FPO:134211513 SZ:20885557/32040363/1.53 VC:1259161 ENC:PLAIN,BIT_PACKED,RLE ST:[min: ********@comcast.com, max: **********@yahoo.com, num_nulls: 0]
sha256_lower:  BINARY SNAPPY DO:0 FPO:155097070 SZ:83155069/85615884/1.03 VC:1259161 ENC:PLAIN,BIT_PACKED,RLE ST:[min: com", max: ystic@msn.com", num_nulls: 0]
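
Worth noting: parquet-tools meta parses only the file footer, while the page header named in the exception sits in front of each data page and is parsed only when the column data is actually decoded, so a clean meta dump does not rule out page-level corruption (for example in a partially written cache copy). A quick check that forces a full decode of the downloaded copy, sketched with pyarrow (an assumption, not a tool used in this thread):

    # Decode every row group of the local copy; a corrupt or truncated page
    # header should raise here even though the footer metadata looks intact.
    import pyarrow.parquet as pq

    pf = pq.ParquetFile("/tmp/part-00617-95db1ec9-45be-47c3-9eb8-5d2906ed5389-c000.snappy.parquet")
    for i in range(pf.num_row_groups):
        table = pf.read_row_group(i)
        print("row group", i, "-", table.num_rows, "rows decoded")

If this passes on the copy fetched straight from S3 while the same query still fails through Trino, that points at the caching layer rather than the source file.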
pritamkdey commented 1 year ago

Facing the same issue here. Is there any known resolution to this apart from deleting the cache every time?

suryanshagnihotri commented 11 months ago

I am also seeing the same error on Trino 389, even when the Hive cache is not enabled. @hashhar, do we have any resolution for this?