eshishki opened 5 months ago
@eshishki What version are you using? Could you add "enable_iceberg_metadata_cache" = "true" to the properties of the CREATE EXTERNAL CATALOG statement and try again?
@DorianZheng I'm on 3.2.8 and my Iceberg catalog is created with:
"enable_iceberg_metadata_cache" = "true",
"iceberg_table_cache_ttl_sec" = "60",
"iceberg_meta_cache_ttl_sec" = "60");
On every show-statistics invocation I see the following in the log, for every table:
INFO (starrocks-mysql-nio-pool-361|205129) [MetadataMgr.lambda$static$0():111] Evict cache due to EXPIRED and deregister query-level connector metadata on query id: 8c37609b-2f60-11ef-bf87-02f604b9c98d
INFO (starrocks-mysql-nio-pool-361|205129) [MetadataMgr$QueryMetadatas.getConnectorMetadata():96] Succeed to register query level connector metadata [catalog:iceberg, queryId: f3ee551c-2f95-11ef-bf87-02f604b9c98d]
INFO (starrocks-mysql-nio-pool-361|205129) [CatalogUtil.loadFileIO():302] Loading custom FileIO implementation: com.starrocks.connector.iceberg.io.IcebergCachingFileIO
INFO (starrocks-mysql-nio-pool-361|205129) [BaseMetastoreTableOperations.refreshFromMetadataLocation():199] Refreshing table metadata from new version: s3://xxx/03167-0b00fc28-344d-4bd6-bfbc-3073eec5ae78.metadata.json
INFO (starrocks-mysql-nio-pool-361|205129) [CatalogUtil.loadFileIO():302] Loading custom FileIO implementation: org.apache.iceberg.aws.s3.S3FileIO
INFO (starrocks-mysql-nio-pool-361|205129) [BaseMetastoreCatalog.loadTable():67] Table loaded by catalog: xxx
and then there are also a lot of warnings like this for each IO request:
2024-06-21 06:21:47.047Z WARN (Finalizer|3) [ResolvingFileIO.finalize():253] Unclosed ResolvingFileIO instance created by:
org.apache.iceberg.io.ResolvingFileIO.<init>(ResolvingFileIO.java:77)
com.starrocks.connector.iceberg.io.IcebergCachingFileIO.initialize(IcebergCachingFileIO.java:109)
org.apache.iceberg.CatalogUtil.loadFileIO(CatalogUtil.java:325)
org.apache.iceberg.aws.glue.GlueTableOperations.initializeFileIO(GlueTableOperations.java:223)
org.apache.iceberg.aws.glue.GlueTableOperations.io(GlueTableOperations.java:115)
org.apache.iceberg.aws.glue.GlueCatalog.newTableOps(GlueCatalog.java:246)
org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:46)
com.starrocks.connector.iceberg.glue.IcebergGlueCatalog.getTable(IcebergGlueCatalog.java:77)
com.starrocks.connector.iceberg.CachingIcebergCatalog.getTable(CachingIcebergCatalog.java:96)
com.starrocks.connector.iceberg.IcebergMetadata.getTable(IcebergMetadata.java:244)
com.starrocks.connector.CatalogConnectorMetadata.getTable(CatalogConnectorMetadata.java:117)
com.starrocks.server.MetadataMgr.lambda$getTable$3(MetadataMgr.java:325)
java.base/java.util.Optional.map(Optional.java:260)
com.starrocks.server.MetadataMgr.getTable(MetadataMgr.java:325)
com.starrocks.sql.ast.ShowAnalyzeJobStmt.showAnalyzeJobs(ShowAnalyzeJobStmt.java:83)
com.starrocks.qe.ShowExecutor.handleShowAnalyzeJob(ShowExecutor.java:2599)
com.starrocks.qe.ShowExecutor.execute(ShowExecutor.java:385)
com.starrocks.qe.StmtExecutor.handleShow(StmtExecutor.java:1551)
com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:667)
com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:413)
com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:608)
com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:915)
com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:833)
Why do you set "iceberg_table_cache_ttl_sec" = "60"? It will evict the table cache too soon.
@DorianZheng I want to see fresh data every minute; will iceberg_meta_cache_ttl_sec alone be enough?
Anyway, I think that just uncovers the underlying issue.
I upgraded to 3.3 and now I can't get results for SHOW ANALYZE JOB at all.
I see errors in the FE log like 'slow edit log write', and my FE node sits in full GC.
I tried running SHOW ANALYZE JOB for a non-existent db, to see if I would hit https://github.com/StarRocks/starrocks/blob/main/fe/fe-core/src/main/java/com/starrocks/sql/ast/ShowAnalyzeJobStmt.java#L76 early, but no: it still tries to refresh the whole catalog, does a bunch of FE log writes, and hangs.
@DorianZheng So I've verified again that I do have the Iceberg caching catalog; even if I run SHOW ANALYZE JOB right after StarRocks has refreshed the Iceberg metadata, it refreshes it again.
I've set ADMIN SET FRONTEND CONFIG ("catalog_metadata_cache_size" = "5000") and now I can at least run SHOW ANALYZE JOB to completion, but I don't understand why it invalidates the catalog in the first place. I mean, all it does is get the table name and db name, and SHOW TABLES, which needs the same kind of information, runs instantly.
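For reference, a sketch of that FE-side change plus a way to check it (the LIKE pattern is just illustrative; note that ADMIN SET FRONTEND CONFIG takes effect at runtime only and, as far as I know, is not persisted across an FE restart unless the value is also added to fe.conf):

-- raise the connector metadata cache capacity at runtime
ADMIN SET FRONTEND CONFIG ("catalog_metadata_cache_size" = "5000");

-- verify the current value
ADMIN SHOW FRONTEND CONFIG LIKE "%catalog_metadata_cache_size%";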
Okay, I've recreated the catalog with the default iceberg_table_cache_ttl_sec and the second run is now instant.
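A sketch of the recreated catalog, assuming the same placeholder connection settings as in the earlier sketch, with iceberg_table_cache_ttl_sec simply omitted so it falls back to its default:

DROP CATALOG iceberg_glue;

CREATE EXTERNAL CATALOG iceberg_glue
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "glue",
    "aws.glue.use_instance_profile" = "true",
    "aws.glue.region" = "us-east-1",
    "aws.s3.use_instance_profile" = "true",
    "aws.s3.region" = "us-east-1",
    "enable_iceberg_metadata_cache" = "true",
    -- iceberg_table_cache_ttl_sec omitted: the table cache keeps its default TTL;
    -- only the metadata-file cache is still refreshed every 60 seconds
    "iceberg_meta_cache_ttl_sec" = "60"
);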
Yeah, the problem is that I now scan a cached snapshot during queries.
Okay: if I cache all tables, then it runs perfectly, but now my cache is leaking; if I don't cache, then SHOW ANALYZE JOB does IO and is slow.
Why do you think your cache is leaking? The table cache is at the catalog level; the evicted cache is only at the query level. So what's your current issue: is the query still slow with the metadata cache enabled?
My problem is that I'm using the Glue catalog and collect stats on all tables. When I run SHOW ANALYZE JOB or SHOW ANALYZE STATUS, it refreshes all the tables, just to print the table name, which it already knows from the job!
It takes minutes to get any result while it refreshes all the metadata; I get warnings from the ResolvingFileIO finalizer and I get GC pressure. Can't live like this.
We have ~500 tables in the Iceberg catalog and every SHOW ANALYZE JOB takes a minute or two while StarRocks refreshes the tables' metadata.