apache / carbondata

High performance data store solution
carbondata.apache.org
Apache License 2.0

[SDK PERFORMANCE] The execution of the show tables command takes a long time. #4295

Open wxp0329 opened 1 year ago

wxp0329 commented 1 year ago

image

As shown in the figure above, CarbonShowTablesCommand fetches metadata from the metastore separately for each table. With 180,000 tables, running the show tables command in the spark-sql shell currently takes about 1 hour, which needs to be optimized. When the filter function is not invoked, the same command returns the 180,000 tables in about 12 seconds, as shown in the following figure. image

kevinjmh commented 1 year ago

Maybe we can try to get the tables' info in one batch instead of one by one:

  def getTable(db: String, table: String): CatalogTable

  def getTablesByName(db: String, tables: Seq[String]): Seq[CatalogTable]
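
To make the difference concrete, here is a minimal sketch of the two access patterns. It is not CarbonData or Spark code: `TableMeta`, `Metastore`, `CountingMetastore`, and the `batchSize` parameter are hypothetical stand-ins that only mirror the shape of the two signatures above, and the in-memory store simply counts how many round trips each approach makes.

```scala
// Hypothetical stand-in for CatalogTable; not a Spark or CarbonData class.
final case class TableMeta(db: String, name: String, provider: String)

// Hypothetical interface mirroring the two signatures quoted above.
trait Metastore {
  def getTable(db: String, table: String): TableMeta
  def getTablesByName(db: String, tables: Seq[String]): Seq[TableMeta]
}

// In-memory metastore that counts calls, standing in for remote round trips.
final class CountingMetastore(data: Map[(String, String), TableMeta]) extends Metastore {
  var calls: Int = 0

  def getTable(db: String, table: String): TableMeta = {
    calls += 1
    data((db, table)) // throws NoSuchElementException if the table is missing
  }

  def getTablesByName(db: String, tables: Seq[String]): Seq[TableMeta] = {
    calls += 1
    tables.flatMap(t => data.get((db, t))) // assumption: missing tables are skipped
  }
}

object BatchLookupSketch {
  // One metastore call per table: N tables cost N round trips.
  def oneByOne(ms: Metastore, db: String, names: Seq[String]): Seq[TableMeta] =
    names.map(n => ms.getTable(db, n))

  // One metastore call per chunk: N tables cost ceil(N / batchSize) round trips.
  def batched(ms: Metastore, db: String, names: Seq[String], batchSize: Int): Seq[TableMeta] =
    names.grouped(batchSize).flatMap(chunk => ms.getTablesByName(db, chunk)).toVector

  def main(args: Array[String]): Unit = {
    val names = (1 to 1000).map(i => s"t$i")
    val data = names.map(n => ("default", n) -> TableMeta("default", n, "carbondata")).toMap

    val single = new CountingMetastore(data)
    val batch = new CountingMetastore(data)
    val r1 = oneByOne(single, "default", names)
    val r2 = batched(batch, "default", names, 200)

    println(s"one by one: ${single.calls} calls, batched: ${batch.calls} calls, same result: ${r1 == r2}")
  }
}
```

Chunking the name list rather than sending all 180,000 names in a single request is a design choice of this sketch, intended to keep each request bounded; whether the real metastore needs that, and what batch size suits it, would have to be measured.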

wxp0329 commented 1 year ago

Hi, when will this problem be fixed, and in which version?