cloudera / hue

Open source SQL Query Assistant service for Databases/Warehouses
https://cloudera.com
Apache License 2.0

Spark-SQL - Hiveserver2 interface does not show tables as prompt #2973

Closed divincode closed 1 year ago

divincode commented 2 years ago

Is the issue already present in https://github.com/cloudera/hue/issues or discussed in the forum https://discourse.gethue.com? No

Describe the bug: When using Spark SQL with the HiveServer2 interface, we see the following error:

[screenshot: sparksql-livy]

whereas with the Livy interface the tables are listed:

[screenshot: spark_sql-hive]

Steps to reproduce it? Steps are simple.

Hue version or source? (e.g. open source 4.5, CDH 5.16, CDP 1.0...). System info (e.g. OS, Browser...). 4.10

divincode commented 2 years ago

I went deep into the code; this is the error it shows:

RCA -

    Traceback (most recent call last):
      File "/usr/lib/hue/apps/beeswax/src/beeswax/api.py", line 129, in _autocomplete
        tables_meta = db.get_tables_meta(database=database)
      File "/usr/lib/hue/apps/beeswax/src/beeswax/server/dbms.py", line 381, in get_tables_meta
        tables = self._get_tables_via_sparksql(database, identifier)
      File "/usr/lib/hue/apps/beeswax/src/beeswax/server/dbms.py", line 432, in _get_tables_via_sparksql
        for row in result.rows()

For context, the function call chain is:

API call /api/editor/autocomplete/(database_name)
-> def _autocomplete(db, database) (beeswax/src/beeswax/api.py)
-> def get_tables_meta() (beeswax/src/beeswax/server/dbms.py)
-> def _get_tables_via_sparksql() (beeswax/src/beeswax/server/dbms.py), which is where the error is raised
-> def fetch() (beeswax/src/beeswax/server/dbms.py)
-> def fetch() (beeswax/src/beeswax/server/hive_server2_lib.py), which wraps what it returns in the ResultCompatible class

The last function in the trace, def fetch_data() (beeswax/src/beeswax/server/hive_server2_lib.py), returns the results and schema wrapped in the HiveServerDataTable class. The raw response is:

TFetchResultsResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), results=TRowSet(columnCount=None, binaryColumns=None, rows=[], columns=[TColumn(i32Val=None, byteVal=None, i16Val=None, i64Val=None, stringVal=TStringColumn(nulls='\x00', values=['call_center', 'catalog_page', 'catalog_returns', 'catalog_sales', 'customer', 'customer_address', 'customer_demographics', 'date_dim', 'household_demographics', 'income_band', 'inventory', 'item', 'promotion', 'reason', 'ship_mode', 'store', 'store_returns', 'store_sales', 'time_dim', 'warehouse', 'web_page', 'web_returns', 'web_sales', 'web_site']), boolVal=None, doubleVal=None, binaryVal=None)], startRowOffset=0), hasMoreRows=False)

As we can see in values, the table names do come back (e.g. 'catalog_page', 'catalog_returns', 'catalog_sales', 'customer', 'customer_address', 'customer_demographics'), but the multiple wrappers around the result seem to be implemented incorrectly.
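To make the shape of that response concrete, here is a minimal sketch of how a column-oriented result turns into rows. It uses plain Python lists rather than the Thrift classes, and `columns_to_rows` is a hypothetical helper for illustration, not a Hue function. Since the response carries a single string column, each resulting row has exactly one element.

```python
def columns_to_rows(columns):
    """Transpose column-oriented values into a list of row lists.

    Hypothetical helper: `columns` is a list of equal-length value lists,
    standing in for the `values` of each column in the response.
    """
    return [list(row) for row in zip(*columns)]


# One string column, shortened from the response above.
columns = [['call_center', 'catalog_page', 'catalog_returns']]

rows = columns_to_rows(columns)
for row in rows:
    # Every row has a single element: the table name at index 0.
    print(len(row), row[0])
```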

The main part of def _get_tables_via_sparksql() (beeswax/src/beeswax/server/dbms.py) is:

    hql = "SHOW TABLES IN %s" % database
    query = hql_query(hql)
    handle = self.execute_and_wait(query, timeout_sec=timeout)
    result = self.fetch(handle, rows=5000)
    return [{
        'name': row[1],
        'type': 'VIEW' if row[2] else 'TABLE',
        'comment': ''
      } for row in result.rows()
    ]

Now, result is a beeswax.server.hive_server2_lib.ResultCompatible object, and .rows() is a generator function bound to the inner class HiveServerDataTable. That's why we get IndexError: list index out of range.
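The failure can be reproduced outside Hue with a short sketch. It assumes that each row yielded by .rows() is a one-element list, which is what the single column in the response suggests but is not confirmed here. `original_style` mirrors the indexing in the snippet above, and `tables_from_rows` is a hypothetical defensive variant; neither is Hue code, and this is not necessarily the change made in the pull request.

```python
def original_style(rows):
    """Mirror the indexing used in the snippet above (row[1], row[2])."""
    return [{
        'name': row[1],
        'type': 'VIEW' if row[2] else 'TABLE',
        'comment': ''
    } for row in rows]


def tables_from_rows(rows):
    """Hypothetical variant that tolerates one-column rows."""
    tables = []
    for row in rows:
        if len(row) >= 3:
            # Wide shape: same positions the original snippet reads.
            name, is_view = row[1], bool(row[2])
        else:
            # Assumed narrow shape: only the table name is present.
            name, is_view = row[0], False
        tables.append({
            'name': name,
            'type': 'VIEW' if is_view else 'TABLE',
            'comment': ''
        })
    return tables


one_column_rows = [['call_center'], ['catalog_page']]

try:
    original_style(one_column_rows)
    error = None
except IndexError as exc:
    error = str(exc)

print(error)
print(tables_from_rows(one_column_rows))
```

With one-column rows the original indexing raises IndexError on row[1], while the defensive variant falls back to row[0] and reports every entry as a TABLE, since the narrow shape carries no view flag.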

divincode commented 2 years ago

@Harshg999 any suggestions?

divincode commented 2 years ago

Created a PR for this: https://github.com/cloudera/hue/pull/3012

divincode commented 2 years ago

@romainr @Harshg999 Please advise.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity and is not "roadmap" labeled or part of any milestone. Remove stale label or comment or this will be closed in 5 days.

Harshg999 commented 1 year ago

Hi @divincode, the PR with your changes is closed. Will you be reopening it and taking this issue up?
