Open AntonOvsyannikov opened 1 year ago
The following snippet (assuming hive.thriftserver running on 10000 port)
engine = create_engine('hive://localhost:10000/default') metadata = MetaData(bind=engine) metadata.reflect()
causes
sqlalchemy.exc.OperationalError: (pyhive.exc.OperationalError) TExecuteStatementResp(status=TStatus(statusCode=3, ... errorMessage="Error running query: org.apache.spark.sql.AnalysisException: Table or view not found: default; line 1 pos 9;\n'DescribeRelation false, [col_name#7203, data_type#7204, comment#7205]\n+- 'UnresolvedTableOrView [default], DESCRIBE TABLE, true\n"), operationHandle=None) [SQL: DESCRIBE default]
This is because of possibly incorrect implementation of get_table_names in sqlalchemy_hive.py::376 (v0.7.1.dev0)
get_table_names
sqlalchemy_hive.py::376
def get_table_names(self, connection, schema=None, **kw): query = 'SHOW TABLES' ... return [row[0] for row in connection.execute(text(query))]
while in row[0] in spark sql is schema name
SHOW TABLES; +-----------+------------+--------------+--+ | database | tableName | isTemporary | +-----------+------------+--------------+--+ | default | sam | false | | default | sam1 | false | | default | suj | false | +-----------+------------+--------------+--+
so proposed fix is
def get_table_names(self, connection, schema=None, **kw): query = 'SHOW TABLES' if schema: query += ' IN ' + self.identifier_preparer.quote_identifier(schema) return [row[1] for row in connection.execute(text(query))]
Same issue here with pyhive 7.0 and sqlalchemy 2.0.
The following snippet (assuming hive.thriftserver running on 10000 port)
causes
This is because of possibly incorrect implementation of
get_table_names
insqlalchemy_hive.py::376
(v0.7.1.dev0)while in row[0] in spark sql is schema name
so proposed fix is