druid-io / pydruid

A Python connector for Druid

sqlalchemy.exc.ResourceClosedError: result object does not return rows #254

Open pedro93 opened 3 years ago

pedro93 commented 3 years ago

Hello,

When I use pydruid via SQLAlchemy to list table names on a local Druid quickstart installation, I get the following exception:

  File "/home/pedro/dev/forks/datahub/metadata-ingestion/src/datahub/ingestion/source/sql_common.py", line 206, in get_workunits
    for table in inspector.get_table_names(schema):
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 266, in get_table_names
    return self.dialect.get_table_names(
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/pydruid/db/sqlalchemy.py", line 152, in get_table_names
    return [row.TABLE_NAME for row in result]
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 938, in __iter__
    return self._iter_impl()
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 638, in _iter_impl
    return self._iterator_getter(self)
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 1162, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 361, in _iterator_getter
    make_row = self._row_getter
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 1162, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 320, in _row_getter
    keymap = metadata._keymap
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/cursor.py", line 1197, in _keymap
    self._we_dont_return_rows()
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/cursor.py", line 1178, in _we_dont_return_rows
    util.raise_(
  File "/home/pedro/dev/forks/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 198, in raise_
    raise exception
sqlalchemy.exc.ResourceClosedError: This result object does not return rows. It has been closed automatically.

This is the code snippet I'm running. It connects to Druid with the URL sql_alchemy_url=druid://localhost:8082/druid/v2/sql/:

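        # Excerpt from a method; create_engine and reflection come from SQLAlchemy,
        # while url, sql_config and self.report are defined elsewhere in the caller.
        engine = create_engine(url, **sql_config.options)
        inspector = reflection.Inspector.from_engine(engine)
        for schema in inspector.get_schema_names():
            print("Processing " + schema)
            # Skip schemas excluded by the configured pattern.
            if not sql_config.schema_pattern.allowed(schema):
                self.report.report_dropped(schema)
                continue

            # The exception is raised by this get_table_names() call.
            for table in inspector.get_table_names(schema):
                schema, table = sql_config.standardize_schema_table_names(schema, table)
                dataset_name = sql_config.get_identifier(schema, table)
                self.report.report_table_scanned(dataset_name)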

Essentially, the code connects to Druid and uses fine-grained SQLAlchemy reflection to get the schemas and the tables associated with each one.

It seems that pydruid's implementation of get_table_names is not working as expected.
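One way to narrow this down is to send the table-listing query straight to Druid's SQL endpoint over HTTP, bypassing both pydruid and SQLAlchemy. The following is a minimal sketch using only the standard library. The endpoint is taken from the connection URL above, the query text is an assumption based on the `row.TABLE_NAME` access in the traceback (it may not match what pydruid actually sends), and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint, taken from the connection URL in this report.
DRUID_SQL_URL = "http://localhost:8082/druid/v2/sql/"


def build_table_query(schema):
    """Return SQL listing the table names in one schema (hypothetical helper)."""
    # Double any single quotes so the schema name stays inside the string literal.
    safe_schema = schema.replace("'", "''")
    return (
        "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
        f"WHERE TABLE_SCHEMA = '{safe_schema}'"
    )


def build_request(schema, url=DRUID_SQL_URL):
    """Build (but do not send) the POST request for Druid's SQL endpoint."""
    body = json.dumps({"query": build_table_query(schema)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_tables(schema, url=DRUID_SQL_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(schema, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `list_tables(schema)` for each schema name printed by the loop above requires the local Druid process to be running. If some schema comes back with an empty list, that would point to the zero-row case as the trigger rather than to a connection problem.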

Is this a known bug or implementation detail?
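For context on what the error means: the last frames of the traceback (`_keymap`, then `_we_dont_return_rows`) indicate that SQLAlchemy found no column metadata on the result. In DB-API (PEP 249) terms, that corresponds to a cursor whose `description` attribute is `None`. The sketch below uses the standard-library `sqlite3` driver purely as a stand-in to show that condition. Whether pydruid's cursor leaves `description` unset in this case (for example, when a schema contains no tables) is an assumption that this report does not confirm, and `returns_rows` is a hypothetical helper.

```python
import sqlite3


def returns_rows(cursor):
    """Hypothetical helper: True if the DB-API cursor describes result columns."""
    return cursor.description is not None


conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A DDL statement produces no result set, so `description` is None.
cur.execute("CREATE TABLE tables_demo (TABLE_NAME TEXT)")
ddl_returns_rows = returns_rows(cur)

# A SELECT that matches zero rows still describes its columns.
cur.execute("SELECT TABLE_NAME FROM tables_demo")
empty_select_returns_rows = returns_rows(cur)
column_names = [col[0] for col in cur.description]

conn.close()
```

With `sqlite3`, an empty SELECT still reports its columns, so iterating over it yields zero rows instead of raising. A driver that only fills in `description` once a first row arrives would look, to SQLAlchemy, like a statement that does not return rows at all.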