druid-io / pydruid

A Python connector for Druid
Other
505 stars 194 forks source link

"do_ping" broken in last pydruid release, can't add a Druid as a new Database in Superset, need a pydruid release #294

Closed kyhtsang closed 1 year ago

kyhtsang commented 1 year ago

Trying to add Druid as a new database in Superset results in an error:

ERROR: (builtins.NoneType) None

This is caused by https://github.com/druid-io/pydruid/pull/284, which got released along with pydruid 0.6.3.

This got fixed in https://github.com/druid-io/pydruid/pull/292, but there needs to be a release of pydruid before Superset will pick it up. Can a release containing this fix be made and published to https://pypi.org/project/pydruid/?

dclim commented 1 year ago

@mistercrunch Is this something you could help with? Every release in the past 4 years was done by you, so I'm hoping you may know something about how to do this :)

betodealmeida commented 1 year ago

I can make a new release, I'll work on it.

dclim commented 1 year ago

Sweet, thanks @betodealmeida !

gianm commented 1 year ago

It looks like the change from #292 (removing a text wrapper) was reverted in #290 (which re-added the wrapper). Don't be fooled by the PR numbers: 290 was merged after 292.

I'm not sure which one is correct? I am not familiar with SQLAlchemy. But, the current code in master, & in 0.6.4, does have the text wrapper. So if it's correct to not have it, then we'd want to remove it from master first.

Usiel commented 1 year ago

I think the removal of the text wrapper is correct. For me the connection checkout from a pool is broken with the latest change from #290:

from sqlalchemy import create_engine

import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.pool').setLevel(logging.INFO)

engine = create_engine(
    'druid://localhost: 8082/druid/v2/sql',
    pool_pre_ping=True,
    # everything below here is to force sqlalchemy to execute a pre-ping
    pool_size=1,
    max_overflow=0
)
engine.connect().close()
# pre-ping should be triggered for this checkout
engine.connect()

Output indicates that the ping fails, while it succeeds without the text wrapper:

INFO:sqlalchemy.pool.impl.QueuePool:Disconnection detected on checkout, invalidating all pooled connections prior to current timestamp (reason: InvalidatePoolError())
INFO:sqlalchemy.pool.impl.QueuePool:Invalidate connection <pydruid.db.api.Connection object at 0x10e52da90> (reason: InvalidatePoolError:())

@betodealmeida is there a different use-case that is broken by #292 and fixed by #290?

nytai commented 1 year ago

Second this issue. Looking at this PR it appears that the fix in #290 is to address usage with sqlalchemy 1.4, but it breaks when using SQLAlchemy==1.3.24. A fix would probably be to check the version of sqlalchemy and add the text wrapper based on that.

Usiel commented 1 year ago

I executed the manual test above (see my previous comment) with both SQLAlchemy<1.4.0 and ==1.4.0. In both cases I get the same results: With the text wrapper the connection fails, without the text wrapper it succeeds.

Similarly on Superset with SQLAlchemy>=1.4.0 the connection test fails for Druid DBs.

So imo reverting #290 should do the trick.

betodealmeida commented 1 year ago

Working on this.

betodealmeida commented 1 year ago

Fixed in #295, will work on a new patch release.

betodealmeida commented 1 year ago

This should be fixed in 0.6.5, just out.