housepower / ClickHouse-Native-JDBC

ClickHouse Native Protocol JDBC implementation
https://housepower.github.io/ClickHouse-Native-JDBC/
Apache License 2.0

PySpark hanging using this connector #376

Open ruiyang2015 opened 2 years ago

ruiyang2015 commented 2 years ago

The official driver seems to work. I am trying to switch to the native JDBC driver and read through PySpark, but the read hangs without showing any error message. The only output is the following log:

I1009 02:53:09.454974 46 jdbc.py:161] open connection {self._host} {self._database} {self._port}
log4j:WARN No appenders could be found for logger (com.github.housepower.jdbc.ClickhouseJdbcUrlParser).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Here is the PySpark snippet:

   pgDF = (
       spark.read.format('jdbc')
       .option('driver', 'com.github.housepower.jdbc.ClickHouseDriver')
       .option('url', 'jdbc:clickhouse://127.0.0.1:9000')
       .option('user', user)
       .option('password', password)
       .option('dbtable', dbtable)
       .load()
   )

I also tried connecting to the database with the Python JayDeBeApi package, and that hangs as well.

The same code works with the official driver, so I wonder whether I am missing something here. The official driver is rather large, while this one is quite small and does not require as many dependencies. If it works, I would really like to switch to this driver.
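One way to narrow down a silent hang like this is to first check, outside of JDBC and Spark, whether the native TCP port accepts connections from the machine running the Spark driver. The following is only a minimal sketch: `can_connect` is a hypothetical helper name, and the host and port are taken from the URL in the snippet above. A successful connection does not prove that the JDBC handshake will succeed, only that the port is reachable.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call, assuming the host and port from the JDBC URL above:
# can_connect("127.0.0.1", 9000)
```

If this returns False, the hang is more likely a network or port issue than a driver issue.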

pan3793 commented 2 years ago

Sorry, I am not familiar with PySpark.

the official is a bit too big in size

The official driver also provides shaded jars

https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc/0.3.1-patch/clickhouse-jdbc-0.3.1-patch-shaded.jar

<dependency>
    <groupId>ru.yandex.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.3.1-patch</version>
    <classifier>shaded</classifier>
</dependency>
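
For PySpark, the shaded jar could be wired in roughly as follows. This is a sketch under assumptions, not a tested setup: `official_jdbc_options` is a hypothetical helper, the driver class `ru.yandex.clickhouse.ClickHouseDriver` is the one shipped in the 0.3.x official artifact, and port 8123 is assumed because the official driver talks to the ClickHouse HTTP interface rather than the native port 9000.

```python
def official_jdbc_options(host, port, user, password, dbtable):
    """Build the option dict for spark.read.format('jdbc') with the official driver."""
    return {
        # Driver class of the ru.yandex.clickhouse artifact (assumed for 0.3.1-patch).
        "driver": "ru.yandex.clickhouse.ClickHouseDriver",
        # The official driver uses the HTTP interface, so pass the HTTP port here.
        "url": f"jdbc:clickhouse://{host}:{port}",
        "user": user,
        "password": password,
        "dbtable": dbtable,
    }


# Possible usage, assuming the shaded jar was downloaded locally and an
# existing SparkSession named `spark` was started with the jar on its classpath,
# e.g. via the `spark.jars` configuration:
#
# opts = official_jdbc_options("127.0.0.1", 8123, user, password, dbtable)
# df = spark.read.format("jdbc").options(**opts).load()
```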

ruiyang2015 commented 2 years ago

Thanks, I will give that a try.