Closed ebessah closed 4 years ago
Thanks for the report! There is an if check
in the latest master code, https://github.com/cloudera/hue/blob/master/apps/beeswax/gen-py/TCLIService/ttypes.py#L4457
so the line to update would be more like:
oprot.writeString(self.statement)
and you are proposing?
oprot.writeString(self.statement.encode('utf-8'))?
The py-hive generated Thrift code can conflict with the Hue one, so the above clean-up is needed, but we should find a proper way to avoid this.
Using the latest master code and looking at the problem again, there seems to be a discrepancy between the py-hive generated Thrift code, built with the current Thrift compiler (0.13.0), and the one Hue maintains, generated with 0.9.3.
Is there any reason why Hive is still maintaining an older version of the Thrift-generated py-hive code?
Thrift version 0.9.3 compiler:
    if self.statement is not None:
        oprot.writeFieldBegin('statement', TType.STRING, 2)
        if sys.version_info[0] > 2:
            oprot.writeBinary(self.statement)
        else:
            oprot.writeString(self.statement)  # Line 4457
        oprot.writeFieldEnd()
Thrift version 0.13.0 compiler:
    if self.statement is not None:
        oprot.writeFieldBegin('statement', TType.STRING, 2)
        oprot.writeString(self.statement.encode('utf-8') if sys.version_info[0] == 2 else self.statement)  # Line 3967
        oprot.writeFieldEnd()
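To make the difference between the two generated versions concrete, here is a minimal sketch of the idea behind the 0.13.0 line: text is encoded to UTF-8 bytes before it reaches a writer that only accepts bytes. The names `to_wire_bytes` and `BytesOnlyTransport` are hypothetical and are not part of Thrift, Hue, or PyHive; the transport is only a stand-in for whatever sits underneath `oprot`.

```python
def to_wire_bytes(statement):
    """Return UTF-8 bytes for a statement given either as text or as bytes."""
    if isinstance(statement, bytes):
        # Already a byte string (a plain `str` on Python 2 is also `bytes`).
        return statement
    # Text (`unicode` on Python 2, `str` on Python 3) is encoded explicitly.
    return statement.encode('utf-8')


class BytesOnlyTransport(object):
    """Hypothetical stand-in for a transport that only accepts byte buffers."""

    def __init__(self):
        self.chunks = []

    def write(self, buf):
        # Reject text, roughly the way a buffer-based writer would.
        if not isinstance(buf, bytes):
            raise TypeError('expected bytes, got %s' % type(buf).__name__)
        self.chunks.append(buf)


transport = BytesOnlyTransport()
# Encoding first means a unicode statement is accepted by the writer.
transport.write(to_wire_bytes(u'SELECT 1'))
```

Passing the unicode statement to `write` without `to_wire_bytes` raises a `TypeError` in this sketch, which is the same class of failure as the one reported here.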
Hmm, indeed. IIRC no, 0.9 is just the default one that comes via the Ubuntu packages, but it is very old now. Feel free to send a PR with the recompile with 0.13, or I can send one tomorrow!
Perfect! Will do. Thanks for the response.
When connecting to Databricks clusters from Hue using the SQLAlchemy interface and the Hive connector, we received:
TypeError: 'unicode' does not have the buffer interface
After some days of debugging, we realised that the beeswax application, which is installed and configured as part of Hue and lets you run queries on Apache Hive, ships its own autogenerated Thrift Python code for integration with HiveServer2. Because we were using the Hive connector, any time Hue was about to establish a connection to Databricks or run a statement, this custom Thrift library tried to encode the SQL statement, which failed with the TypeError as below:
We understood clearly that line 4460 in /usr/share/hue/apps/beeswax/gen-py/TCLIService/ttypes.py did not properly handle the encoding of unicode strings when writing from string to binary under Python 2.7.
Line 4460 in /usr/share/hue/apps/beeswax/gen-py/TCLIService/ttypes.py:
Instead of:
To work around this problem, we upgraded the pip version that ships with Hue and then installed the databricks-dbapi[sqlalchemy] package, which pulls in its dependencies, including a compatible, updated Thrift library that is able to handle unicode encoding:
We then removed the native Hue Thrift library so that the connection falls back to our newly installed Thrift library.
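After a swap like this, it can help to confirm which copy of a module Python actually picks up. The following is only a sketch: `module_origin` is a hypothetical helper, and the module names to pass to it (for example `thrift` or `TCLIService`) are assumptions about what is importable in a given Hue environment.

```python
import importlib


def module_origin(name):
    """Return the file a module is loaded from, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Built-in modules have no __file__, so fall back to None for those too.
    return getattr(module, '__file__', None)


# Example: print where each candidate module resolves from in this environment.
for candidate in ('thrift', 'TCLIService'):
    print(candidate, module_origin(candidate))
```

If the printed path still points into Hue's gen-py directory, the native copy is still on the import path.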
The complete Dockerfile looks like this:
What our hue.ini config looked like:
So we used the hive interpreter in the Hue config, which PyHive extends. This is what the databricks+pyhive dialect/driver, which comes with installing databricks-dbapi, uses with SQLAlchemy to establish a connection to Databricks.