apache / drill

Apache Drill is a distributed MPP query layer for self describing data
https://drill.apache.org/
Apache License 2.0

DRILL-8436: Upgrade Hadoop 3.2.4 → 3.3.6 #2821

Status: Closed (jnturton closed this 1 year ago)

jnturton commented 1 year ago

Description

Hadoop is upgraded to 3.3.6.

Documentation

N/A

Testing

Existing unit tests, plus manual testing of the Drill HTTP services and the Drill JDBC driver.

Rebased onto #2823.

jnturton commented 1 year ago

So, this seems to work but not in JDK 8 😒

pjfanning commented 1 year ago

> So, this seems to work but not in JDK 8 😒

In JDK 8, it can't find io/netty/handler/codec/http/HttpRequest. Maybe we need to add an explicit dependency on the io.netty:netty-codec-http jar.
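One way to confirm whether the class is actually visible at runtime, before and after adding the dependency, is a small classpath probe. This is a minimal sketch, not part of the PR: the `ClasspathProbe` class and its method names are hypothetical, and it only reports whether the probing class's own class loader can resolve the name.

```java
// Hypothetical diagnostic helper; not part of Drill or Hadoop.
final class ClasspathProbe {

    /** Converts a JVM internal name (slashes) to a binary class name (dots). */
    static String toClassName(String internalName) {
        return internalName.replace('/', '.');
    }

    /** Returns true if the class can be resolved without initializing it. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised by missing transitive classes.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = toClassName("io/netty/handler/codec/http/HttpRequest");
        System.out.println(name + " present: " + isPresent(name));
    }
}
```

Running this with the same classpath as the failing JDK 8 build would show whether netty-codec-http is missing outright or merely not visible to a particular class loader.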

jnturton commented 1 year ago

I've just set this PR to Draft because I rediscovered a problem in the Drill JDBC driver. I'll paste a chat message I sent to @vvysotskyi a few months back below, to reveal the nature of the problem. I'm sure it's ultimately fixable, but I don't know of an elegant fix yet.

> Hi Vova! I decided to try upgrading Drill's Hadoop libs to 3.3.5. Things are working, but there is a problem in the Drill JDBC fat jar. There, the shade plugin relocates Hadoop under oadd as usual, but core-default.xml in hadoop-common.jar now contains class names that the shade plugin does not update. The result is that the JDBC driver is broken. While the shade plugin can update some kinds of text config files, it doesn't appear to be able to update arbitrary XML config like core-default.xml. I thought of including our own manually updated copy of core-default.xml in exec/jdbc-all/src/resources and trying to make sure the shade plugin picks that one instead of the one in hadoop-common.jar. My only reservation is that introducing this copy creates a maintenance burden for the future, so I thought to ask you if you have any ideas...

jnturton commented 1 year ago

I've got the JDBC driver working by bundling a core-site.xml file in it that handles the relocation of org.apache.hadoop to oadd.org.apache.hadoop.
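The thread does not say which properties the bundled core-site.xml overrides. As a rough illustration of the idea only, the sketch below takes a few sample defaults whose values name Hadoop classes and produces override entries with the relocated `oadd.` prefix. The `RelocatedOverrides` class, its methods, and the choice of sample properties are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; illustrates the relocation mapping, not the PR's actual file.
final class RelocatedOverrides {
    static final String ORIGINAL_PREFIX = "org.apache.hadoop.";
    static final String SHADE_PREFIX = "oadd.";

    /** Prefixes every comma-separated entry that names an org.apache.hadoop class. */
    static String relocate(String value) {
        String[] parts = value.split(",");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i].trim();
            if (i > 0) {
                out.append(',');
            }
            out.append(part.startsWith(ORIGINAL_PREFIX) ? SHADE_PREFIX + part : part);
        }
        return out.toString();
    }

    /** Keeps only the properties whose value changes under relocation. */
    static Map<String, String> overridesFor(Map<String, String> defaults) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : defaults.entrySet()) {
            String relocated = relocate(entry.getValue());
            if (!relocated.equals(entry.getValue())) {
                overrides.put(entry.getKey(), relocated);
            }
        }
        return overrides;
    }

    /** Renders one override in the Hadoop configuration XML property layout. */
    static String toPropertyXml(String name, String value) {
        return "<property>\n  <name>" + name + "</name>\n  <value>" + value + "</value>\n</property>";
    }

    public static void main(String[] args) {
        // Sample entries chosen for illustration; not a complete list.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("fs.AbstractFileSystem.file.impl", "org.apache.hadoop.fs.local.LocalFs");
        defaults.put("hadoop.security.group.mapping",
                "org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback");
        defaults.put("io.file.buffer.size", "4096");

        overridesFor(defaults).forEach((name, value) -> System.out.println(toPropertyXml(name, value)));
    }
}
```

Because core-site.xml is loaded after core-default.xml, entries like these take precedence over the unrelocated defaults, which avoids maintaining a full edited copy of core-default.xml.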

cgivre commented 1 year ago

I thought I was reviewing the other PR for the library updates. Could we rebase this on master once that has been merged?

jnturton commented 1 year ago

> I thought I was reviewing the other PR for the library updates. Could we rebase this on master once that has been merged?

Done.