relferreira / metabase-sparksql-databricks-driver

GNU Affero General Public License v3.0

Build driver and include in Metabase 0.46 Docker image #19

Closed bjgbeelen closed 1 year ago

bjgbeelen commented 1 year ago

Related to #18 and #17

This updates the Dockerfile to cope with new Metabase 0.46 build processes.

This should only be merged after merging #17

I couldn't quickly figure out how to make the unit tests work and don't have more time right now, so I thought I'd at least provide this change so that we have a build process for >= 0.46.

I did test it manually, though (which led to a change request in #18).

bjgbeelen commented 1 year ago

@relferreira could you check this one, please? I just merged #18, but this PR is a companion needed to make the Docker build succeed again.

jurasan commented 1 year ago

Is FixedSparkDriver still needed? We had a problem when we used this driver together with the Redshift driver on one Metabase instance. Metabase connection worked, but Redshift couldn't connect anymore. After deleting FixedSparkDriver and just using com.databricks.client.jdbc.Driver, everything works.

leopasta-enable commented 1 year ago

Is that working for everyone? I'm getting the following error when trying:

17 202.2 Step failed: Syntax error compiling at (metabase/driver/hive_like.clj:151:1).
17 202.4 {:via
17 202.4 [{:type clojure.lang.Compiler$CompilerException,
17 202.4 :message "Syntax error compiling at (metabase/driver/hive_like.clj:151:1).",
17 202.4 :data #:clojure.error{:phase :compile-syntax-check, :line 151, :column 1, :source "metabase/driver/hive_like.clj"},
17 202.4 :at [clojure.lang.Compiler analyze "Compiler.java" 6825]}
17 202.4 {:type java.lang.RuntimeException,
17 202.4 :message "No such var: sql.qp/field->identifier",
17 202.4 :at [clojure.lang.Util runtimeException "Util.java" 221]}],

zawlazaw commented 1 year ago

I ran into similar issues as @leopasta-enable here, and as @kRahul123 in https://github.com/relferreira/metabase-sparksql-databricks-driver/issues/21.

AFAICS, Metabase changed its Docker build environment multiple times over the last few versions, making it hard to provide a stable Dockerfile that works for multiple Metabase versions. Inspired by the current Metabase master, I came up with the following approach to build metabase-sparksql-databricks-driver:master against Metabase:0.46.6 and later versions.

a) create an empty directory and create the following Dockerfile in it (heavily inspired by Metabase's own Dockerfile) :

# Build stage: Node image plus JDK 11 and the Clojure CLI
FROM node:18-bullseye as builder

# Metabase git tag to build against
ARG MB_VERSION=v0.46.6

WORKDIR /home/node

# Install Java, curl, git, and the Clojure CLI tools
RUN apt-get update && apt-get upgrade -y && apt-get install openjdk-11-jdk curl git -y \
    && curl -O https://download.clojure.org/install/linux-install-1.11.1.1262.sh \
    && chmod +x linux-install-1.11.1.1262.sh \
    && ./linux-install-1.11.1.1262.sh

WORKDIR /home

# Fetch the Metabase source and the driver source
RUN git clone https://github.com/metabase/metabase metabase
RUN git clone https://github.com/relferreira/metabase-sparksql-databricks-driver.git driver

WORKDIR /home/metabase

# Pin Metabase to the requested version
RUN git checkout ${MB_VERSION}

# Build the driver with Metabase's build-drivers tooling
ENV DRIVER_PATH=/home/driver
RUN clojure -Sdeps "{:aliases {:sparksql-databricks {:extra-deps {com.metabase/sparksql-databricks {:local/root \"$DRIVER_PATH\"}}}}}" -X:build:sparksql-databricks build-drivers.build-driver/build-driver! "{:driver :sparksql-databricks, :project-dir \"$DRIVER_PATH\", :target-dir \"$DRIVER_PATH/target\"}"

# Export stage: contains only the compiled driver jar
FROM scratch as export
COPY --from=builder /home/driver/target/sparksql-databricks.metabase-driver.jar .

b) within that directory, run the following bash command:

DOCKER_BUILDKIT=1 docker build -t sparksql-databricks-metabase-driver-build --output . .

In the end, the directory contains the compiled file sparksql-databricks.metabase-driver.jar. There is no need to manually check out any git repo or to install anything other than Docker; everything happens within the container.
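Before handing the jar to Metabase, it can help to check that the build really produced it. The following is only a minimal sketch: `stage_driver` is a hypothetical helper, not part of this repo or of Metabase, and the target directory is an assumption (Metabase reads driver jars from its plugins directory, which can be set with the `MB_PLUGINS_DIR` environment variable).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the driver repo: copy the built jar
# into a plugins directory after checking that the build produced it.
stage_driver() {
  local jar="$1"          # path to sparksql-databricks.metabase-driver.jar
  local plugins_dir="$2"  # assumed: the directory Metabase scans for driver jars

  # Refuse to continue if the build did not leave a non-empty file behind.
  if [ ! -s "$jar" ]; then
    echo "driver jar missing or empty: $jar" >&2
    return 1
  fi

  # Create the target directory if needed, then copy the jar into it.
  mkdir -p "$plugins_dir" && cp "$jar" "$plugins_dir/"
}
```

For example, `stage_driver ./sparksql-databricks.metabase-driver.jar ./plugins` would fill a local `plugins` directory that could then be mounted into the Metabase container.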

bjgbeelen commented 1 year ago

@leopasta-enable This should ideally have been combined with #18, which contains the changes required to make the code compile. This is not the most elegant process; I wanted to merge both PRs right after each other, but I can't merge my own PR :)

It should work if you merge master into this branch (I successfully built a Docker image for this version).

bjgbeelen commented 1 year ago

@leopasta-enable if you create a PR with the exact same changes, I am able to merge it. I just verified again that the build succeeds when these changes are applied to master.

Alternatively, here is another mention for @relferreira to merge this PR instead.

leopasta-enable commented 1 year ago

It works now, thanks. It looks like I don't have permission to push a new branch to the repo, and I can't create a new PR on this branch (presumably because it would be identical to yours).