acryldata / datahub-actions

DataHub Actions is a framework for responding to changes to your DataHub Metadata Graph in real time.

Ingestion Executor - Failed to configure the source (mssql): No module named 'pyodbc' #92

Closed waqassiddiqi closed 1 year ago

waqassiddiqi commented 1 year ago

I am trying to configure an MSSQL Server source via the UI, but it fails with "Failed to configure the source (mssql): No module named 'pyodbc'", which suggests that the required pip dependency is missing. I even tried extending the acryldata/datahub-actions:v0.0.11 Docker image, but that didn't work either, I believe because the executor creates a venv based on predefined requirements.

Is there any way to specify additional dependencies that need to be installed (pyodbc in this instance) to ingest data from the UI? Any help or direction is highly appreciated.

waqassiddiqi commented 1 year ago

For those looking for a solution, I found one thanks to a community member on Slack:

  1. Modify /usr/local/bin/ingestion_common.sh by adding the --system-site-packages flag where the venv is created, i.e. on line 36: python3 -m venv --system-site-packages $venv_dir
  2. Use the acryldata/datahub-actions base image to build an image with pyodbc and the other required dependencies installed

The Dockerfile I used:
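
Step 1 can also be scripted rather than edited by hand. This is a minimal sketch, assuming the script creates the venv with a plain "python3 -m venv" call as quoted above. add_system_site_packages is a hypothetical helper name, not something shipped with datahub-actions.

```shell
#!/bin/sh
# Hypothetical helper: add --system-site-packages to each "python3 -m venv"
# call in the given script. Lines that already carry the flag are left alone,
# so running it twice is harmless.
add_system_site_packages() {
    script="$1"
    tmp="$(mktemp)" || return 1
    # Write the patched text to a temp file, then overwrite the original
    # in place so its mode and owner are preserved.
    sed '/--system-site-packages/!s/python3 -m venv /python3 -m venv --system-site-packages /' \
        "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example, run against a local copy of the script:
#   add_system_site_packages ./ingestion_common.sh
```

With the flag set, the venv that the executor creates can import packages installed into the image's system Python, which is why installing pyodbc with pip in the image is enough.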

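FROM acryldata/datahub-actions:v0.0.11

# Package installation requires root.
USER root

# Register Microsoft's Debian 11 package repository.
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/11/prod.list > /etc/apt/sources.list.d/mssql-release.list

# Install the ODBC driver and headers, then pyodbc into the system Python.
RUN apt-get update
RUN ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev tdsodbc
RUN pip install pyodbc

# Replace the stock script with the copy that passes --system-site-packages.
COPY ingestion_common.sh /usr/local/bin

# Switch back to the datahub user.
USER datahub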
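
The COPY line expects a modified ingestion_common.sh in the build context. One way to get the stock file out of the base image is sketched below using the standard docker create, docker cp, and docker rm commands. extract_ingestion_script is a hypothetical helper name, and the path is the one given in step 1.

```shell
#!/bin/sh
# Hypothetical helper: copy the stock ingestion_common.sh out of an image
# without starting it. Arguments: image reference, destination path.
extract_ingestion_script() {
    image="$1"
    dest="$2"
    # "docker create" makes a stopped container and prints its ID.
    cid="$(docker create "$image")" || return 1
    docker cp "$cid:/usr/local/bin/ingestion_common.sh" "$dest"
    status=$?
    # Remove the temporary container whether or not the copy succeeded.
    docker rm "$cid" >/dev/null
    return "$status"
}

# Example (requires Docker; the tag "datahub-actions-mssql" is an arbitrary choice):
#   extract_ingestion_script acryldata/datahub-actions:v0.0.11 ./ingestion_common.sh
#   ... edit the venv line as described in step 1 ...
#   docker build -t datahub-actions-mssql .
```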

gesundes commented 1 year ago

@waqassiddiqi Why was this issue closed? It wasn't fixed at the project level. I ran into this issue, and I think others have too. Is it possible to reopen it?

hsheth2 commented 1 year ago

@gesundes if you set your source type to mssql-odbc, does that work?