tokern / data-lineage

Generate and Visualize Data Lineage from query history
https://tokern.io/data-lineage/
MIT License

tokern_worker container keeps restarting with error: /docker-entrypoint.sh: 11: exec: rq: not found #109

Open kevindany opened 1 year ago

kevindany commented 1 year ago

Steps to reproduce:

  1. wget https://raw.githubusercontent.com/tokern/data-lineage/master/install-manifests/docker-compose/tokern-lineage-engine.yml
  2. configure an external Postgres database by changing the following parameters in tokern-lineage-engine.yml: CATALOG_HOST, CATALOG_USER, CATALOG_PASSWORD, CATALOG_DB
  3. docker-compose -f tokern-lineage-engine.yml up -d
  4. run: docker ps (to list the status)
  5. run: docker logs -f tokern_worker (to get the logs)

ignaski commented 1 year ago

@kevindany were you able to solve this?

hendrix04 commented 1 year ago

I am on a Mac M1 and I am running into the same problem. It's also worth pointing out that the Python Docker image they are using as a base has 21 critical vulnerabilities.

hendrix04 commented 1 year ago

Looking a bit more into this, I can confirm that, for some reason, the rq and redis packages are not installed in the Python environment.
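
For anyone who wants to repeat that check, here is a minimal sketch that could be run with the Python interpreter inside the worker container. The helper names `missing_modules` and `missing_executables` are made up for illustration and are not part of the project; the sketch only assumes that the worker needs the `rq` and `redis` modules and an `rq` command on PATH, as the error message suggests.

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_executables(names):
    """Return the command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Both lists are empty when the environment is set up as expected.
    print("missing modules:", missing_modules(["rq", "redis"]))
    print("missing executables:", missing_executables(["rq"]))
```

If `rq` shows up in both lists, the package was most likely never installed, which would match the `exec: rq: not found` error from the entrypoint script.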

hendrix04 commented 1 year ago

Looking a bit more into this, I think that I have found the issue (though I haven't been able to successfully build my own docker container).

It seems that Poetry has changed how it should be installed, and https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py now returns a 404. It is very likely that nothing is getting installed via Poetry, and I would be shocked if any of the containers work properly.

The new URL is https://install.python-poetry.org.
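
As a rough sketch of the change, the retired installer URL can be swapped for the new one wherever it appears in a Dockerfile. The function name `update_installer_url` and the sample `RUN` line are hypothetical and only for illustration; the project's actual Dockerfile may invoke the installer differently.

```python
OLD_INSTALLER_URL = "https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py"
NEW_INSTALLER_URL = "https://install.python-poetry.org"


def update_installer_url(dockerfile_text):
    """Replace every occurrence of the retired installer URL with the new one."""
    return dockerfile_text.replace(OLD_INSTALLER_URL, NEW_INSTALLER_URL)


# Hypothetical Dockerfile line, for illustration only; the real file may differ.
sample = f"RUN curl -sSL {OLD_INSTALLER_URL} | python\n"
print(update_installer_url(sample))
```

Swapping the URL alone may not be enough: the new installer may place Poetry in a different directory than the old script did, so any PATH settings in the Dockerfile might need adjusting as well.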

hendrix04 commented 1 year ago

I created a fix for this at https://github.com/hendrix04/data-lineage/tree/poetry_fix. I have verified that all containers launch without going into a restart loop, but I haven't tested that the functionality actually works.
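
A small sketch for checking the restart-loop part: it assumes the container list was captured with `docker ps -a --format '{{.Names}}\t{{.Status}}'`, which prints one tab-separated name and status per line. The helper name `restarting_containers` and the sample output are made up for illustration; only `tokern_worker` comes from this issue.

```python
def restarting_containers(ps_output):
    """Return names of containers whose status column starts with 'Restarting'."""
    names = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, status = line.partition("\t")
        if status.strip().startswith("Restarting"):
            names.append(name.strip())
    return names


# Hypothetical docker ps output, for illustration only.
sample = (
    "tokern_worker\tRestarting (127) 5 seconds ago\n"
    "other_container\tUp 2 minutes\n"
)
print(restarting_containers(sample))
```

An empty list after the containers have been up for a minute or so suggests that none of them is stuck in a restart loop, though it says nothing about whether lineage generation itself works.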