MarquezProject / marquez

Collect, aggregate, and visualize a data ecosystem's metadata
https://marquezproject.ai
Apache License 2.0

Connection error PostgreSQL and Marquez #2468

Open TomEijk opened 1 year ago

TomEijk commented 1 year ago

Hi all!

I really appreciate this repository and would love to use it for its data lineage aspect! For the last few days, I've been trying to get the Marquez API Docker image up and running, but whatever I try, it does not work.

It returns the following errors in Docker Desktop:

2023-04-03 18:07:04 marquez-db | 2023-04-03 23:07:04.726 GMT [35] FATAL: password authentication failed for user "marquez"
2023-04-03 18:07:04 marquez-db | 2023-04-03 23:07:04.726 GMT [35] DETAIL: Role "marquez" does not exist.

and

ERROR [2023-04-03 22:44:31,217] org.apache.tomcat.jdbc.pool.ConnectionPool: Unable to create initial connections of pool.
2023-04-03 17:44:31 ! org.postgresql.util.PSQLException: FATAL: password authentication failed for user "marquez"

I followed all the steps in the README.md file to configure the PostgreSQL database for Marquez and to set the correct environment variables. While troubleshooting, I found a similar issue on StackOverflow that warns against using the same port for Postgres and Marquez, but it has not helped me solve the issue so far (https://stackoverflow.com/questions/62115827/org-postgresql-util-psqlexception-fatal-password-authentication-failed-for-use).

Could you please help me out?

Kind regards,

Tom

boring-cyborg[bot] commented 1 year ago

Thanks for opening your first issue in the Marquez project! Please be sure to follow the issue template!

rossturk commented 1 year ago

That looks to me like the database volumes haven't been created.

I would recommend trying docker/up.sh --no-web to accomplish this. That script does some preparation tasks (which you can see inside docker/volumes.sh if you want to replicate them) and then calls docker-compose to bring everything up.

The prep steps you are following in README.md are, I believe, for running outside of Docker.

TomEijk commented 1 year ago

Thank you so much for your reply Ross!

Unfortunately, the logs in Docker Desktop still return the same error when I use docker/up.sh --no-web. I guess it has something to do with my PostgreSQL database. These are the steps I follow to try to get the Marquez API image running:

  1. Create an on-premise PostgreSQL connection with server marquez, database marquez, user marquez and port 5433 (so that my PostgreSQL database and the Marquez Docker container are not on the same port).
  2. Build the project in my IDE via ./gradlew build in the marquez directory.
  3. Set the following environment variables in my IDE: POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_PORT (to 5433).
  4. Run docker/up.sh --no-web in the marquez directory.

The strange thing is that even when I set $Env:POSTGRES_USER='test', the Docker Desktop logs still return this error, which names the user "marquez":

ERROR [2023-04-03 22:44:31,217] org.apache.tomcat.jdbc.pool.ConnectionPool: Unable to create initial connections of pool.
2023-04-03 17:44:31 ! org.postgresql.util.PSQLException: FATAL: password authentication failed for user "marquez"

So perhaps there is something hard coded? Changing the environment variables does not seem to have any effect.
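One way to narrow this down is to confirm that the variables set in the shell are actually inherited by child processes. The script below is only a minimal sketch: the function name `visible_postgres_env` is made up for illustration, and the variable names are the four listed in the steps above. It shows what a process started from the same shell can see; it says nothing about whether Docker Compose forwards those values into the containers.

```python
import os

# Variable names taken from the steps listed in this comment.
POSTGRES_VARS = ("POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_PORT")


def visible_postgres_env(environ=None):
    """Return the POSTGRES_* settings visible in the given environment.

    `environ` defaults to the environment of the current process.
    The password value is never returned, only whether it is set.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for name in POSTGRES_VARS:
        value = environ.get(name)
        if value is None:
            report[name] = "<not set>"
        elif name == "POSTGRES_PASSWORD":
            report[name] = "<set, hidden>"
        else:
            report[name] = value
    return report


if __name__ == "__main__":
    for name, value in visible_postgres_env().items():
        print(f"{name}={value}")
```

If the script prints the expected values but the container logs still mention "marquez", the values are probably being dropped or overridden somewhere between the shell and the container rather than in the shell itself.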

Kind regards,

Tom

TomEijk commented 1 year ago

After some trial and error, I noticed that the script does not run to completion in Git Bash, which means the error actually occurs further upstream. It returns:

the input device is not a TTY. If you are using mintty, try prefixing the command with 'winpty'.

I guess this means the volumes indeed haven't been created (so scripts like init-db.sh probably didn't run either, which could explain the Marquez authentication error). I have already tried several adjustments to my .bashrc file, but it still returns the error. Please let me know if anyone has a solution for this.
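For context, Docker prints that message when a command requests a terminal but its standard input is not attached to one, which is how mintty-based shells such as Git Bash typically appear to native Windows programs. The check below is a small sketch (the helper name `tty_status` is hypothetical) that reports what a program started from the current shell sees. Running it from Git Bash and from PowerShell should show whether the shell is the difference.

```python
import sys


def tty_status(stream):
    """Describe whether a stream is attached to a terminal."""
    return "a TTY" if stream.isatty() else "not a TTY"


if __name__ == "__main__":
    # Programs that ask for an interactive terminal need stdin to be a TTY.
    print(f"stdin is {tty_status(sys.stdin)}")
    print(f"stdout is {tty_status(sys.stdout)}")
```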

Small side note: The DB container runs fine on its own when using docker compose -f docker-compose.db.yml up

Kind regards,

Tom

genegc commented 1 year ago

I'm having the same issue. I'm on Windows 11 with Docker 24.0.2. I've tried Marquez releases 0.34.0 and 0.35.0, and I've also tried doing a build. Here's an excerpt of the logging I'm seeing:

marquez-db | 2023-06-14 18:22:42.266 GMT [1] LOG: database system is ready to accept connections
marquez-api | wait-for-it.sh: waiting 15 seconds for db:5432
marquez-db | 2023-06-14 18:22:43.438 GMT [34] LOG: incomplete startup packet
marquez-api | wait-for-it.sh: db:5432 is available after 0 seconds
marquez-api | WARNING 'MARQUEZ_CONFIG' not set, using development configuration.
marquez-web | [HPM] Proxy created: /api/v1 -> http://api:5000/
marquez-web | App listening on port 3000!
marquez-api | INFO [2023-06-14 18:22:45,500] org.eclipse.jetty.util.log: Logging initialized @2053ms to org.eclipse.jetty.util.log.Slf4jLog
marquez-api | INFO [2023-06-14 18:22:45,610] io.dropwizard.server.DefaultServerFactory: Registering jersey handler with root path prefix: /
marquez-api | INFO [2023-06-14 18:22:45,613] io.dropwizard.server.DefaultServerFactory: Registering admin handler with root path prefix: /
marquez-api | INFO [2023-06-14 18:22:45,614] io.dropwizard.assets.AssetsBundle: Registering AssetBundle with name: graphql-playground for path /graphql-playground/*
marquez-api | INFO [2023-06-14 18:22:45,636] marquez.MarquezApp: Running startup actions...
marquez-api | INFO [2023-06-14 18:22:45,732] org.flywaydb.core.internal.license.VersionPrinter: Flyway Community Edition 8.5.13 by Redgate
marquez-api | INFO [2023-06-14 18:22:45,732] org.flywaydb.core.internal.license.VersionPrinter: See what's new here: https://flywaydb.org/documentation/learnmore/releaseNotes#8.5.13
marquez-api | INFO [2023-06-14 18:22:45,732] org.flywaydb.core.internal.license.VersionPrinter:
marquez-db | 2023-06-14 18:22:45.916 GMT [35] FATAL: password authentication failed for user "marquez"
marquez-db | 2023-06-14 18:22:45.916 GMT [35] DETAIL: Role "marquez" does not exist.
marquez-db | Connection matched pg_hba.conf line 95: "host all all all md5"
marquez-api | ERROR [2023-06-14 18:22:45,932] org.apache.tomcat.jdbc.pool.ConnectionPool: Unable to create initial connections of pool.
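The key detail in this excerpt is the DETAIL line: Postgres reports that the role does not exist at all, which is a different situation from a wrong password for an existing role. The script below is a rough sketch for pulling that distinction out of saved log text. The function name `diagnose` and the result keys are made up for illustration, and the regular expressions are written against the log lines quoted in this thread only.

```python
import re

# Patterns written against the Postgres server lines quoted in this thread.
AUTH_FAILED = re.compile(
    r'FATAL:\s+password authentication failed for user "(?P<user>[^"]+)"'
)
MISSING_ROLE = re.compile(r'DETAIL:\s+Role "(?P<role>[^"]+)" does not exist')


def diagnose(log_text):
    """Split failed logins into 'role is missing' and 'other' cases."""
    failed = {m.group("user") for m in AUTH_FAILED.finditer(log_text)}
    missing = {m.group("role") for m in MISSING_ROLE.finditer(log_text)}
    return {
        # Roles the server says were never created.
        "missing_roles": sorted(missing),
        # Failed logins with no matching "does not exist" detail.
        "other_auth_failures": sorted(failed - missing),
    }


if __name__ == "__main__":
    sample = (
        'marquez-db | 2023-06-14 18:22:45.916 GMT [35] FATAL: '
        'password authentication failed for user "marquez"\n'
        'marquez-db | 2023-06-14 18:22:45.916 GMT [35] DETAIL: '
        'Role "marquez" does not exist.\n'
    )
    print(diagnose(sample))
```

A role listed under `missing_roles` suggests the database was initialized without it, which matches the earlier guess in this thread that the initialization steps never ran. Note that the DETAIL line appears only in the database container's log, not in the API's exception.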

NavneetSajwan commented 9 months ago

I am experiencing the same issue.

schandir commented 1 week ago

@TomEijk @NavneetSajwan @genegc, I had a similar issue and fixed it by setting the default value for the Postgres server to localhost in the file 'marquez\marquez.dev.yml'.

Let me know if that works for you and I can raise a PR