airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
15.55k stars 4.01k forks source link

Destination Postgres: Problem connecting postgres database in docker #33971

Closed augcollet closed 8 months ago

augcollet commented 8 months ago

Topic

Postgres docker connector

Relevant information

Hello !

I'm facing some issues when trying to use postgres connector between Docker Airbyte and Docker Postgres Database, and I dont really understand my observations.

First I've created a specific network :

docker network create custom-network

I've created a docker postgres container with this docker-compose.

version: "3.9"
services:

  postgres:
    image: postgres-db/postgres
    restart: always
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    ports:
      - ${POSTGRES_PORT}:5432
    volumes: 
      - ./volumes/data:/var/lib/postgresql/data

networks: 
  default: 
    external: 
      name: custom-network

The database is working and I can connect to postgres database from dbeaver.

Then I'm installing Airbyte successfully with the command provided, and I add each container to my custom network.

(cd airbyte && ./run-ab-platform.sh -b)
for container_id in $(docker ps --format '{{.Names}}' | grep 'airbyte'); do
    echo "Add to platform network $container_id"
    docker network connect custom-network $container_id
done

Now, when I try to create postgres connector using "postgres" as hostname (corresponding to container name) I get the following : image

Internal message: io.airbyte.commons.exceptions.ConnectionErrorException: java.sql.SQLTransientConnectionException: 
HikariPool-1 - Connection is not available, request timed out after 10002ms.
Failure origin: destination
Failure type: config_error
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
    at org.postgresql.Driver.makeConnection(Driver.java:402)
    at org.postgresql.Driver.connect(Driver.java:261)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138)
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:359)
    at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:201)
    at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:470)
    at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:733)
    at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:712)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.UnknownHostException: postgres
    at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:567)
    at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
    at java.base/java.net.Socket.connect(Socket.java:633)
    at org.postgresql.core.PGStream.createSocket(PGStream.java:241)
    at org.postgresql.core.PGStream.<init>(PGStream.java:98)
    at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:109)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:235)
    ... 14 more
2024-01-03 20:49:14 platform > Docker volume job log path: /tmp/workspace/5143ad9e-0d2d-4469-b42e-9d0370663796/0/logs.log
2024-01-03 20:49:14 platform > Executing worker wrapper. Airbyte version: 0.50.40
2024-01-03 20:49:14 platform > Attempt 0 to save workflow id for cancellation
2024-01-03 20:49:14 platform > 
2024-01-03 20:49:14 platform > ----- START check-orchestrator -----
2024-01-03 20:49:14 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2024-01-03 20:49:14 platform > 
2024-01-03 20:49:14 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2024-01-03 20:49:14 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2024-01-03 20:49:14 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2024-01-03 20:49:14 platform > Using default value for environment variable LAUNCHDARKLY_KEY: ''
2024-01-03 20:49:14 platform > Checking if airbyte/destination-postgres:0.4.0 exists...
2024-01-03 20:49:14 platform > airbyte/destination-postgres:0.4.0 was found locally.
2024-01-03 20:49:14 platform > Creating docker container = destination-postgres-check-5143ad9e-0d2d-4469-b42e-9d0370663796-0-nsvog with resources io.airbyte.config.ResourceRequirements@6fac8953[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2024-01-03 20:49:14 platform > Preparing command: docker run --rm --init -i -w /data/5143ad9e-0d2d-4469-b42e-9d0370663796/0 --log-driver none --name destination-postgres-check-5143ad9e-0d2d-4469-b42e-9d0370663796-0-nsvog --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/destination-postgres:0.4.0 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e FIELD_SELECTION_WORKSPACES= -e USE_STREAM_CAPABLE_STATE=true -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE= -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT=config -e AIRBYTE_VERSION=0.50.40 -e WORKER_JOB_ID=5143ad9e-0d2d-4469-b42e-9d0370663796 airbyte/destination-postgres:0.4.0 check --config source_config.json
2024-01-03 20:49:14 platform > Reading messages from protocol version 0.2.0
2024-01-03 20:49:17 platform > INFO i.a.i.d.p.PostgresDestination(main):98 starting destination: class io.airbyte.integrations.destination.postgres.PostgresDestination
2024-01-03 20:49:17 platform > INFO i.a.i.b.IntegrationCliParser(parseOptions):126 integration args: {check=null, config=source_config.json}
2024-01-03 20:49:17 platform > INFO i.a.i.b.IntegrationRunner(runInternal):106 Running integration: io.airbyte.integrations.base.ssh.SshWrappedDestination
2024-01-03 20:49:17 platform > INFO i.a.i.b.IntegrationRunner(runInternal):107 Command: CHECK
2024-01-03 20:49:17 platform > INFO i.a.i.b.IntegrationRunner(runInternal):108 Integration config: IntegrationConfig{command=CHECK, configPath='source_config.json', catalogPath='null', statePath='null'}
2024-01-03 20:49:17 platform > WARN c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword

However, "postgres" hostname is recognized from any Airbyte container : image ping is not installed, but I think that the command show that networking is ok.

The strangest thing is that it works when I use 'localhost' instead of 'postgres' as host : image

This contradicts my understanding of docker. In principle, in this scenario, "localhost" should refer to the container itself and that should not work.

Could you help me understand this behavior? And to find a clean solution so that the connector can work using the 'postgres' host?

I thank you in advance !

augcollet commented 8 months ago

To add some informations, it's working when I use the IP address of the container :

docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' postgres_postgres_1

image