airbytehq / write-for-the-community

Contribute and collaborate on educational content for the Airbyte Community.
MIT License

Set up Google Cloud Platform Postgres as a source in Airbyte #157

Open RoseateSpoonbill opened 2 years ago

RoseateSpoonbill commented 2 years ago

How can I connect to a GCP Postgres database as an Airbyte source?

  1. What setup steps do I need to take?
  2. What values do I use for the new Airbyte source configuration? (and where do I get them?)

I specifically need this information for the following:

  1. Airbyte Deployment locations
    1. Local Airbyte deployment
    2. GCP Compute Engine
    3. On Kubernetes
  2. Methods of authorizing to GCP Postgres
    1. Cloud SQL Auth proxy
    2. Self-managed SSL/TLS certificates
    3. Authorized networks

It would be great if https://github.com/airbytehq/airbyte/blob/master/docs/integrations/sources/postgres.md could be updated with this information.

Stretch

To help the widest audience, the tutorial should ideally cover the same information for every other place Airbyte can be deployed:

  1. Airbyte Cloud
  2. On AWS (EC2)
  3. On AWS ECS
  4. On Azure (VM)
  5. On Plural
  6. On Oracle Cloud Infrastructure VM
  7. On DigitalOcean (Droplet)

RoseateSpoonbill commented 2 years ago

I ended up not being able to figure out how to do this locally.

⚠️ THIS DOES NOT WORK ⚠️

This is as far as I got. I couldn't figure out either the right host name or what was wrong with my addition to docker-compose.yaml.

  1. In your Airbyte repository, open your docker-compose.yaml file and add the following at the bottom of the top-level services section:
      proxy:
        container_name: cloud-sql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.30.0
        volumes:
          - ..\{{FULL_PATH_TO_SERVICE_ACCOUNT_JSON_FILE}}:/config
        ports:
          - 127.0.0.1:5436:5436
        command: "/cloud_sql_proxy -instances={{INSTANCE_CONNECTION_NAME}}=tcp:5436 -credential_file=/config/{{SERVICE_ACCOUNT_JSON_FILE_NAME}}"
  2. Follow the instructions in the Start Airbyte section above
  3. Once Airbyte loads, click Sources in the left-hand menu
  4. Click + add new source
    1. Choose Postgres from the dropdown menu
    2. Host: ❓
    3. Port: 5436
    4. Fill in other fields
    5. Replication Method: Standard
    6. SSH Tunnel Method: No Tunnel
    7. Click Set up source
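
For anyone retrying the local attempt above, here is an untested sketch of how the proxy service might be adjusted. It makes two changes that follow the pattern in Google's Docker instructions for the v1 proxy image, and both are worth checking against that page: the listener binds to 0.0.0.0 inside the container (`tcp:0.0.0.0:5436`) instead of the container's own loopback address, and the key file itself is mounted at /config so that `-credential_file` points at /config directly. The `:ro` flag and the quoting are additions for illustration; the placeholders are the same ones used in the attempt above.

```yaml
services:
  proxy:
    container_name: cloud-sql-proxy
    image: gcr.io/cloudsql-docker/gce-proxy:1.30.0
    volumes:
      # Assumption: mount the key FILE (not its directory) at /config, read-only.
      - "{{FULL_PATH_TO_SERVICE_ACCOUNT_JSON_FILE}}:/config:ro"
    ports:
      # Publish the proxy port on the host's loopback interface only.
      - "127.0.0.1:5436:5436"
    # Assumption: bind to 0.0.0.0 so that connections arriving through the
    # published port or from other containers can reach the listener.
    command: "/cloud_sql_proxy -instances={{INSTANCE_CONNECTION_NAME}}=tcp:0.0.0.0:5436 -credential_file=/config"
```

Even with these changes, the Host value remains something to verify rather than a known answer. If the container that runs the Postgres source shares the Compose network with this service, the service name `proxy` should resolve. If it runs on the host network instead, `localhost` (Linux) or `host.docker.internal` (Docker Desktop) are the candidates to try.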

⚠️ This also did not work ⚠️

I also tried running the Cloud SQL Auth proxy in a separate Docker container (per https://cloud.google.com/sql/docs/postgres/connect-admin-proxy#connecting-docker), but I couldn't figure out:

  1. Which Airbyte Docker container needed to connect to the Proxy Docker container
  2. How to connect them

RoseateSpoonbill commented 2 years ago

I did get the combination of GCP Compute Engine and the Cloud SQL Auth proxy working, and I have submitted a PR to update https://docs.airbyte.com/deploying-airbyte/on-gcp-compute-engine for that use case: https://github.com/airbytehq/airbyte/pull/12086