airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Add user management and login screen #3583

Open bashyroger opened 3 years ago

bashyroger commented 3 years ago

Tell us about the problem you're trying to solve

We have installed Airbyte in our own data center but want to make it available via the public internet. But Airbyte does not have user management functionality / a login screen to prevent unauthorized access. This means we have to do extra solution config work to prevent that kind of access.

Describe the solution you’d like

Describe the alternative you’ve considered or used

Using Airbyte via a VPN, a reverse proxy, or SSH. All of these involve extra config work for a feature that should already be there in a (self-hosted) SaaS tool.

raphaeltm commented 2 years ago

I just started poking around Airbyte out of curiosity, and while most of what I saw was awesome, this is something I found quite surprising. Especially since the "deploying Airbyte" instructions explain how to get Airbyte up and running on machines that have public IPs (at least in the case of Digital Ocean) but then don't mention anything like "hey, you'll want to make sure to do XYZ to make sure you aren't leaking data to the public because there is no auth system, and this UI is currently publicly accessible."

In DigitalOcean's case, it's pretty straightforward to use their Cloud Firewall product to prevent public access, but it would be nice to see limited access being the default. Even just a basic secret key system, where a key is defined in the .env file would be a good starting point. But I think a full-fledged auth system would be ideal, if for no other reason than audit logs would be nice to have.

shey commented 2 years ago

@raphaeltm, a few months ago I had the same problem: I needed to secure an Airbyte instance.

I ended up using oauth2-proxy. It's pretty cool. It's a reverse proxy you can put in front of a service. Once you do, accessing that service first requires you to log in with your organization's Google account.

raphaeltm commented 2 years ago

@shey, thanks for the heads up! That's a really cool project there, which I'll definitely be keeping in mind for the future.

I decided to go with a SaaS offering instead (mainly because Airbyte doesn't yet support one of the services we use, and we don't have the capacity to build the connector ourselves right now).

But in the meantime, the solution I was eyeing (just because I have experience with it) is to use Cloudflare's Argo Tunnel to expose Airbyte, and then Cloudflare's Access product to limit access to specific people.

jrhizor commented 2 years ago

Argo tunnel is a good option. We've also seen success with Google IAP / similar offerings that put an auth layer in front of APIs.

@raphaeltm just curious, what connector were you hoping for? Even if you're planning on going a different direction, we'd still appreciate a request on https://airbyte.com/connector-requests

raphaeltm commented 2 years ago

Hey @jrhizor I've already gone ahead and upvoted the Xero connector 😁

I also ran into an issue with Asana where it seems it isn't possible to do any filtering up front? The SaaS alternative we're testing allows us to select projects we want to sync. Though neither does what we actually want which is to filter by organization so that we're not syncing data from partners' Asana organizations which we've been invited to, which would be a big no-no 😕

joepbuhre commented 2 years ago

I would also really love a user management feature. Because this is primarily for businesses, it would be even better if it supported an OAuth2 provider such as Google or Azure. A JWT implementation would also be nice to have.

engmsaleh commented 2 years ago

Hi @shey, could you share your setup with oauth2-proxy? Do you have a docker-compose setup or a K8s one?

tinomerl commented 2 years ago

Hey @engmsaleh, we ended up with a similar setup to shey's, but we used Azure Active Directory. We have a docker-compose setup. Here are the parts I added to the Airbyte docker-compose.yml.

  # Redis stores the oauth2-proxy session data server-side
  cookie_storage:
    image: redis:latest
    expose:
      - 6379
    restart: always
  # oauth2-proxy sits in front of the Airbyte webapp and is the service published on port 8000
  auth:
    image: bitnami/oauth2-proxy
    env_file:
      - auth.env
    ports:
      - 8000:4180
    command: oauth2_proxy --http-address=0.0.0.0:4180 --reverse-proxy=true --skip-provider-button --session-store-type=redis --redis-connection-url=redis://cookie_storage:6379/1 --upstream=http://webapp:80 --provider=azure --redirect-url=<subdomain.for.airbyte.com>/oauth2/callback --email-domain=<@azure-ad-email-domain> --whitelist-domain=localhost --whitelist-domain=<subdomain.for.airbyte.com> --scope="profile User.Read" --cookie-secure=true --cookie-domain=<subdomain.for.airbyte.com>
    depends_on:
      - cookie_storage
      - webapp
    restart: always

In the auth.env file we have the following environment variables.

OAUTH2_PROXY_COOKIE_SECRET="somerandomstring12341234567890AB"
OAUTH2_PROXY_AZURE_TENANT="Azure-Tenant-ID"
OAUTH2_PROXY_CLIENT_ID="Azure-APP-Client-ID"
OAUTH2_PROXY_CLIENT_SECRET="Azure-APP-Client-Secret"
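
The cookie secret above is only a placeholder. As a minimal sketch, the snippet below generates a random value for `OAUTH2_PROXY_COOKIE_SECRET`. It assumes that oauth2-proxy expects a secret of 16, 24, or 32 bytes and accepts it in URL-safe base64 form; check the oauth2-proxy docs for your version. The function name `generate_cookie_secret` is made up for this illustration.

```python
import base64
import secrets


def generate_cookie_secret(num_bytes: int = 32) -> str:
    """Return `num_bytes` random bytes encoded as URL-safe base64 text."""
    # Assumption: the proxy wants an AES-sized key (16, 24, or 32 bytes).
    if num_bytes not in (16, 24, 32):
        raise ValueError("cookie secret should be 16, 24, or 32 bytes long")
    raw = secrets.token_bytes(num_bytes)  # cryptographically strong randomness
    return base64.urlsafe_b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Prints a line that can be pasted into auth.env.
    print(f'OAUTH2_PROXY_COOKIE_SECRET="{generate_cookie_secret()}"')
```

Each run prints a different value, so generate the secret once and keep it stable across restarts; otherwise existing sessions become invalid.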

We also deleted the port binding of the webapp container to port 8000 and gave only certain users access to the Azure app. We also created a subdomain specifically for Airbyte on our server to route the traffic.

Hope that helps.

engmsaleh commented 2 years ago

Hi @tinomerl, I really appreciate the info you shared, and thanks @shey for the initial suggestions. I was able to put together the following setup:

Nginx "not in docker-compose" ---> Oauth2-proxy "in docker-compose" ---> webapp

Nginx Config

server {
  listen      80;
  listen [::]:80;
  server_name internal.domain.com;

  location / {
    # Forward all traffic to oauth2-proxy, which listens on port 4180
    proxy_pass http://localhost:4180;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
  }
}

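I then used Certbot to get an SSL certificate for my domain.

I edited docker-compose.yaml as follows: I removed the port 8000 mapping from webapp to make it internal, and added oauth2-proxy.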

  webapp:
    image: airbyte/webapp:${VERSION}
    logging: *default-logging
    container_name: airbyte-webapp
    restart: unless-stopped
    environment:
      - AIRBYTE_ROLE=${AIRBYTE_ROLE:-}
      - AIRBYTE_VERSION=${VERSION}
      - API_URL=${API_URL:-}
      - IS_DEMO=${IS_DEMO:-}
      - FULLSTORY=${FULLSTORY:-}
      - TRACKING_STRATEGY=${TRACKING_STRATEGY}
      - INTERNAL_API_HOST=${INTERNAL_API_HOST}
      - OPENREPLAY=${OPENREPLAY:-}
      - PAPERCUPS_STORYTIME=${PAPERCUPS_STORYTIME:-}
  oauth2proxy:
    image: "bitnami/oauth2-proxy:latest"
    container_name: airbyte-oauth2proxy
    ports:
    - "4180:4180"
    command: [
      "--provider=google",
      "--cookie-secure=false",
      "--upstream=http://webapp:80",
      "--http-address=0.0.0.0:4180",
      "--skip-auth-regex=/forms/*",
      "--redirect-url=https://internal.domain.com/oauth2/callback",
      "--email-domain=domain.com"
    ]
    env_file:
      - .env
    depends_on: 
      - webapp

Adding the following key/value pairs into .env

OAUTH2_PROXY_CLIENT_ID=YOUR_OAUTH_CLIENT_ID.apps.googleusercontent.com
OAUTH2_PROXY_CLIENT_SECRET=YOUR_OAUTH_CLIENT_SECRET
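
One thing worth checking in the command above is `--skip-auth-regex=/forms/*`. The sketch below assumes the proxy applies the pattern as an unanchored regular-expression search against the request path (an assumption about oauth2-proxy's behavior, not something confirmed in this thread). Under that assumption, `/forms/*` means "`/forms` followed by zero or more slashes, anywhere in the path", which skips authentication for more paths than intended. The anchored pattern and the helper `skips_auth` are hypothetical and exist only for comparison.

```python
import re

# Pattern exactly as passed to --skip-auth-regex in the compose file.
ORIGINAL = re.compile(r"/forms/*")
# Hypothetical stricter alternative: only paths that start with /forms/.
ANCHORED = re.compile(r"^/forms/")


def skips_auth(pattern: re.Pattern, path: str) -> bool:
    """Return True if an unanchored search of `pattern` matches `path`."""
    return pattern.search(path) is not None


if __name__ == "__main__":
    for path in ["/forms/contact", "/formsomething", "/api/v1/forms", "/connections"]:
        print(path, skips_auth(ORIGINAL, path), skips_auth(ANCHORED, path))
```

If the proxy does match this way, anchoring the pattern keeps the skip rule limited to the paths that really should be public.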

@tinomerl I didn't understand the advantage of the cookie storage (Redis), so I left it out. Could you explain what it adds to the setup?

tinomerl commented 2 years ago

@engmsaleh looks good to me. Using Redis as cookie storage keeps the session information in the Redis database instead of storing the whole session in a client-side cookie. That is more secure for handling and refreshing the tokens. It also reduces the data transferred, because with cookie storage all session data is sent on every request. The official docs have a great comparison between the two ways of handling sessions.
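
To make the difference concrete, here is a toy sketch of the two session styles. It is not oauth2-proxy's implementation: a plain dict stands in for Redis, encryption and signing are left out, and the names `cookie_session` and `TicketStore` are made up for this illustration.

```python
import base64
import json
import secrets


def cookie_session(session: dict) -> str:
    """Client-side style: the whole session is serialized into the cookie value."""
    return base64.urlsafe_b64encode(json.dumps(session).encode("utf-8")).decode("ascii")


class TicketStore:
    """Server-side style: the session stays on the server, the cookie holds a ticket."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}  # stand-in for a Redis database

    def save(self, session: dict) -> str:
        ticket = secrets.token_urlsafe(16)  # short random identifier sent to the browser
        self._data[ticket] = session
        return ticket

    def load(self, ticket: str) -> dict:
        return self._data[ticket]


if __name__ == "__main__":
    # Made-up session contents; real tokens are often a kilobyte or more.
    session = {"email": "user@example.com", "access_token": "a" * 1000, "refresh_token": "r" * 500}
    store = TicketStore()
    ticket = store.save(session)
    print("cookie-only size:", len(cookie_session(session)))
    print("ticket size:", len(ticket))
```

With the ticket approach, the tokens never leave the server and the browser sends only a short identifier on each request.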

CarlosACQ commented 2 years ago

Can you make a video please.

engmsaleh commented 2 years ago

Hi @CarlosACQ, I may write a tutorial later, but it depends on your setup. What is your current setup?

CarlosACQ commented 2 years ago

Hi, Ubuntu 20 on a VPS, and docker.

CarlosACQ commented 2 years ago

Hi! Ubuntu 20 (VPS) and Docker.

afonsoaugusto commented 2 years ago

Hi, about this feature (user management and login screen): is there a plan to add it, or is it on the roadmap? Thanks.

shey commented 2 years ago

@CarlosACQ I wrote a brief tutorial on setting up oauth2-proxy with nginx. Hope it's helpful.

jzhang-georgian commented 2 years ago

For folks who want to use IAP to secure Airbyte, try this tutorial.