hasura / graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.
https://hasura.io
Apache License 2.0
31.08k stars 2.77k forks source link

Stored metadata connection string overwritres environmental variable (v2) #7270

Open ro-savage opened 3 years ago

ro-savage commented 3 years ago

Version: 2.0.1 Database: Postgres 12.6 Version: Self-hosted

We are using a docker-image and we regularly change databases, and we regularly copy databases between postgres instances (pg_dump and pg_restore). With v1 we were using an env vars to change between databases.

However since v2, hasura will now use the environment variable to connect to a database, then read the metadata stored in hdb_metadata and proceed to update the database connection to what is stored in the meta data.

This causes hasura to be looking at the incorrect database. Worse for us, is because these databases are identical there is no way for us to know without manually checking the connection string.

We've had to write a script that will drop hdb_catalog after each pg_dump.

I am not sure of the best solution to this. But it seems very strange behaviour to use the connection string, connect to the database, read the metadata, and instantly connect to entirely different database (the logs also do not show this reconnection. It just shows the initial database connection)

tirumaraiselvan commented 3 years ago

Ideally, the metadata config for the database url should also reference an env var. You can do this by editing the database:

image

ro-savage commented 3 years ago

Thanks @tirumaraiselvan

I can confirm database URL was checked. We are using V2 of metadata.

We have never manually entered a database URL or connection params. On our live servers, we used HASURA_GRAPHQL_DATABASE_URL on 1.3.3 and then upgraded to 2.0.1. Did Hasura automatically change to database URL as the default and copy our HASURA_GRAPHQL_DATABASE_URL connection string to the metadata? (This would also appear to be potential security issue)

We also spin up new hasura docker instances automatically using an ENV_VAR and I can see all these have automatically been selected as Database URL. It appears to automatically copy our ENV_VAR to to the database URL metadata string and then select Database URL? This seems like incorrect behaviour and that it should default to HASURA_GRAPHQL_DATABASE_URL when hasura is first booted using the HASURA_GRAPHQL_DATABASE_URL

From reading the documentation, with V3 metadata we can specifically create a database.yaml with

database_url:
  from_env: HASURA_GRAPHQL_DATABASE_URL

We've now upgraded to V3, which should fix this issue for us. However, it feels like V2 Metadata with Hasura v2.0.0 should not automatically switch users to "Database URL" when they were using ENV VARS.

tirumaraiselvan commented 3 years ago

We have never manually entered a database URL or connection params. On our live servers, we used HASURA_GRAPHQL_DATABASE_URL on 1.3.3 and then upgraded to 2.0.1. Did Hasura automatically change to database URL as the default and copy our HASURA_GRAPHQL_DATABASE_URL connection string to the metadata? (This would also appear to be potential security issue)

It seems this could happen if you were using Heroku quick deploy guide to setup Hasura : https://hasura.io/docs/latest/graphql/core/deployment/deployment-guides/heroku.html#deploy-heroku

The issue is that Heroku adds the env variable DATABASE_URL to your application when you add the PG addon. Then, Hasura has to use the server flag --database-url $DATABASE_URL to get the connection string (see: https://github.com/hasura/graphql-engine-heroku/blob/master/Dockerfile) . If there was a way to set the env HASURA_GRAPHQL_DATABASE_URL during initialization then this wouldn't be needed.

Now, effectively, what Hasura sees is a static connection string being provided to it. Hence when it rewrites metadata for v2.0, it writes the static string. This is definitely surprising but unforunately, there is no way for Hasura to know that the value is coming from an env. We will add a step to our upgrade guide to notify Heroku users of this behaviour: https://hasura.io/docs/latest/graphql/core/guides/upgrade-hasura-v2.html#upgrade-hasura-v2 .

Thanks for bringing this to our attention!

tirumaraiselvan commented 3 years ago

Also, we are checking if there are any potential bugs if you are using HASURA_GRAPHQL_DATABASE_URL and not the startup flag --database-url $DATABASE_URL . Will update here.

ro-savage commented 3 years ago

Thanks for the investigation Tiru.

We don't link the PG addon in Heroku to our Hasura instances and we do not have a DATABASE_URL for our Hasura apps.

We instead control this via github actions during deployment. We simply add the HASURA_GRAPHQL_DATABASE_URL env var before starting the container, and update on each deployment as required.

Did the hdb_metadata field, along with the connection string exist in hasursa 1.x.x? My understand is it didn't, as we never encountered the issue when using Hasura 1.x.x

tirumaraiselvan commented 3 years ago

Hmm...I am not able to reproduce this locally. This is what I did:

Run v1.3.3 with HASURA_GRAPHQL_DATABASE_URL, Upgrade to v2.0.3, Export Metadata -> See "database_url": { "from_env": "HASURA_GRAPHQL_DATABASE_URL" } in the metadata.

Can you share the steps in your action so I can try to reproduce it?

This is my local docker-compose if you want to follow my steps:

version: '3.6'
services:
  postgres:
    image: postgres:12
    restart: always
    ports:
    - "6432:5432"
    environment:
      POSTGRES_PASSWORD: "password"
    volumes:
    - db_data:/var/lib/postgresql/data
  graphql-engine:
    image: hasura/graphql-engine:v1.3.3 #change it to v2.0.3 
    ports:
    - "8080:8080"
    depends_on:
    - "postgres"
    restart: always
    environment:
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:password@postgres:5432/postgres
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true"
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: "startup, http-log, websocket-log, livequery-poller-log, query-log"
      HASURA_GRAPHQL_CONSOLE_ASSETS_DIR: /srv/console-assets
ro-savage commented 3 years ago

@tirumaraiselvan - We don't have any v1.3.3 at the moment, since we upgraded a few weeks ago. If required I can checkout an old version.

However with metadata V2, and Hasura 2.0.3. Hasura is storing the connection string in the metadata automatically for us.

We drop the database and recreate it and it doesn't have hdb_catalog.

Then we start our docker image, and it creates hdb_catalog, including writing the string value database connection in into metadata rather than using the ENV VAR.

We have also tried v3 meta data (using from_env: HASURA_GRAPHQL_DATABASE_URL), and finding that it copies the ENV VAR into the connection string as well. So we are continuing to investigate why this might be occurring, if there is anything else that could be setting it.

Here is our docker-compose. No where in our v2 metadata files do we write anything about the database, or the database url.

version: "3.6"

services:
  postgres:
    image: circleci/postgres:12-alpine
    restart: always
    ports:
      - "23001:5432"
    environment:
      - POSTGRES_PASSWORD=somepass
    volumes:
      - pgdata:/var/lib/postgresql/data

  hasura:
    image: hasura/graphql-engine:v2.0.3.cli-migrations-v2
    ports:
      - "8080:8080"
    restart: always
    volumes:
      - ./services/hasura/metadata:/hasura-metadata
    env_file:
      - .env
    # Your .env file should contain:
    # External PostgreSQL:
    #   HASURA_GRAPHQL_DATABASE_URL=postgres://[username]@host.docker.internal:5432/runn_development
    # Docker PostgresQL:
    #   HASURA_GRAPHQL_DATABASE_URL=postgres://postgres:somepass@postgres:5432/runn_development
    environment:
      HASURA_ACTIONS_AUTHORIZATION_HEADER: "Bearer ${RUBY_ACTIONS_API_SECRET}"
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true"
      HASURA_GRAPHQL_DEV_MODE: "true"
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
      HASURA_GRAPHQL_ADMIN_SECRET: "somesecret"
      HASURA_GRAPHQL_JWT_SECRET: '{ "type": "HS256", "key": "${HASURA_JWT_SECRET}", "claims_format": "json" }'
      HASURA_GRAPHQL_ENABLE_TELEMETRY: "false"
    depends_on:
      - postgres
volumes:
  pgdata:
  tmpdata:
tirumaraiselvan commented 3 years ago

@ro-savage You are right. I can reproduce this with hasura/graphql-engine:v2.0.3.cli-migrations-v2 image.

Turns out this is because event the cli-migrations uses the --database-url way of starting Hasura: https://github.com/hasura/graphql-engine/blob/3089a385f188c43d252f2bd8c14f590dca635c06/scripts/cli-migrations/v2/docker-entrypoint.sh#L54

We will see what we can do here. Created issue: https://github.com/hasura/graphql-engine/issues/7319

ro-savage commented 3 years ago

@tirumaraiselvan - I believe this same issue also affects metadata v3. As we found after upgraded to V3 we still finding the database URL string inside our database hdb_metadata field, even though we specified from_env: HASURA_GRAPHQL_DATABASE_URL in the metadata yaml file.

ro-savage commented 3 years ago

Ahh this might be because we are still using hasura/graphql-engine:v2.0.3.cli-migrations-v2 instead of hasura/graphql-engine:v2.0.3.cli-migrations-v3 for our image. The person working on this is on leave, so we'll give the v3 image a try.

ro-savage commented 3 years ago

I can confirm that using hasura/graphql-engine:v2.0.3.cli-migrations-v3 with metadata v3 now correctly stores an environment variable name in hdb_metadata.

Using hasura/graphql-engine:v2.0.3.cli-migrations-v2 on both v2 and v3 metadata, will store the connection string including user/pass, that it copies from the env var.