citusdata / docker

:ship: Docker images and configuration for Citus
Apache License 2.0
243 stars 100 forks source link

Support for multiple databases #41

Open marcocitus opened 7 years ago

marcocitus commented 7 years ago

Moved from https://github.com/citusdata/citus/issues/1680 :

Can we add support for multiple databases on the same citus nodes?

Rather than having the entrypoint read the .sql files sequentially under a monolithic DB, if we do it by directory then we could do something like this:

ADD sql/* /docker-entrypoint-initdb.d/

and the entrypoint reads sql/bob/000-create-citus-extension.sql and creates it for the bob database and so on. It seems cleaner to me. I'm not sure how this extends to the manager code which appears to be hardcoded to the postgres user and only reads a single POSTGRES_DB env on init at present.

(Originally submitted by @kbaegis)

jasonmp85 commented 7 years ago

Hey @kbaegis; this configuration is already supported, but as you've noticed it's not as straightforward as just using the environment variables and sql files.

I've seen several customers put the CREATE DATABASE statements in one sql file and add a subsequent sh file with a little shell script for loop that loops over all their databases and calls psql --username="$PGUSER" --dbname="$DB" "CREATE EXTENSION IF NOT EXISTS citus;", or something like that.

The membership manager stuff is also written with a single database in mind, so you'd want to run one membership manager for each database (just copy-paste the docker-compose section for it and supply the appropriate POSTGRES_DB value to each one).

We're still trying to feel out exactly what use-cases our Docker images should serve: can you share a bit about where you're using multiple DATABASEs? In the past, most such usages have been for local Rails development: the devs want to run a test and development cluster for their app and have them be the same docker-compose, just with different database parts of the URL. Is your use-case similar, or entirely different?

kbaegis commented 7 years ago

Sure. I run multiple self-hosted applications across a 4 node cluster. Citus is filling my need for a multi-master scalable database backend for these applications (gogs, nextcloud, radius, etc). I suppose technically the controller is active/standby with streaming replication, but that's a minor issue. To provide secure isolation between the applications, each is tied to a user/db/auth mechanism per pg_hba.conf to prevent cross-application queries against the db. There's no technical table name collisions or anything, however that's a standard practice.

kbaegis commented 7 years ago

As I play around with this, some applications really dislike not being able to implement UNIQUE. If you guys come up with a strategy for multiple database containers, it would be wonderful if 1) the membership manager could track multiple databases, and 2) you could conveniently differentiate between citus tracked/enabled databases and another MM implementation and/or streaming master/slave replication. Any recommendations on a global lock/transaction manager for postgres MM? The BDR 2.0 project almost tricked me into thinking it was open source. ;)

dbaq commented 6 years ago

I have a need for multiple DB as well, we figured it out with someone on the slack channel:

Here is the working set up for 2 databases (accounts+domains)

docker-compose file:


services:
  master:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_master"
    build: "./"
    ports: ["${MASTER_EXTERNAL_PORT:-5432}:5432"]
    labels: ['com.citusdata.role=Master']
  worker:
    build: "./"
    labels: ['com.citusdata.role=Worker']
    depends_on: { domains_manager: { condition: service_healthy }, accounts_manager: { condition: service_healthy } }
  domains_manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_domains_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }
    environment:
      POSTGRES_DB: "domains"
  accounts_manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_accounts_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }
    environment:
      POSTGRES_DB: "accounts"

DockerFile

FROM citusdata/citus:7.0.2-alpine
COPY 002-create-other-databases.sql /docker-entrypoint-initdb.d/

002-create-other-databases.sql file

CREATE DATABASE domains;
\c domains;
CREATE EXTENSION citus;
CREATE DATABASE accounts;
\c accounts;
CREATE EXTENSION citus;

The key is to have 1 manager per database. Hope it helps other people.

jasonmp85 commented 6 years ago

@dbaq see the above comment, https://github.com/citusdata/docker/issues/41#issuecomment-333606545

jasonmp85 commented 6 years ago

(Which is to say I believe you independently arrived at the same solution)

dbaq commented 6 years ago

@jasonmp85 yes, I was just posting a working example with 2 databases (since I struggled a bit, that might help someone else).

maziar-dandc commented 2 years ago

@jasonmp85 @marcocitus

Hi guys, we had to create a worker for the second database as well as the manager which had been mentioned here, without the worker it would give the below error:

django.db.utils.NotSupportedError: cannot create foreign key constraint since foreign keys from reference tables and local tables to distributed tables are not supported
DETAIL:  Reference tables and local tables can only have foreign keys to reference tables and local tables

This error was raised on create_distributed_table('tenant_table', 'id'); with a couple reference tables being created before it, without the reference tables being created, the create_distributed_table would go through but then the reference tables creation would fail with:

When adding a foreign key from a local table to a reference table, Citus applies a conversion to all the local tables in the foreign key graph

I cannot get our project to run on standalone docker version at all, because of the above error, it only runs on clustered version which is OK but I want to make sure I'm not messing anything up and its somehow a limitation of the standalone version that doesn't maybe fully support all of citus functionalities?

maziar-dandc commented 2 years ago

Error reproduced at https://github.com/MaziyarMK/citus-bug