citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

Warn against multi-database citus clusters #6062

Open CamiloHerreraDR opened 2 years ago

CamiloHerreraDR commented 2 years ago

I have been warned on the Citus Slack channel against creating multiple databases in Citus, and Azure doesn't allow multiple databases to be created on the same Citus instance. However, the Citus docs make no mention of the possible implications of doing this, or of distributing multiple databases on a coordinator.

There should be some safeguards against this, depending on the possible consequences; we may have run into several unexpected issues ourselves because we did not know about this.

pdeaudney commented 2 years ago

Can you elaborate on some of the issues you've seen or think you've seen?

It seems to be a reasonable use case to distribute multiple databases inside a single cluster to me.

CamiloHerreraDR commented 2 years ago

image

image

image

image

These are examples of some of the issues I have had, possibly due to having multiple distributed databases on one PostgreSQL instance. However, I have not been able to dedicate time to creating tests that replicate the issues extensively. I will attempt to allocate some time to this over the weekend.

Some might be caused by creating the databases using CREATE DATABASE <Copy_DB> WITH TEMPLATE <OriginalDB>.

The use case is to create multiple environments on the same instances, so that dev/qa/release_candidate environments can all use the same schemas without conflicts on the application layer.

I had been working with several developers on the Citus Slack channel to troubleshoot some of the issues, and one of them mentioned that multiple databases are not currently supported and that some of the issues might be caused by this. However, they did not elaborate, and I don't know if it's bad practice to name them or @ them. I also don't know if it's okay to just screenshot some of the discussions I have had previously.

ivyazmitinov commented 2 years ago

We have around 20 DATABASEs within our Citus cluster (on prem) and have never encountered any issues because of it, so the culprit is most likely the creation from another DB as template.

We have all our necessary extensions, except Citus, installed in template1, but Citus is created and configured from scratch for every new DATABASE.
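As a rough illustration of what such a "from scratch" setup could involve, here is a minimal Python sketch that only builds the SQL statements and does not connect to anything. It rests on several assumptions not stated in this thread: that the database has to be created on every node, that CREATE EXTENSION citus is run in the new database on every node, and that workers are registered from the coordinator with citus_add_node (available in Citus 10 and later; older releases used master_add_node). The function name, the "coordinator" label, and the worker host names are hypothetical.

```python
def quote_ident(name: str) -> str:
    """Double-quote a PostgreSQL identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote a PostgreSQL string literal, escaping embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def fresh_citus_database_plan(dbname, workers):
    """Return (node, connect_to_db, sql) steps for a new Citus database.

    Hypothetical helper: `workers` is a list of (host, port) pairs.
    The new database is based on template1, not on an existing
    Citus-enabled database, so no Citus metadata is copied into it.
    """
    create_db = f"CREATE DATABASE {quote_ident(dbname)} TEMPLATE template1;"
    nodes = ["coordinator"] + [f"{host}:{port}" for host, port in workers]

    plan = []
    for node in nodes:
        # Assumption: the database is created separately on each node.
        plan.append((node, "postgres", create_db))
        # Citus is installed fresh inside the new database on each node.
        plan.append((node, dbname, "CREATE EXTENSION citus;"))

    for host, port in workers:
        # Register each worker from the coordinator, inside the new database.
        sql = f"SELECT citus_add_node({quote_literal(host)}, {int(port)});"
        plan.append(("coordinator", dbname, sql))
    return plan


if __name__ == "__main__":
    for step in fresh_citus_database_plan("qa", [("worker-1", 5432)]):
        print(step)
```

The point of the sketch is the contrast with copying an existing distributed database as a template: here nothing Citus-related is inherited, and the Citus metadata for the new database is built only by the explicit steps.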

CamiloHerreraDR commented 2 years ago

In that case I would suggest adding a warning against using CREATE DATABASE with a template, to avoid creating those issues.

CamiloHerreraDR commented 2 years ago

This is another example of inconsistent behaviour, possibly caused by using the template option of CREATE DATABASE:

image

CamiloHerreraDR commented 2 years ago

Another example: the database can't find one of the shards of a distributed table when dumping the database.

image

CamiloHerreraDR commented 2 years ago

image

This is the output of the previous commands when logging remote commands.

CamiloHerreraDR commented 2 years ago

image

This is the result of running SELECT * on the table in schema ota that previously threw errors when making a dump.

CamiloHerreraDR commented 2 years ago

image

However, using the DROP TABLE command does work. I assume it's because it internally uses DROP TABLE IF EXISTS.