citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus
GNU Lesser General Public License v3.0
1.06k stars 63 forks source link

Support sharding by schema name #99

Closed bradrobertson closed 1 year ago

bradrobertson commented 9 years ago

I've got a rubygem call Apartment that does database multi-tenancy for ActiveRecord using postgresql schemas. Effectively, our whole table structure is mirrored on a per-schema basis for each customer that we add to the system.

It'd be nice if we could shard based on the postgresql schema, such that the app itself wouldn't need to know about physical shards under the hood and queries containing a particular schema in the search path would automatically be routed to the correct physical machine.

The basic setup is that public schema stores excluded models which are non tenanted models. Then, for each tenant, we create a new schema and create all the tables etc in that schema. I guess this would involve triggers on things like CREATE SCHEMA x to actually create it on the correct shard, and then of course on queries themselves to route to that schema.

As discussed at PgConf, I'm not well versed on the internals of pg_shard itself but I'm quite happy to get the discussion going around the Apartment design and using schemas from a sharding concern.

trevvvy commented 9 years ago

+1

obozek commented 9 years ago

+1

Frank004 commented 9 years ago

+1

jasonmp85 commented 9 years ago

:grin: ok, ok, we definitely see the +1 comments here. I'll chat with the team about where this is on our roadmap, since it seems to be a popular request.

Verurteilt commented 7 years ago

Any update on this, it would be awesome, and the company i work for would be willing to test this feature :D

craigkerstiens commented 7 years ago

Hi @Verurteilt there's a few things in here. One we've deprecated pg_shard in favor of Citus. Though we also aren't supporting sharding by schema in Citus in the near term. The issue with that is that sharding by schema doesn't scale well to a very large scale. At somewhere north of 1,000 schemas Postgres just stops behaving very well. One of the popular sharding libraries for Ruby has actually blogged about the experience and some of the pains of sharding by schema - https://influitive.io/our-multi-tenancy-journey-with-postgres-schemas-and-apartment-6ecda151a21f#.4pf9zn1pw

We have made lots of good progress though in the experience of sharding within your application including having our own Rails gem that makes this much easier. If you'd like to discuss a bit further would be happy to if you want to drop us an email.

arpanpreneur commented 2 years ago

My app uses the exact same architecture described in this issue. Mine is written in Django and uses django-tenant-schemas library. Beautifully works on single node, but looking for a way to distribute the schemas across nodes.

NicoRyberg commented 1 year ago

Hello guys, have any updates on this? Also using django-tenants and started with a multitenancy architecture to support 10-30 tenants that is scaling up to thousands. We need to make an architecture decision in this matter. @ArpanKIIT2017 how did you solve this issue?

marcocitus commented 1 year ago

Fixed in Citus 12.0

arpanpreneur commented 1 year ago

@NicoRyberg Since ours was not a self-serve app, we just created multiple environments (multiple instances of our app on different subdomains) each with their own dedicated PostgreSQL DB instances. When onboarding customers, we allocate them an environment based on the estimated data size required from the customer in the next 2 years.

[ArpanKIIT2017] - old username