I'm aware that this is a work in progress, but here's some early feedback. We were trying this out today to see if it would help make the CitusDB -> pg_shard migration path easier for CitusDB documentation purposes, since that seems to be the more common direction. We created a table using CitusDB with DISTRIBUTE BY APPEND and used this branch of pg_shard.
One problem we ran into is that when doing an INSERT into a non-existent range we got "ERROR: cannot execute INSERT on a distributed table on master node", because pg_shard's ExecutorStart executor hook falls back to CitusDB for zero-shard queries. This led to some confusion.
The next problem we ran into is that pg_shard caches the metadata. Any change made to the metadata by CitusDB is not visible to pg_shard until a new session is started, which also led to some confusion. For example, performing \STAGE and then INSERTing into the new range would give "ERROR: no placements exist for shard with ID".
Finally, we added a new shard using \STAGE whose range overlapped with an existing shard. We then tried an INSERT into the overlapping range, and it went to the first shard. We then started a new session to clear the cache, after which we got "ERROR: cannot modify multiple shards during a single query". That was expected, but I guess an INSERT should always go to at most one shard.
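For reference, a minimal sketch of the first sequence described above, with a hypothetical table name and values (only the DISTRIBUTE BY APPEND clause and the error text come from this thread):

```sql
-- Hypothetical table created through CitusDB's master with append partitioning
CREATE TABLE events (event_id bigint, payload text)
    DISTRIBUTE BY APPEND (event_id);

-- With no shard covering this value, pg_shard's ExecutorStart hook falls back
-- to CitusDB, which rejects INSERTs on the master:
INSERT INTO events VALUES (42, 'hello');
-- ERROR: cannot execute INSERT on a distributed table on master node

-- After \STAGE creates a shard covering the range, repeating the INSERT in the
-- same session still fails because pg_shard's cached metadata is stale:
-- ERROR: no placements exist for shard with ID ...
```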
It seems the more fundamental issue here is the caching. We might have to make changes to CitusDB to help pg_shard clear its cache or set up a trigger on the catalog table.
Thanks @marcocitus… there was no intention of pg_shard honoring DISTRIBUTE BY APPEND at the moment anyways, as it should only understand hash partitioning. I'll work to ensure it understands there are tables that are distributed but which it should not touch.
As far as caching goes… bleh. I forgot we had added caching. pg_shard doesn't expect things to change underneath it. It doesn't cache empty results, so if no shards exist you shouldn't see caching. But once shards exist we currently expect them to be long-lived (placements, on the other hand, can change state, etc.).
Do we anticipate having \STAGE work with pg_shard anytime soon? pg_shard assumes partitions are just that: partitions.
I had a quick clarification question.
With these changes, which migration scenario do we intend to handle? (a) pg_shard -> CitusDB, (b) CitusDB -> pg_shard (shard rebalancer?), or (c) CitusDB <-> pg_shard
CitusDB currently observes state changes when we \stage to a new shard (append partitioned), append to an existing shard, or rebalance a table. pg_shard observes state changes when we create shards for a new table, or fail to write to a shard placement (hash partitioned).
Two other questions -- don't know, just asking. Who has the authoritative metadata, CitusDB or pg_shard with these changes? Under what conditions does the authority change?
Because this PR actually causes pg_shard to write through to CitusDB when CitusDB is present, there is only ever one actual metadata store. When using CitusDB, CitusDB's pg_dist* tables store metadata. When using pg_shard with PostgreSQL, it uses its own tables.
That means the only possibility is stale metadata (i.e. from caching). So we'll want to address that.
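As an illustration of the two cases, here is a sketch of where you would look for shard metadata; the exact catalog and table names below are my assumption rather than something quoted in this PR:

```sql
-- With CitusDB installed, shard metadata lives only in CitusDB's catalogs:
SELECT * FROM pg_dist_shard;
SELECT * FROM pg_dist_shard_placement;

-- With plain PostgreSQL, pg_shard keeps it in its own schema instead:
SELECT * FROM pgs_distribution_metadata.shard;
SELECT * FROM pgs_distribution_metadata.shard_placement;
```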
@ozgune what do you mean "migration scenario"? I can see:
- pg_shard with PostgreSQL and upgrading your underlying DB to CitusDB
- pg_shard and wanting to add it for INSERTs
- pg_shard to create new distributed tables within CitusDB and have it know how to query them

Am I missing any other possibilities?
@jasonmp85 Yup, I was more curious about the use cases where we need to sync metadata between pg_shard and CitusDB.
I'm wondering if we can simplify the problem down to (pg_shard -> CitusDB). If we could and wanted to, could we then put a trigger on the pg_shard metadata to propagate updates to the CitusDB system catalogs?
Let's stop using the word sync. There is no syncing here.
To clarify, everything is always in sync… there is no sync step. pg_shard will directly write to CitusDB metadata tables when installed within CitusDB, and the right things are in place to permit a migration from PostgreSQL + pg_shard to CitusDB using pg_dump to write out a pg_shard master and turn it into a CitusDB master.
By the way, the migration step is to write out all normal tables except the pg_shard metadata using pg_dump -N pgs_distribution_metadata. You apply that to a fresh CitusDB install that will be your master. Then you dump the pg_shard metadata with pg_dump --inserts -t 'pgs_distribution_metadata.*' and apply that to the new CitusDB master. From that point onwards all previous pg_shard tables will be seen by CitusDB, and all pg_shard and CitusDB metadata will be identical.
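Putting those steps together as a sketch (the host and database names here are placeholders):

```sh
# 1. Dump everything except pg_shard's metadata schema from the old master.
pg_dump -N pgs_distribution_metadata -h old-pg-master my_db > tables.sql

# 2. Load that into the fresh CitusDB install that will become the master.
psql -h new-citus-master -d my_db -f tables.sql

# 3. Dump the pg_shard metadata itself as INSERT statements.
pg_dump --inserts -t 'pgs_distribution_metadata.*' -h old-pg-master my_db > metadata.sql

# 4. Apply it on the new CitusDB master; from here on CitusDB and pg_shard
#    see the same metadata.
psql -h new-citus-master -d my_db -f metadata.sql
```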
If you're using pg_shard within CitusDB from square one, it always remains in sync: it sees all CitusDB tables, and CitusDB sees all pg_shard metadata.
I'm still figuring out how I want to test the CitusDB-specific parts of this, but the main code is complete.
Code rebased on latest develop, tests fixed, and changes from feedback made (denoted by (done) in one of my replies).
@sumedhpathak, can you see what else needs attention on this PR?
OK, so this has been entirely rebased on top of the changes from #110, i.e. we're using SPI everywhere now. This review is now all about the triggers and views needed to adapt CitusDB's metadata tables to be used within pg_shard. Travis is building this under PostgreSQL 9.3/9.4 and CitusDB 4.0, so there is some assurance this actually works.
Code Review Tasks
- Turn pg_shard tables into views when Citus is present (to prevent them from appearing blank), or just error out when queried.
- Migration from pg_shard to CitusDB. Users might pg_dump from a PostgreSQL pg_shard install and try to load into CitusDB. This must work seamlessly.
- \STAGE and using an existing CitusDB-distributed table with pg_shard.
Resolves #11. Resolves #27.