Open jensneuse opened 6 years ago
@jensneuse Thanks for the feature request. We are currently planning for the next release, so I'll update this issue once we decide how to prioritize this.
That being said, is there a reason why you don't want to just use CockroachDB for both of these use cases? Secondly, what is your opinion on supporting some form of streaming replication setup?
@dianasaur323 Thanks for getting back. I'm looking forward to hearing from you about your release planning.
In regard to your questions: yes, there are reasons why we don't want to use crdb as the source of truth. Our CMS requires some complex querying (e.g. full text search) on a few million rows, and we try not to rely on non-transactional db systems like Elasticsearch. I saw that someone mentioned full text search support could be implemented using bleve, but I guess this feature is not currently prioritised.
Secondly, I don't think streaming replication would help in this particular scenario. We're going to heavily denormalise data before storing it in crdb. Crdb seems perfect for this use case because we can update "denormalised published content" across many rows in transactions. That said, I'd be super interested in some form of "change replication to a message queue" in a distributed fashion. That is, since crdb is masterless, every node could stream its changes to a message queue like NATS to achieve near-realtime updates without relying on pg_notify, for example.
@jensneuse Got it, thanks for your feedback on foreign data wrappers. I haven't spent too much time looking into it yet, although I've been told by others that this could also help users trying to migrate from other databases to CockroachDB. I'll look into it further. For my own knowledge, how large do you anticipate your transactions to be when updating CRDB with denormalized data (how many rows do you expect to insert / update)?
@jensneuse we already support `timezone` and `datestyle`; we're missing only `intervalstyle`. It should be pretty straightforward to add this, but I wouldn't be surprised if there's more work to do after that is added.
@jordanlewis You are right, but this doesn't resolve the problem. The problem is the syntax: crdb expects 'time zone' whereas postgres_fdw tries to connect using 'timezone'. Maybe you're right and the complex part begins once these SQL statements go through.
```
[2017-11-26 16:04:27] [XX000] ERROR: unknown variable: "timezone"
[2017-11-26 16:04:27] Where: Remote SQL command: SET timezone = 'UTC'
```
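To make the mismatch concrete, here is a sketch of the statement postgres_fdw sends on connect next to the spellings CockroachDB accepted at the time (based on the versions discussed in this thread):

```sql
-- Sent by postgres_fdw during connection startup; rejected by
-- CockroachDB 1.1.x with the "unknown variable" error above:
SET timezone = 'UTC';

-- Forms CockroachDB 1.1.x did accept:
SET "time zone" = 'UTC';
SET TIME ZONE 'UTC';
```

Since postgres_fdw itself emits the first form, there is no client-side workaround; the server has to accept the unquoted `timezone` spelling.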
@dianasaur323 I expect to batch update a few hundred rows.
We'll probably have to write our own foreign data wrapper. The default Postgres one uses PL/pgSQL cursors (`DECLARE CURSOR FOR`), which we won't support anytime soon.
Although @jensneuse, if you don't need `SELECT` on the remote server, it should work okay.
@jensneuse Gotcha, that size of update shouldn't be a problem. Thanks!
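For reference, the kind of batch update described above might look roughly like this on the crdb side (a sketch only; the `published_content` table and column names are hypothetical):

```sql
-- Apply a few hundred denormalised rows atomically in one transaction.
BEGIN;
UPSERT INTO published_content (id, body)
  VALUES
    ('doc-1', '<denormalised body>'),
    ('doc-2', '<denormalised body>');
    -- ...a few hundred VALUES tuples per batch
COMMIT;
```

`UPSERT` is CockroachDB's shorthand for insert-or-update on the primary key, which fits the "update denormalised published content across many rows" use case.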
Any opposition to adding the capability to do `SET timezone = ...`?
This comes from yet another slight incompatibility I've found while testing ORMs. While it's fine to fix the ORM adapters, this will come up in other places we have yet to test or support, so I think it would be better to add it.
Ah, disregard that. It looks like the current alpha already supports `SET timezone = <...>`. In 1.1.x only `SET "time zone" = <...>` or `SET TIME ZONE <...>` are supported.
I think that FDWs are not the proper approach for this. Instead we probably want to look at #26780.
Are there any plans for this? I think this requires `DECLARE CURSOR` to work. I'd also be interested in using postgres_fdw to talk to cockroachdb.
Are transactions truly a blocker? When editing the data, some extra latency might be acceptable, while Change Data Capture feeds give you quite reliable global replication; with the `RESOLVED` feature you can be sure the data is up to date as of a certain timestamp.
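A changefeed along those lines might look like this (sketch only; the table name and sink URI are hypothetical, and this assumes CockroachDB's changefeed feature rather than anything the original poster has set up):

```sql
-- Emit row changes for the denormalised table to a Kafka sink.
-- `resolved = '10s'` periodically emits resolved timestamps, so
-- consumers know all changes up to that timestamp have arrived.
CREATE CHANGEFEED FOR TABLE published_content
  INTO 'kafka://broker:9092'
  WITH resolved = '10s';
```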
Google Zanzibar also uses something similar, in which each document stores a timestamp of last edit of the permission model, which allows for local consistency between the document and the rules that hold for that document, without requiring headaches for global replication. Wouldn't this work for a CMS too?
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
We'd like to build a large scale CMS with postgresql as source of truth + cockroachdb as denormalized data store for public access.
To keep both systems in sync via transactions I'd like to use a foreign data wrapper.
As per the documentation of the postgresql foreign data wrapper module, postgresql executes a few commands on client connection startup, such as 'SET timezone' (see documentation: https://www.postgresql.org/docs/9.4/static/postgres-fdw.html => F.32.5).
These SQL commands are currently not supported by cockroachdb. A possible workaround could be to use pg_notify to inform an external application to sync changes; however, we'd lose transactions this way, which is a blocker.
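For context, the setup in question would look roughly like this on the PostgreSQL side (a sketch with hypothetical host, database, and table names; the connection-startup phase of exactly this setup is where the unsupported `SET` commands are sent):

```sql
-- On the PostgreSQL "source of truth" database:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Point a foreign server at the CockroachDB cluster.
CREATE SERVER crdb
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'crdb-host', port '26257', dbname 'cms');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER crdb
  OPTIONS (user 'root');

-- Denormalised content table exposed through the wrapper; writes to
-- it could then join the local PostgreSQL transaction.
CREATE FOREIGN TABLE published_content (id TEXT, body TEXT)
  SERVER crdb;
```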
Jira issue: CRDB-5937
gz#20998