citusdata / citus_docs

Documentation for Citus. Distributed PostgreSQL as an extension.
Creative Commons Attribution 4.0 International

Document how to replicate reference tables to coordinator #1053

Open crabhi opened 2 years ago

crabhi commented 2 years ago

Hi, we're preparing a POC to migrate an existing Postgres database to Citus for further scaling. We have designated a few tables as reference tables. They're also joined with many tables that will stay local to the coordinator. Some of these queries are very inefficient if they have to run in distributed mode.

Therefore, I wondered why it wasn't possible to keep one copy of each reference table on the coordinator. I stumbled upon citusdata/citus#1615 and citusdata/citus#3155. Calling master_add_node indeed seems to solve our issues, but I can't find any documentation about that function since v6. If I understand it correctly, it's an alias for citus_add_node, right?

It surprised me that this behaviour of reference tables wasn't the default. Are there any drawbacks to calling citus_add_node with the coordinator address and groupId => 0, apart from the storage requirements of the reference tables? It would be great if this use case was documented.

onderkalaci commented 2 years ago

If I understand it correctly, it's an alias for citus_add_node, right?

Yes, we are deprecating APIs starting with master_, so we suggest using citus_add_node(..., groupid:=0) or citus_set_coordinator_host: https://docs.citusdata.com/en/v11.0/develop/api_udf.html#set-coordinator-host
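For anyone landing here, a minimal sketch of the two calls mentioned above, meant to be run on the coordinator. The host name and port are placeholder assumptions, not values from this thread; substitute the address at which the workers can reach your coordinator. Only one of the two options is needed.

```sql
-- Assumption: 'coordinator.example.internal' and 5432 are placeholders
-- for the coordinator's own host name and port.

-- Option 1: register the coordinator in the node metadata.
SELECT citus_set_coordinator_host('coordinator.example.internal', 5432);

-- Option 2 (alternative to option 1): add the coordinator as a node
-- in group 0, the group reserved for the coordinator.
-- SELECT citus_add_node('coordinator.example.internal', 5432, groupid := 0);

-- Sanity check: the coordinator should now be listed with groupid = 0.
SELECT nodename, nodeport, groupid
FROM pg_dist_node
ORDER BY groupid;
```

As reported earlier in this thread, once the coordinator is registered this way the reference tables also get a copy on the coordinator, so joins between reference tables and coordinator-local tables should no longer have to go through the workers.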

I agree that this could be documented somewhere here: https://docs.citusdata.com/en/v11.0/develop/reference_ddl.html?highlight=reference%20table#reference-tables

It surprised me this behaviour of reference tables wasn't the default.

We plan to do that soon: https://github.com/citusdata/citus/pull/5756

Moving the issue to Citus docs for visibility.

crabhi commented 2 years ago

Oh, great, thanks!