citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.57k stars 670 forks source link

Re-consider the foreign table implementation via Citus Local Tables or Reference tables #5299

Closed onderkalaci closed 2 years ago

onderkalaci commented 3 years ago

Citus' foreign table implementation is heavily aimed to work with cstore_fdw, where each shard is a foreign table. In fact, this approach doesn't make much sense for many of the foreign data wrappers, such as postgres_fdw. As we rolled out columnar as part of Citus 10.0, we can drop support for cstore_fdw, and implement the foreign tables in a more natural way.

An approach could be to use Citus local tables for foreign tables. These foreign tables can be accessed in the CitusMX workers as well. The coordinator (hence the shard) would have access to the user mapping. So, we don't even need to propagate that to the workers. The queries executed on the w

Marco already has a prototype for that: https://github.com/citusdata/citus/compare/marcocitus/fix-foreign-table?expand=1

Though there are certain things like undistribute_table does not work in the prototype.

Another approach could be to use foreign reference tables, which can be more performant when queries from MX workers and could be more efficient when joined with distributed tables

marcocitus commented 3 years ago

Marco already has a prototype for that: marcocitus?expand=1 (compare)

I also tried adding truncate support, but that fails due the need to take locks on multiple nodes (in MX mode), which triggers a 2PC, but PREPARE TRANSACTION fails after writes to postgres_fdw. I think we can leave truncate out-of-scope for now.