Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License
700 stars 145 forks source link

Front end to many Postgres dbs? #107

Closed johnmudd closed 9 years ago

johnmudd commented 9 years ago

I'm currently using Multicorn at many sites. I'm very happy with it. Thanks again for Multicorn!

Today I'm wondering if it's useful to use Multicorn to build a central Postgres interface to all of my individual Postgres DBs. So I can enter a query in the central DB, have it handed out to thousands of remote DBs, collect the results and return the results from the central Postgres. One difference is that results from the central DB would offer an extra "Source DB Name" column for any returned row. That column would tag the row with the name of the source db.

How does that sound in general? Is there anything that makes such an approach impractical? Also, am I reinventing the wheel? Is there already an existing solution for this problem?

Edit: I guess the first problem is that Multicorn doesn't have access to the raw SQL. So it couldn't farm out the original query. It could distribute the qual list but that's not selective enough in my opinion. It would create too much network traffic for intermediate results and result in a bottleneck of processing on the central Multicorn.

rotten commented 9 years ago

You are thinking of an architecture that involves a "dataless database" which consists exclusively of foreign data wrappers to other, various sorts of, databases. That should work. You can use views to union the data from several similar tables and then append a column with the source database in it. You shouldn't need custom foreign data wrapper code that appends columns the user didn't ask for explicitly in the query.

rdunklau commented 9 years ago

There is already a system colum that you can use for that purpose, tableoid. From the tableoid, you can get the tablename, or any information from the system catalogs. That way, your row is already "tagged", for free:

select *, tableoid, tableoid::regclass from some_table;

Please note that from 9.5 onwards, inheritance will be supported for foreign tables. That mean that in theory you could use every classical inheritance / partitioning scheme with foreign tables, including constraint exclusion. This will be better in almost every aspect than manually building a view on top of the tables.

In fact, I guess this feature has been implemented exactly for the kind of use case you mention: sharding with foreign data wrappers. I've not tried it yet, so I'll be interested in whatever feedback you can provide (even more so if that means that additional work should be done in Multicorn to support it).

johnmudd commented 9 years ago

Thanks! There's a brief description of the new inheritance in this PDF listing PostgreSQL 9.5 New Features. In the section "3.4.5 CREATE FOREIGN TABLE INHERITS statement".

I also realize now that I don't have to start at such grand scale with thousands of dbs. Some of our stores are chains so I can try to apply this for a small chain with just a few dbs. I will update here when I get time to work on this.